Well, there's nothing casual about regex. I swear it's gotta be the most confounding thing I've ever encountered, and I've seen FLCL before!
So I'm studying it, and I guess I need a place to keep all of my favorite information sources, so here's that place.
My Favorite Sources of regex information:
http://rubular.com/
http://www.rubyist.n...uby/regexp.html
http://www.regular-e...o/examples.html
Examples:
http://www.dreaminco...3&#entry1360163
Get multiple things at the same time:
http://www.pastie.org/1468271
Get multiple things, plus some expert tips!
http://www.dreaminco...se-into-a-hash/
In my country, it is considered rude to create a blog entry without at least some helpful, original information. Therefore I shall demonstrate the basics of using regex in a couple of the languages I know:
C# (presently my favorite language)
string myText = "Oh god it's good to be back in c#. My IP address on this comp is 192.168.0.2. Wanna hear a joke? \nserver{ \nlocation { \nroot I'm right behind you!\n } \n} I didn't say it was a funny joke, I just said it was a joke!";
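// Needs "using System.Text.RegularExpressions;" (and System.Windows.Forms for MessageBox).
// The pattern is four groups of 1 to 3 digits separated by literal dots.
Match match = Regex.Match(myText, @"\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b");
if (match.Success)
    MessageBox.Show("Found a match for an IP Address! " + match.Value);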
Ruby (presently my unfavorite, but going to be my new favorite)
my_text = "Oh god it's good to be back in c#. My IP address on this comp is 192.168.0.2. Wanna hear a joke? \nserver{ \nlocation { \nroot I'm right behind you!\n } \n} I didn't say it was a funny joke, I just said it was a joke!"
"hello world" =~ /world/ # returns 6, because index slot 6 is where 'world' was in the string.
# =~ is kind of like str.IndexOf(), but for regular expressions, not just 'find'
puts $~ # prints "world" because that was the last match caught by the =~ operator!
# Kind of like magic isn't it?
# The variable is thread local and method local, fyi.
my_text =~ /server *{(...).*}/m
$~.to_s # returns "server{ \nlocation { \nroot I'm right behind you!\n } \n}"
# Holy shit balls that's cool!
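Here's a small sketch of the same idea without leaning on the magic global. It assumes nothing beyond core Ruby: String#match returns the same kind of MatchData object that $~ holds, and Regexp.last_match is the spelled-out name for $~. The variable names are just mine for illustration.

```ruby
text = "hello world"

# =~ returns the index of the first match, or nil when nothing matches.
index   = text =~ /world/
missing = text =~ /mars/

# After a successful =~, $~ and Regexp.last_match describe the same match.
text =~ /world/
same_match = ($~ == Regexp.last_match)

# String#match hands back the MatchData directly, no global needed.
m = text.match(/w(or)ld/)

puts index            # position of "world" in the string
puts missing.inspect  # nil, because "mars" is not in the string
puts m[0]             # the whole match
puts m[1]             # the first capture group
puts m.begin(0)       # where the whole match starts
```

The /server *{(...).*}/m example above sticks with the =~ and $~ form.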
Let me explain that last one through and through. On the right side of the =~ operator:
/server *{(...).*}/m
1) Everything between the '/' characters is part of the search expression.
2) The first '*' makes it look for zero or more occurrences of the preceding character (a space).
3) The '{' sign is just a literal brace, the same way the word 'server' is just the word 'server'.
4) The '(...)' is there to say, "Hey! Capture three of any character" (that's what the dot stands for: any character). I'm not sure how to make use of this in C#, but with Ruby it's really cool, and I'll elaborate after this.
5) The '.' after the group means search for any... one... character.
6) The '*' paired with that dot lets us search for ZERO or MORE occurrences of the previously specified character type ('.', i.e. anything).
7) The m modifier at the end makes sure that the '.' character matches newlines as well as any other character.
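To see those pieces do their jobs, here is a minimal sketch run against a shortened copy of the sample string (the shortening is mine). I've escaped the braces as \{ and \} in this version, which matches the same text and avoids any ambiguity with the {n,m} repeat syntax. It compares the pattern with and without the m modifier.

```ruby
# Shortened copy of the sample string from the post.
my_text = "Wanna hear a joke? \nserver{ \nlocation { \nroot I'm right behind you!\n } \n} I didn't say it was a funny joke"

with_m    = my_text.match(/server *\{(...).*\}/m)
without_m = my_text.match(/server *\{(...).*\}/)

p with_m[0]   # the whole match, from "server" to the last "}"
p with_m[1]   # the three characters grabbed by (...)
p without_m   # nil: without m, the dot refuses to cross the newline after "server{ "
```

The capture here is just a space, a newline, and the 'l' of 'location', which is not very useful on its own, but it shows that with_m[1] (or $~[1] after =~) is how you pull a captured group back out.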
========================================
Ok, now that that is all explained, I should tell you something a little more advanced. Consider that same string we used in the last example.
We could also use server\s*\{(.*?}) to get what we want, and capture the good stuff! By that I mean, capture everything regarding the 'location' declaration in the sample string (e.g. "location {etc.}").
my_text = "Oh god it's good to be back in c#. My IP address on this comp is 192.168.0.2. Wanna hear a joke? \nserver{ \nlocation { \nroot I'm right behind you!\n } \n} I didn't say it was a funny joke, I just said it was a joke!"
my_text =~ /server\s*\{(.*?})/m
$~.captures[0] # => " \nlocation { \nroot I'm right behind you!\n }"
That's cool, right? Let's try the same thing without the new question mark symbol in our regex string.
my_text = "Oh god it's good to be back in c#. My IP address on this comp is 192.168.0.2. Wanna hear a joke? \nserver{ \nlocation { \nroot I'm right behind you!\n } \n} I didn't say it was a funny joke, I just said it was a joke!"
my_text =~ /server\s*\{(.*})/m
$~.captures[0] # => " \nlocation { \nroot I'm right behind you!\n } \n}"
BLAMO! Did you see that? The question mark was modifying the '.*' unit. Yikes, so the dot means "ANY CHARACTER", the * means "zero or more of DOT", and the ? means "Don't be greedy, stop capturing as soon as possible." So when we ran the query without the question mark, it gobbled up the capture right past the first } and stopped at the last one. Cool distinction, definitely something worth remembering.
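The same greedy versus non-greedy difference is easier to see on a tiny string. This is only a sketch with a made-up sample string; str[regex, 1] is core Ruby's shortcut for "give me capture group 1", and scan collects every non-overlapping match.

```ruby
sample = "a{one} b{two}"

greedy = sample[/\{(.*)\}/, 1]    # .* runs to the LAST closing brace
lazy   = sample[/\{(.*?)\}/, 1]   # .*? stops at the FIRST closing brace

# With the lazy version, scan can pick out each braced chunk separately.
all = sample.scan(/\{(.*?)\}/).flatten

p greedy  # => "one} b{two"
p lazy    # => "one"
p all     # => ["one", "two"]
```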
1 Comments On This Entry
NickDMax
11 June 2011 - 09:51 AM
One of the major requirements for an editor for me is to have regex search and replace. As a programmer I think it is a must. It is not uncommon to have some data in the form of a table or list that you need to convert into some usable format. One could save the data to a file and then write a program that reads the file and formats the data... or one could just use Search and replace with regex (or a tool like sed).
Regex sources on Jun 08 2011 03:54 PM