TR1 RegEx Library: Using the new TR1 RegEx Library in C++

#1 ccubed

Posted 06 December 2009 - 01:46 PM

Part 1: What is TR1?

TR1 (Technical Report 1) is a new set of extensions to the C++ Standard Library. It adds RegEx, Smart Pointers, Reference Wrappers, Tuples, Unordered Associative Containers and many other facilities to help make the C++ coder's life easier. If you're interested, you can find the technical draft at the link below.

http://www.open-std..../2005/n1836.pdf

While TR1 is currently only a technical draft, many compilers already include it, GCC and Visual Studio 2008+ for example.

I'm writing this assuming that you already KNOW how to form Regular Expressions. If not, please visit the link below and read up on them before reading the rest of this tutorial.

http://msdn.microsof...41x(VS.85).aspx

Or try this trusty link.

http://www.google.co...pression+Syntax

Remember: Google is our friend.

Part 2: What are we dealing with?

Six regular expression grammars will be included in the new C++ Regular Expression library: ECMAScript, POSIX basic and extended, and the grammars from the tools awk, grep and egrep. In this tutorial, we'll be talking about ECMAScript (ECMA for short), which is the format most people are familiar with and is close to what Perl, Python or Ruby use.
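To show how a grammar gets picked, here is a minimal sketch. It assumes a compiler where the library lives in plain std:: (the later standardized form of the same interface); on a TR1-only compiler you would swap in std::tr1. The helper name matches_with is made up for this example.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: compiles `pattern` under the given grammar flags and
// reports whether it matches the WHOLE of `input`.
// Assumption: std::regex from <regex>; use std::tr1::regex on TR1-only setups.
bool matches_with(const std::string& input,
                  const std::string& pattern,
                  std::regex::flag_type grammar) {
    const std::regex re(pattern, grammar);
    return std::regex_match(input, re);
}
```

For instance, matches_with("pony", "pony|horse", std::regex::extended) uses the POSIX extended grammar, and flags can be combined, as in std::regex::ECMAScript | std::regex::icase for a case-insensitive ECMAScript pattern. ECMAScript is what you get if you pass no flag at all.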

For an example of what kind of RegEx we'll be using, take this one.

'.*@mcm.*'



That would match any string with @mcm in it. The . matches any single character (other than a line break) and the * means zero or more instances of the previous element. So .* refers to any number of anything.
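As a quick sanity check of that pattern, a sketch like the following should do. It assumes the std:: spelling of the library (std::tr1:: on a TR1-only compiler), and contains_mcm is just a name I picked for the example.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true when `input` has "@mcm" somewhere in it.
// The leading and trailing .* let the pattern cover the whole string.
// Assumption: std::regex from <regex>; use std::tr1::regex on TR1-only setups.
bool contains_mcm(const std::string& input) {
    static const std::regex pattern(".*@mcm.*");
    return std::regex_match(input, pattern);
}
```

So contains_mcm("bob@mcm.edu") is true, while contains_mcm("bob@example.com") is false.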

Part 3: Some Differences

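There are a few differences between TR1's ECMA implementation and other languages' implementations of it. The main one to watch is how match and search behave, which may not be what you expect coming from Python or Perl. For example, Python's re.match only anchors at the start of the string, while TR1's match has to cover the entire string.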

Now, in TR1's implementation, match will work ONLY if there is an EXACT match, that is, if the regular expression matches the whole string. Say you have a string like so.

string test = "This is";



If you take that and use match to test it against the regular expression 'is', it would return false. Why? Because there is text preceding the match, and match doesn't allow any text before or after it.

TR1's Search on the other hand does what we usually want. In the above example it returns true because it finds is in the string. It doesn't care if anything exists around 'is,' only that there is one or more instances of 'is.'
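Here is the 'This is' case as a small sketch, so you can see both behaviors side by side. It assumes the std:: spelling of the library (std::tr1:: on a TR1-only compiler), and the two function names are made up for illustration.

```cpp
#include <regex>
#include <string>

// Assumption: std::regex from <regex>; use std::tr1::regex on TR1-only setups.

// Hypothetical helper: true only when the ENTIRE string is "is".
bool whole_string_is(const std::string& s) {
    return std::regex_match(s, std::regex("is"));
}

// Hypothetical helper: true when "is" occurs anywhere in the string.
bool contains_is(const std::string& s) {
    return std::regex_search(s, std::regex("is"));
}
```

With the string "This is", whole_string_is gives false and contains_is gives true. Only the bare string "is" makes both return true.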

That being said, we're ready for some code.

Part 4: Basic Code

For all of the examples in this part we'll be using the following string as our base.

string text = "I'm a little pony, watch me gallop. My pink tail flutters, I'm a girl's dream.";



Yes, I referenced My Little Pony. We all need some humor in these tutorials, seriously. Moving on, let's look at a simple match and search example.

Also, the include for this library is <regex> and it is part of the std::tr1 namespace.

For now we'll use the string 'pony' as our regular expression.

First a match.
regex test("pony");
regex_match(text.begin(),text.end(),test);



Note that the first line creates a regular expression object called test, and its constructor sets its regular expression to 'pony.'

The next line calls the match function on the string text and uses test as its regular expression.

regex_match takes a varying number of arguments. In this case I provided iterators into text marking where to begin and where to end. You could have also used the following.

regex_match(text,test);



They both result in the same answer.

Now, the code returns false because while pony is found in text, it has characters both preceding and following it. A note about this: If you add .* to both sides of the pattern it returns true, since the pattern then matches the whole string.
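To tie the two call forms together, here is a sketch that runs both and checks that they agree. It assumes the std:: spelling of the library (std::tr1:: on a TR1-only compiler), and matches_whole is a name invented for the example.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: runs regex_match in both the iterator form and the
// string form, and returns true only when both report a whole-string match.
// Assumption: std::regex from <regex>; use std::tr1::regex on TR1-only setups.
bool matches_whole(const std::string& text, const std::string& pattern) {
    const std::regex re(pattern);
    const bool by_iterators = std::regex_match(text.begin(), text.end(), re);
    const bool by_string = std::regex_match(text, re);
    return by_iterators && by_string;
}
```

Called on our pony string, matches_whole(text, "pony") is false and matches_whole(text, ".*pony.*") is true.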

Let's move on to Search.

regex test("pony");
regex_search(text,test);



This time I'll just use the simple form. Everything is the same except that we changed match to search, and that instead of the begin,end,regex form I used the text,regex form. Both functions accept the same forms; they just return different results.

In this case, it's true because pony is a substring of text.

Part 4, continued: Getting the matches

Okay, so we've covered finding matches, but we haven't covered getting them. To do this we use the typedef cmatch. A cmatch is a match_results object. match_results is a templated class, and cmatch is the typedef for match_results<const char*>, so it works on C-style strings.

cmatch results;



results is now a collection where results[0] is the whole match and results[1], results[2] and so on are the capture groups (the parenthesized parts of the pattern), if there are any.

Now, let's use this in our code.

cmatch results;
regex test("a");
regex_search(text.c_str(),results,test);
cout << results[0] << endl;



So we take the results from the search and print results[0], the whole match. Note the c_str() call: a cmatch works over const char*, so we hand it a C-style string. This code results in the output 'a'. For iterating through results, cmatch objects and match_results in general offer container-style functions much like a vector's. So size, max_size, empty, begin, end, etc.
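If your pattern has capture groups, you can walk the results by index. The sketch below assumes the std:: spelling of the library (std::tr1:: on a TR1-only compiler) and uses smatch, the std::string counterpart of cmatch, which this tutorial hasn't used so far. The function name groups is made up.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: searches `text` for `pattern` and returns the whole
// match followed by each capture group; empty if nothing matched.
// Assumption: std::regex from <regex>; use std::tr1::regex on TR1-only setups.
std::vector<std::string> groups(const std::string& text,
                                const std::string& pattern) {
    std::vector<std::string> out;
    std::smatch results;  // match_results over std::string iterators
    const std::regex re(pattern);
    if (std::regex_search(text, results, re)) {
        for (std::size_t i = 0; i < results.size(); ++i) {
            out.push_back(results[i].str());  // index 0 is the whole match
        }
    }
    return out;
}
```

Searching "I'm a little pony, watch me gallop." with the pattern "(\\w+) (pony)" should give back three entries: "little pony", then "little", then "pony".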

Part 5: Replacements

The last part, yeah. This time we'll talk about using regex to replace certain words in strings. This should be familiar to everyone.

string text = "I'm a sad person.";
string repl = "happy";
regex test("sad");
text = regex_replace(text,test,repl);
cout << text << endl;



Now, you might notice I had to declare my replacement as a string. I have to do that because TR1's regex_replace does not accept the following.

regex_replace(text,test,"happy");



It needs a string object and not just a string literal thrown in there. The difference is that regex_replace is a template whose replacement parameter is a basic_string, and the compiler won't convert a literal (a const char array) into a string while it is deducing the template arguments. Wrapping the literal as string("happy") works too. The result of the code above is that it outputs this.

I'm a happy person.



Pretty self-explanatory. Note that regex_replace does return a string.
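Two things worth seeing in code: regex_replace swaps every match, not just the first, and in the default ECMAScript format the replacement can refer to capture groups as $1, $2 and so on. This is a sketch assuming the std:: spelling of the library (std::tr1:: on a TR1-only compiler), with replace_all as a made-up helper name.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: replaces every match of `pattern` in `text`.
// The replacement is taken as a std::string, which sidesteps the
// string-literal problem described above.
// Assumption: std::regex from <regex>; use std::tr1::regex on TR1-only setups.
std::string replace_all(const std::string& text,
                        const std::string& pattern,
                        const std::string& replacement) {
    return std::regex_replace(text, std::regex(pattern), replacement);
}
```

For example, replace_all("sad, sad day", "sad", "happy") gives "happy, happy day", and replace_all("pink tail", "(\\w+) (\\w+)", "$2 $1") gives "tail pink".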

That's it, we've covered the basics of the TR1 regex library. If you want more in-depth information, good luck, because only the MSDN currently has any useful information on it. Your best bet is the technical draft.
