Greetings all,
I hope this message finds you in good spirits. I am trying to find a quick tutorial on the \b expression (apologies if there is a better term). I am writing a script at the moment to parse some XML files, but have run into a bit of a speed bump. Here is an example of my XML:
<....></...><...></...><OrderId>123456</OrderId><...></...><CustomerId>44444444</CustomerId><...></...><...></...>
The <...></...> elements are unimportant, irrelevant XML. Focus primarily on CustomerId and OrderId.
My issue lies in parsing a string similar to the one above. I have a regexParse definition that works perfectly. However, it is not intuitive. I need to match only the part of the string that contains <CustomerId>44444444</CustomerId>.
My current setup is:
searchPattern = '>\d{8}</CustomerId'
Great! It works, but I want to do it the right way. My thinking is: 1) find 8 digits, 2) if the text at the word boundary after them is non-numeric and matches CustomerId, return the match.
Idea:
searchPattern = '\bd{16}\b'
My issue in my tests is incorporating the search for CustomerId somewhere before and after the digits. I was wondering if any of you can either help me out with my issue or point me in the right direction (a guide or something along those lines). Any help is appreciated.
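A minimal sketch of one way to tie the digits to the CustomerId tags, assuming Python's standard re module. The sample string and the names sample and customer_id_pattern are made up for illustration; the placeholder elements stand in for the elided XML. Note that \bd{16}\b, as written in the idea above, looks for sixteen literal "d" characters, because the d has no backslash of its own.

```python
import re

# Sample modeled on the post; <a>, <b>, <c> are placeholder elements (assumed).
sample = (
    "<a></a><OrderId>123456</OrderId><b></b>"
    "<CustomerId>44444444</CustomerId><c></c>"
)

# Option 1: name the tags in the pattern and capture only the eight digits.
customer_id_pattern = re.compile(r"<CustomerId>(\d{8})</CustomerId>")
match = customer_id_pattern.search(sample)
customer_id = match.group(1) if match else None

# Option 2: lookarounds keep the tags out of the match itself.
lookaround_match = re.search(r"(?<=<CustomerId>)\d{8}(?=</CustomerId>)", sample)
lookaround_id = lookaround_match.group(0) if lookaround_match else None

# \b alone only checks for a word/non-word transition, so this finds any
# standalone run of eight digits, whatever tag surrounds it.
any_eight_digits = re.findall(r"\b\d{8}\b", sample)

print(customer_id, lookaround_id, any_eight_digits)
```

The capture group is probably the easier of the two to read later. \b on its own cannot express "inside CustomerId", which is why the tag names still have to appear in the pattern.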
Mods, apologies if this is in the wrong forum. I wanted to post this in the Python discussion because I am not sure if Python regex supports this functionality.
Thanks again all,
darcmasta
Python Regex Support
2 Replies - 1193 Views - Last Post: 15 August 2012 - 06:27 PM
Replies To: Python Regex Support
#2
Re: Python Regex Support
Posted 14 August 2012 - 09:08 PM
This seems like it's purely a regular expression question... I'm thinking Computer Science forum.
EDIT:
I was thinking about changing your topic title accordingly, but I don't have the power to do that now that I've moved it! ack.
This post has been edited by atraub: 15 August 2012 - 01:40 PM
#3
Re: Python Regex Support
Posted 15 August 2012 - 06:27 PM
If the data is in XML, why not parse the data as XML? Why are you trying to do this parsing using a regular expression?
Whenever somebody wants to parse HTML with a regular expression, we tell them to use a real HTML parser. Why should the rules be any different for XML? If the XML DOM is too heavyweight, I hear that Python also supports SAX.
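As a rough illustration of the parser route, here is a sketch using xml.etree.ElementTree from the Python standard library. The <Order> root element and the exact document layout are assumptions, since the original post only shows a fragment with the other elements elided.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the <Order> root is assumed, not taken from the post.
xml_text = (
    "<Order>"
    "<OrderId>123456</OrderId>"
    "<CustomerId>44444444</CustomerId>"
    "</Order>"
)

root = ET.fromstring(xml_text)

# findtext returns the text of the first matching element, or None if absent.
customer_id = root.findtext(".//CustomerId")
order_id = root.findtext(".//OrderId")

print(customer_id, order_id)
```

With this approach the element name does the matching, so there is no need to count digits or worry about word boundaries. Validating that the value is eight digits can be a separate check afterwards.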