2 Replies - 1317 Views - Last Post: 15 August 2012 - 06:27 PM

#1 darcmasta

  • New D.I.C Head

Reputation: 4
  • Posts: 20
  • Joined: 27-September 11

Python Regex Support

Posted 14 August 2012 - 08:01 PM

Greetings all,

I hope this message finds you in good spirits. I am trying to find a quick tutorial on the \b expression (apologies if there is a better term). I am writing a script at the moment to parse some XML files, but have run into a bit of a speed bump. Here is an example of my XML:

<....></...><...></...><OrderId>123456</OrderId><...></...><CustomerId>44444444</CustomerId><...></...><...></...>

<...></...> is unimportant, irrelevant XML. Focus primarily on the CustomerId and OrderId.

My issue lies in parsing a string similar to the one above. I have a regexParse definition that works perfectly. However, it is not intuitive. I need to match only the part of the string that contains <CustomerId>44444444</CustomerId>.

My current setup is:
searchPattern = r'>\d{8}</CustomerId'

Great! It works, but I want to do it the right way. My thinking is: 1) find 8 digits; 2) if the text past the word boundary after them is non-numeric and matches CustomerId, return them.

Idea:
searchPattern = r'\b\d{8}\b'

My issue in my tests is incorporating the search for CustomerId somewhere before and after the digits. I was wondering if any of you can either help me out with my issue or point me in the right direction (a guide or something along those lines). Any help is appreciated.
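
A minimal sketch of how the CustomerId context could be folded into the pattern, assuming Python's standard `re` module. The `sample` string is hypothetical: the `Other` tags stand in for the elided markup, and the OrderId is deliberately made 8 digits long to show that `\b` by itself cannot tell the two ids apart.

```python
import re

# Hypothetical sample; <Other> replaces the elided tags, and the OrderId
# is 8 digits here only to demonstrate the limitation of \b.
sample = (
    "<Other>x</Other><OrderId>12345678</OrderId>"
    "<CustomerId>44444444</CustomerId><Other>y</Other>"
)

# Option 1: match the whole element, capture only the digits.
capture = re.compile(r"<CustomerId>(\d{8})</CustomerId>")
m = capture.search(sample)
customer_id = m.group(1) if m else None

# Option 2: lookbehind/lookahead, so the match itself is just the digits.
# Lookbehind in Python's re must be fixed-width, which "<CustomerId>" is.
lookaround = re.compile(r"(?<=<CustomerId>)\d{8}(?=</CustomerId>)")
n = lookaround.search(sample)
customer_id_2 = n.group(0) if n else None

# \b only marks an edge between word and non-word characters,
# so it matches every standalone 8-digit run, whatever tag surrounds it.
bounded = re.findall(r"\b\d{8}\b", sample)
```

Both options tie the digits to the surrounding tag name; the plain `\b` pattern does not, which is why it also picks up the 8-digit OrderId in this sample.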

Mods, apologies if this is in the wrong forum. I wanted to post this in the Python discussion because I am not sure whether Python regex supports this functionality.

Thanks again all,

darcmasta


Replies To: Python Regex Support

#2 atraub

  • Pythoneer

Reputation: 759
  • Posts: 2,010
  • Joined: 23-December 08

Re: Python Regex Support

Posted 14 August 2012 - 09:08 PM

This seems like it's purely a regular expression question... I'm thinking Computer Science forum.

EDIT:
I was thinking about changing your topic title accordingly, but I don't have the power to do that now that I've moved it! ack.

This post has been edited by atraub: 15 August 2012 - 01:40 PM


#3 Skydiver

  • Code herder

Reputation: 3590
  • Posts: 11,166
  • Joined: 05-May 12

Re: Python Regex Support

Posted 15 August 2012 - 06:27 PM

If the data is in XML, why not parse the data as XML? Why are you trying to do this parsing using a regular expression?
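
As a rough illustration of parsing the data as XML, here is a sketch using the standard library's `xml.etree.ElementTree`. The `<Order>` root and the `<Note>` element are assumptions standing in for the markup elided in the original post; the real files may be shaped differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: <Order> and <Note> are placeholders for the
# elided parts of the poster's XML.
xml_text = (
    "<Order><Note>n/a</Note><OrderId>123456</OrderId>"
    "<CustomerId>44444444</CustomerId></Order>"
)

root = ET.fromstring(xml_text)

# findtext returns the text of the first matching direct child, or None.
order_id = root.findtext("OrderId")
customer_id = root.findtext("CustomerId")

# If the element may sit deeper in the tree, walk all descendants instead.
all_customer_ids = [el.text for el in root.iter("CustomerId")]
```

With a parser, the lookup is by element name, so the number of digits and the neighbouring tags no longer matter.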

When people want to parse HTML with a regular expression, we tell them to use a real HTML parser. Why should the rules be any different for XML? If the XML DOM is too heavyweight, I hear that Python also supports SAX.
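
For the SAX route, a minimal sketch with the standard library's `xml.sax`, assuming Python 3. The `CustomerIdHandler` class and the `<Order>` root element are hypothetical names introduced for illustration only.

```python
import xml.sax


class CustomerIdHandler(xml.sax.ContentHandler):
    """Collects the text of every <CustomerId> element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.customer_ids = []
        self._buffer = None  # a list while inside <CustomerId>, else None

    def startElement(self, name, attrs):
        if name == "CustomerId":
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so accumulate it.
        if self._buffer is not None:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "CustomerId" and self._buffer is not None:
            self.customer_ids.append("".join(self._buffer))
            self._buffer = None


# Hypothetical input; <Order> stands in for the real root element.
xml_bytes = (
    b"<Order><OrderId>123456</OrderId>"
    b"<CustomerId>44444444</CustomerId></Order>"
)

handler = CustomerIdHandler()
xml.sax.parseString(xml_bytes, handler)
```

SAX streams through the document and never builds a tree, which may suit large files; the trade-off is that the handler has to track its own state.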
