5 Replies - 367 Views - Last Post: 19 May 2019 - 09:48 AM

#1 RushabhVerma

How do I get the same result of Regex in C# as well as text editor?

Posted 16 May 2019 - 11:00 PM

I have an XML file with the following data:
<navPoint id="navPoint-1" playOrder="1">
  <navLabel>
    <text>Cover</text>
  </navLabel>
  <content src="Text/01_Cover.xhtml"/>
</navPoint>
<navPoint id="navPoint-2" playOrder="2">
  <navLabel>
    <text>Titelblatt</text>
  </navLabel>
  <content src="Text/02_Titlepage.xhtml#Titlepage"/>
</navPoint>
<navPoint id="navPoint-3" playOrder="3">
  <navLabel>
    <text>Urheberrechte</text>
  </navLabel>
  <content src="Text/03_Copyright.xhtml#Copyright"/>
</navPoint>
<navPoint id="navPoint-4" playOrder="4">
  <navLabel>
    <text>Die S&#x00FC;nde Macht Den Menschen Menschlich, Die Liebe Macht Ihn G&#x00F6;ttlich</text>
  </navLabel>
  <content src="Text/04_FmChapter01.xhtml#FmChapter01"/>
</navPoint>
<navPoint id="navPoint-5" playOrder="5">
  <navLabel>
    <text>Vorwort</text>
  </navLabel>
  <content src="Text/05_Vorwort.xhtml#Vorwort"/>
</navPoint>
<navPoint id="navPoint-6" playOrder="6">
  <navLabel>
    <text>Ira</text>
  </navLabel>
  <content src="Text/06_Part01.xhtml#Part01"/>
  <navPoint id="navPoint-7" playOrder="7">
    <navLabel>
      <text>Zorn</text>
    </navLabel>
    <content src="Text/07_Chapter01.xhtml#Chapter01"/>
  </navPoint>
  <navPoint id="navPoint-8" playOrder="8">
    <navLabel>
      <text>Erfahrung J&#x00E4;hzorn</text>
    </navLabel>
    <content src="Text/08_Chapter02.xhtml#Chapter02"/>
  </navPoint>
</navPoint>
<navPoint id="navPoint-9" playOrder="9">
  <navLabel>
    <text>Luxuria</text>
  </navLabel>
  <content src="Text/09_Part02.xhtml#Part02"/>
  <navPoint id="navPoint-10" playOrder="10">
    <navLabel>
      <text>Wollust</text>
    </navLabel>
    <content src="Text/10_Chapter03.xhtml#Chapter03"/>
  </navPoint>
  <navPoint id="navPoint-11" playOrder="11">
    <navLabel>
      <text>Erfahrung Wollust</text>
    </navLabel>
    <content src="Text/11_Chapter04.xhtml#Chapter04"/>
  </navPoint>
</navPoint>
<navPoint id="navPoint-12" playOrder="12">
  <navLabel>
    <text>Avaritia</text>
  </navLabel>
  <content src="Text/12_Part03.xhtml#Part03"/>
  <navPoint id="navPoint-13" playOrder="13">
    <navLabel>
      <text>Geiz</text>
    </navLabel>
    <content src="Text/13_Chapter05.xhtml#Chapter05"/>
  </navPoint>
  <navPoint id="navPoint-14" playOrder="14">
    <navLabel>
      <text>Erfahrung Geiz</text>
    </navLabel>
    <content src="Text/14_Chapter06.xhtml#Chapter06"/>
  </navPoint>
</navPoint>
<navPoint id="navPoint-15" playOrder="15">
  <navLabel>
    <text>Ac&#x00E9;dia</text>
  </navLabel>
  <content src="Text/15_Part04.xhtml#Part04"/>
  <navPoint id="navPoint-16" playOrder="16">
    <navLabel>
      <text>Tr&#x00E4;gheit</text>
    </navLabel>
    <content src="Text/16_Chapter07.xhtml#Chapter07"/>
  </navPoint>
  <navPoint id="navPoint-17" playOrder="17">
    <navLabel>
      <text>Erfahrung Faulheit</text>
    </navLabel>
    <content src="Text/17_Chapter08.xhtml#Chapter08"/>
  </navPoint>
</navPoint>
<navPoint id="navPoint-18" playOrder="18">
  <navLabel>
    <text>Invidia</text>
  </navLabel>
  <content src="Text/18_Part05.xhtml#Part05"/>
  <navPoint id="navPoint-19" playOrder="19">
    <navLabel>
      <text>Neid</text>
    </navLabel>
    <content src="Text/19_Chapter09.xhtml#Chapter09"/>
  </navPoint>
  <navPoint id="navPoint-20" playOrder="20">
    <navLabel>
      <text>Erfahrung Neid</text>
    </navLabel>
    <content src="Text/20_Chapter10.xhtml#Chapter10"/>
  </navPoint>
</navPoint>
<navPoint id="navPoint-21" playOrder="21">
  <navLabel>
    <text>Gula</text>
  </navLabel>
  <content src="Text/21_Part06.xhtml#Part06"/>
  <navPoint id="navPoint-22" playOrder="22">
    <navLabel>
      <text>V&#x00F6;llerei</text>
    </navLabel>
    <content src="Text/22_Chapter11.xhtml#Chapter11"/>
  </navPoint>
  <navPoint id="navPoint-23" playOrder="23">
    <navLabel>
      <text>Achtung V&#x00F6;llerei</text>
    </navLabel>
    <content src="Text/23_Chapter12.xhtml#Chapter12"/>
  </navPoint>
  <navPoint id="navPoint-24" playOrder="24">
    <navLabel>
      <text>Erfahrung V&#x00F6;llerei</text>
    </navLabel>
    <content src="Text/24_Chapter13.xhtml#Chapter13"/>
  </navPoint>
</navPoint>
<navPoint id="navPoint-25" playOrder="25">
  <navLabel>
    <text>Superbia</text>
  </navLabel>
  <content src="Text/25_Part07.xhtml#Part07"/>
  <navPoint id="navPoint-26" playOrder="26">
    <navLabel>
      <text>Hochmut</text>
    </navLabel>
    <content src="Text/26_Chapter14.xhtml#Chapter14"/>
  </navPoint>
  <navPoint id="navPoint-27" playOrder="27">
    <navLabel>
      <text>Erfahrung Hochmut</text>
    </navLabel>
    <content src="Text/27_Chapter15.xhtml#Chapter15"/>
  </navPoint>
</navPoint>
<navPoint id="navPoint-28" playOrder="28">
  <navLabel>
    <text>Literatur Zu Den 7 Tods&#x00FC;nden</text>
  </navLabel>
  <content src="Text/28_Literatur.xhtml#Literatur"/>
</navPoint>
<navPoint id="navPoint-29" playOrder="29">
  <navLabel>
    <text>Inhalt</text>
  </navLabel>
  <content src="Text/29_Contents.xhtml#Contents"/>
</navPoint>



I need the following output:

<li id="NavPoint-#">
  <a href="Text/01_Cover.xhtml">Cover</a>
</li>
<li id="NavPoint-#">
  <a href="Text/02_Titlepage.xhtml#Titlepage">Titelblatt</a>
</li>
<li id="NavPoint-#">
  <a href="Text/03_Copyright.xhtml#Copyright">Urheberrechte</a>
</li>
<li id="NavPoint-#">
  <a href="Text/04_FmChapter01.xhtml#FmChapter01">Die S&#x00FC;nde Macht Den Menschen Menschlich, Die Liebe Macht Ihn G&#x00F6;ttlich</a>
</li>
<li id="NavPoint-#">
  <a href="Text/05_Vorwort.xhtml#Vorwort">Vorwort</a>
</li>
<li id="NavPoint-#">
  <a href="Text/06_Part01.xhtml#Part01">Ira</a>
  <ol>
    <li id="NavPoint-#">
      <a href="Text/07_Chapter01.xhtml#Chapter01">Zorn</a>
    </li>
    <li id="NavPoint-#">
      <a href="Text/08_Chapter02.xhtml#Chapter02">Erfahrung J&#x00E4;hzorn</a>
    </li>
  </ol>
</li>
<li id="NavPoint-#">
  <a href="Text/09_Part02.xhtml#Part02">Luxuria</a>
  <ol>
    <li id="NavPoint-#">
      <a href="Text/10_Chapter03.xhtml#Chapter03">Wollust</a>
    </li>
    <li id="NavPoint-#">
      <a href="Text/11_Chapter04.xhtml#Chapter04">Erfahrung Wollust</a>
    </li>
  </ol>
</li>
<li id="NavPoint-#">
  <a href="Text/12_Part03.xhtml#Part03">Avaritia</a>
  <ol>
    <li id="NavPoint-#">
      <a href="Text/13_Chapter05.xhtml#Chapter05">Geiz</a>
    </li>
    <li id="NavPoint-#">
      <a href="Text/14_Chapter06.xhtml#Chapter06">Erfahrung Geiz</a>
    </li>
  </ol>
</li>
<li id="NavPoint-#">
  <a href="Text/15_Part04.xhtml#Part04">Ac&#x00E9;dia</a>
  <ol>
    <li id="NavPoint-#">
      <a href="Text/16_Chapter07.xhtml#Chapter07">Tr&#x00E4;gheit</a>
    </li>
    <li id="NavPoint-#">
      <a href="Text/17_Chapter08.xhtml#Chapter08">Erfahrung Faulheit</a>
    </li>
  </ol>
</li>
<li id="NavPoint-#">
  <a href="Text/18_Part05.xhtml#Part05">Invidia</a>
  <ol>
    <li id="NavPoint-#">
      <a href="Text/19_Chapter09.xhtml#Chapter09">Neid</a>
    </li>
    <li id="NavPoint-#">
      <a href="Text/20_Chapter10.xhtml#Chapter10">Erfahrung Neid</a>
    </li>
  </ol>
</li>
<li id="NavPoint-#">
  <a href="Text/21_Part06.xhtml#Part06">Gula</a>
  <ol>
    <li id="NavPoint-#">
      <a href="Text/22_Chapter11.xhtml#Chapter11">V&#x00F6;llerei</a>
    </li>
    <li id="NavPoint-#">
      <a href="Text/23_Chapter12.xhtml#Chapter12">Achtung V&#x00F6;llerei</a>
    </li>
    <li id="NavPoint-#">
      <a href="Text/24_Chapter13.xhtml#Chapter13">Erfahrung V&#x00F6;llerei</a>
    </li>
  </ol>
</li>
<li id="NavPoint-#">
  <a href="Text/25_Part07.xhtml#Part07">Superbia</a>
  <ol>
    <li id="NavPoint-#">
      <a href="Text/26_Chapter14.xhtml#Chapter14">Hochmut</a>
    </li>
    <li id="NavPoint-#">
      <a href="Text/27_Chapter15.xhtml#Chapter15">Erfahrung Hochmut</a>
    </li>
  </ol>
</li>
<li id="NavPoint-#">
  <a href="Text/28_Literatur.xhtml#Literatur">Literatur Zu Den 7 Tods&#x00FC;nden</a>
</li>
<li id="NavPoint-#">
  <a href="Text/29_Contents.xhtml#Contents">Inhalt</a>
</li>


I am getting the desired output when using Regex replacements in notepad, regex101 and other text editors. But in C#, I am getting a different output. I am unable to figure out the issue. Is something wrong with C# regex?


I am using the following regex replacements:

Editor Regex:

Pattern 1: "<navPoint id="navPoi[^"]+" playOrder="[^"]+"><navLabel><text>([^<>\r\n]+)</text></navLabel><content src="([^<>\r\n]+)"/></navPoint>"
Substitution 1: "<li id="NavPoint-#"><a href="$2">$1</a></li>"

Pattern 2: "<navPoint id="navPoi[^"]+" playOrder="[^"]+"><navLabel><text>([^<>\r\n]+)</text></navLabel><content src="([^<>\r\n]+)"/>$"
Substitution 2: "<li id="NavPoint-#"><a href="$2">$1</a>\r\n<ol>"

Pattern 3: "</navPoint>"
Substitution 3: "</ol></li>"


C# Regex:

// navMapValues holds the XML shown above (declared elsewhere).
RegexOptions options = RegexOptions.Multiline;

// Step 1: leaf navPoints (no nested children) become a complete <li>.
string firstPattern = @"<navPoint id=""navPoi[^""]+"" playOrder=""[^""]+""><navLabel><text>(.+)<\/text><\/navLabel><content src=""([^<>\r?\n]+)""\/><\/navPoint>";
string firstSubstitution = @"<li id=""NavPoint-#""><a href=""$2"">$1</a></li>";
Regex firstRegex = new Regex(firstPattern, options);
string newNavMap = firstRegex.Replace(navMapValues, firstSubstitution);

// Step 2: parent navPoints open an <li> followed by a nested <ol>.
string secondPattern = @"<navPoint id=""navPoi[^""]+"" playOrder=""[^""]+""><navLabel><text>(.+)<\/text><\/navLabel><content src=""([^<>\r?\n]+)""\/>";
string secondSubstitution = @"<li id=""NavPoint-#""><a href=""$2"">$1</a>" + Environment.NewLine + "<ol>";
Regex secondRegex = new Regex(secondPattern, options);
string anotherNavMap = secondRegex.Replace(newNavMap, secondSubstitution);

// Step 3: the remaining closing tags belong to parents, so close the <ol> and <li>.
string thirdPattern = @"</navPoint>";
string thirdSubstitution = @"</ol></li>";
Regex thirdRegex = new Regex(thirdPattern, options);
string finalNavMap = thirdRegex.Replace(anotherNavMap, thirdSubstitution);
finalNavMap = finalNavMap.Replace("\r\n</ol></li>", "</ol></li>");

This post has been edited by astonecipher: 17 May 2019 - 07:09 AM
Reason for edit:: Fixed things.



Replies To: How do I get the same result of Regex in C# as well as text editor?

#2 Skydiver

Posted 17 May 2019 - 04:01 AM

In general, using regexes to manipulate HTML or XML is a poor idea unless you are doing very simple things. You should be using the appropriate tool for the job. In this case, since it is XML, you should be using XSLT. If the learning curve for XSLT is too steep, do the manipulations using the XML DOM (document object model) by loading the XML into an XDocument or XmlDocument.
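
To make the DOM route concrete, here is a minimal sketch using LINQ to XML. It is an illustration under assumptions, not code from the original post: it assumes the navPoint elements carry no XML namespace (as in the sample above), and the names NavMapConverter and ToListItem are made up for the example.

```csharp
using System.Linq;
using System.Xml.Linq;

public static class NavMapConverter
{
    // Converts one <navPoint> into an <li>, recursing into nested navPoints.
    // Assumes un-namespaced elements, as in the sample posted above.
    public static XElement ToListItem(XElement navPoint)
    {
        // Missing <content> or <text> elements fall back to an empty string.
        string href = (string)navPoint.Element("content")?.Attribute("src") ?? "";
        string label = (string)navPoint.Element("navLabel")?.Element("text") ?? "";

        var li = new XElement("li",
            new XAttribute("id", "NavPoint-#"),
            new XElement("a", new XAttribute("href", href), label));

        // Only parents get an <ol>; leaf entries stay a plain <li>.
        var children = navPoint.Elements("navPoint").Select(ToListItem).ToList();
        if (children.Count > 0)
        {
            li.Add(new XElement("ol", children));
        }
        return li;
    }
}
```

Calling ToListItem on each top-level navPoint and writing out the results with ToString() should give the nested li/ol structure. One difference to expect: character references such as &#x00FC; are decoded while parsing, so they come back out as the literal character (ü) unless the output is re-encoded.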

#3 Sheepings

Posted 17 May 2019 - 06:50 AM

Skydiver, would you mind editing the OP's post to remove the XML duplication (it appears three times) and enclose the code in tags, please?

OP, please post your code in code tags. Following the posting rules saves the moderators unnecessary work.
[code] code here [ /code]

I also agree with Skydiver. I use regex a lot, and this is one of those times where you wouldn't use it. Instead I'd probably use XDocument. Once your post is cleaned up, I'll take a look for you.

#4 Sheepings

Posted 17 May 2019 - 09:06 AM

Your XML is not compliant: it contains multiple roots and errors. Is it safe to assume this is why you're using regex over the appropriate provider classes? Lol. Where did you get your patterns from? Good god, let's step back. Is it possible to fix/structure your XML doc first? That would be where I would start. Prevention is better than cure. And if it's still needed, I can guide you on replacing the file's bits n' bobs.
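
If fixing the source file is not an option, one workaround for the multiple-root problem is to wrap the fragment in a synthetic root before parsing. This is only a sketch: it assumes the fragment has no XML declaration or DOCTYPE and is otherwise well-formed, and FragmentLoader, LoadTopLevelElements, and the "root" wrapper name are hypothetical.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class FragmentLoader
{
    // Wraps a multi-root XML fragment in a synthetic <root> element so that
    // XElement.Parse accepts it, then returns the original top-level elements.
    // Throws System.Xml.XmlException if the fragment itself is not well-formed.
    public static List<XElement> LoadTopLevelElements(string fragment)
    {
        XElement wrapper = XElement.Parse("<root>" + fragment + "</root>");
        return wrapper.Elements().ToList();
    }
}
```

The returned elements can then be processed one by one with the usual XElement methods, and any remaining errors in the file surface as a parse exception instead of a silently wrong regex match.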

#5 Skydiver

Posted 18 May 2019 - 08:34 PM

RushabhVerma, on 17 May 2019 - 02:00 AM, said:

I am getting the desired output when using Regex replacements in notepad, regex101 and other text editors. But in C#, I am getting a different output. I am unable to figure out the issue. Is something wrong with C# regex?

As far as I know, Notepad doesn't have regular expressions support for Find and Replace. Are you sure you are using Notepad?

Regex101 doesn't support .NET Framework regular expressions. Did you try using regexstorm instead?
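
Two .NET behaviours could plausibly explain a difference like the one described, although which one applies depends on the actual input, which was not posted in its flattened form. First, the C# patterns use a greedy (.+) where the editor patterns use ([^<>\r\n]+), so one match can run across several entries on the same line. Second, with RegexOptions.Multiline, $ matches only before '\n', not before '\r', so a pattern ending in $ fails on Windows line endings unless it is written as \r?$. The sketch below is an illustration only; DotNetRegexDemo and its members are made-up names, and the sample strings are simplified stand-ins for the real data.

```csharp
using System.Text.RegularExpressions;

public static class DotNetRegexDemo
{
    // Counts matches using the same option as the code in the first post.
    public static int Count(string input, string pattern) =>
        Regex.Matches(input, pattern, RegexOptions.Multiline).Count;

    // Two entries on one line: a greedy (.+) can run across the tags between them.
    public const string OneLine =
        "<text>A</text><content src=\"a\"/><text>B</text><content src=\"b\"/>";

    // Two entries separated by Windows (CRLF) line endings.
    public const string Crlf =
        "<content src=\"a\"/>\r\n<content src=\"b\"/>\r\n";

    // Same pattern twice; only the first capture group differs.
    public const string GreedyPattern = @"<text>(.+)</text><content src=""([^<>]+)""/>";
    public const string BoundedPattern = @"<text>([^<>]+)</text><content src=""([^<>]+)""/>";
}
```

Count(OneLine, GreedyPattern) returns 1 because the single match swallows both entries, while Count(OneLine, BoundedPattern) returns 2. Likewise, Count(Crlf, @"/>$") returns 0, while Count(Crlf, @"/>\r?$") returns 2.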

#6 Martyr2

Posted 19 May 2019 - 09:48 AM

I have to wholeheartedly agree with Skydiver on using XSLT or loading the XML through a DOM object. Regex is just not going to cut it later down the road, and it will be a terrible maintenance nightmare.

This post has been edited by Martyr2: 19 May 2019 - 09:48 AM
