4 Replies - 367 Views - Last Post: 19 June 2013 - 09:20 AM

#1 midasxl

What does this regex replace do?

Posted 19 June 2013 - 06:58 AM

Hello, and thanks for your time!

I have been trying to decipher this, and would greatly appreciate any explanation of what this is actually doing. I understand the caret inside square brackets is serving to negate the expression, but I don't understand what it is looking to NOT replace:

function removeBadCharacters() {
    var searchString = document.forms[0].SearchString.value;
    searchString = searchString.replace(/[^*-z, -%, ']+/g, "");
    document.forms[0].SearchString.value = searchString;
}



This function is called on form submission with a single input text field named "SearchString".
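
For anyone who wants to experiment with the pattern outside the browser, here is a minimal sketch that applies the same regex to a plain string. The helper name stripBadCharacters and the sample input are made up for illustration and are not part of the original page.

```javascript
// Hypothetical helper: the same regex as removeBadCharacters, without the DOM access.
function stripBadCharacters(value) {
  return value.replace(/[^*-z, -%, ']+/g, "");
}

// Made-up sample input, just to see which characters disappear.
const cleaned = stripBadCharacters("tom & jerry (1940) {cartoon}");
console.log(cleaned); // tom  jerry 1940 cartoon
```

Note the doubled space after "tom": the ampersand is removed, but the spaces on either side of it are kept.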

Thanks!


Replies To: What does this regex replace do?

#2 BetaWar

Re: What does this regex replace do?

Posted 19 June 2013 - 08:11 AM

It looks like it gets rid of the following characters: {}~|()&
At least I haven't been able to find any other characters that the pattern matches. When I have difficulty with a pattern I normally use regexpal.com, which can be quite useful. If we break the character class down, it lists everything from * through lower-case z (*-z), everything from ' ' through % ( -%), commas, and single quotes. The leading ^ then negates that, so the pattern matches anything except those characters. The commas and the second space are redundant, because the comma already falls inside *-z and the space inside the ' ' to % range, so the pattern can be simplified to this: [^*-z -%']
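
One way to check both claims is to run every printable ASCII character through the original and the simplified pattern and see which ones get removed. This is only a sketch: the variable and function names are invented for the example, and it deliberately looks at printable ASCII (0x20 to 0x7E) only.

```javascript
const original = /[^*-z, -%, ']+/g;
const simplified = /[^*-z -%']+/g;

// Build a string holding every printable ASCII character exactly once.
let printable = "";
for (let code = 0x20; code <= 0x7e; code++) {
  printable += String.fromCharCode(code);
}

// A character was removed if it no longer appears after the replace.
function removedBy(pattern) {
  const kept = printable.replace(pattern, "");
  return [...printable].filter((ch) => !kept.includes(ch)).join("");
}

const removedOriginal = removedBy(original);
const removedSimplified = removedBy(simplified);
console.log(removedOriginal);   // &(){|}~
console.log(removedSimplified); // &(){|}~
```

Both patterns strip the same seven printable characters, which matches the list above.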

Hope that helps.

#3 jon.kiparsky

Posted 19 June 2013 - 08:21 AM

searchString = "1234567890-=!@$#%^*&()_+[]\{}|;':\",./";
"1234567890-=!@$#%^*&()_+[]{}|;':",./"
searchString = searchString.replace(/[^*-z, -%, ']+/g, "");
"1234567890-=!@$#%^*_+[];':",./"



When in doubt, run it and see.

Looks like it's killing the ampersand, parens, and curly brackets.


De-negating it shows that I missed one, the pipe character:

searchString = "1234567890-=!@$#%^*&()_+[]\{}|;':\",./\\abcdefGHIJKL";
"1234567890-=!@$#%^*&()_+[]{}|;':",./\abcdefGHIJKL"
searchString = searchString.replace(/[*-z, -%, ']+/g, "");
"&(){}|"




Now if you look at an ASCII table, you'll see what's happening. The character class is negated, so the pattern matches any character outside this set: every ASCII character from x2A ('*') to x7A ('z'), plus those from x20 (' ') through x25 ('%'), plus x27 (the single-quote character, ').
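
The same set can be written with hex escapes, which makes the ASCII ranges explicit. The sketch below assumes nothing beyond standard JavaScript regex syntax; the sample string is made up. It also shows a side effect of the negation: anything outside those ranges goes too, including tabs, newlines, and non-ASCII letters.

```javascript
const original = /[^*-z, -%, ']+/g;
// Same character set spelled out as ASCII ranges: 0x20-0x25, 0x27, 0x2A-0x7A.
const hexForm = /[^\x20-\x25\x27\x2A-\x7A]+/g;

// Made-up sample containing a non-ASCII letter, a tab, and a newline.
const sample = "caf\u00e9\tO'Brien & (sons) {100%}\n";

const viaOriginal = sample.replace(original, "");
const viaHexForm = sample.replace(hexForm, "");
console.log(JSON.stringify(viaOriginal)); // "cafO'Brien  sons 100%"
console.log(viaOriginal === viaHexForm);  // true
```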

(EDIT: well, there's me beaten to the punch)

This post has been edited by jon.kiparsky: 19 June 2013 - 08:24 AM


#4 midasxl

Posted 19 June 2013 - 09:11 AM

Great information! Thanks!

A couple of things that were not evident to me:

I thought the * was not treated as a special character when it was inside square brackets, so I thought the * was a literal character. I didn't realize there was a space in the -% range. (As in 'space' TO %).

Thanks everyone, this really clears things up for me!

Cheers!

#5 jon.kiparsky

Posted 19 June 2013 - 09:20 AM

midasxl, on 19 June 2013 - 11:11 AM, said:

I thought the * was not treated as a special character when it was inside square brackets, so I thought the * was a literal character.


Just to be clear, it is in fact being treated as a literal here. *-z represents the range from '*' through 'z', which starts at ASCII x2A and extends through x7A.
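
A quick way to convince yourself of this, sketched with made-up variable names: look up the character codes of the two endpoints, then test a few characters on either side of the range.

```javascript
// Endpoints of the range, as character codes.
const starCode = "*".charCodeAt(0);
const zCode = "z".charCodeAt(0);
console.log(starCode.toString(16), zCode.toString(16)); // 2a 7a

// Anchored so that each test checks exactly one character.
const starToZ = /^[*-z]$/;
console.log(starToZ.test("*")); // true  (0x2A, start of the range)
console.log(starToZ.test("+")); // true  (0x2B, inside the range)
console.log(starToZ.test(")")); // false (0x29, just below the range)
console.log(starToZ.test("{")); // false (0x7B, just above the range)

// Inside brackets, * is not a quantifier: [*] matches exactly one asterisk.
const literalStar = /^[*]$/;
console.log(literalStar.test("*"));  // true
console.log(literalStar.test("**")); // false
```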

The most confusing thing about this, for me, is that the space character is a literal here, not a separator. The sequence " -%" (space, hyphen, percent sign) looked for all the world like a flag or a modifier to me, and that threw me off for a few minutes until my brain clicked over into Larry Wall's world and it all made sense.
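
To see the space acting as the start of a range, here is a small sketch (names invented for the example) that lists every printable ASCII character matched by the class [ -%] on its own.

```javascript
// The class from the pattern, isolated: space (0x20) through % (0x25).
const spaceToPercent = /[ -%]/;

let matched = "";
for (let code = 0x20; code <= 0x7e; code++) {
  const ch = String.fromCharCode(code);
  if (spaceToPercent.test(ch)) {
    matched += ch;
  }
}

// JSON.stringify makes the leading space visible in the output.
console.log(JSON.stringify(matched)); // " !\"#$%"
```

Six characters match: the space itself, then ! " # $ and %. The ampersand (0x26) is the first one left out, which is why it gets stripped by the negated pattern.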


Quote

Thanks everyone, this really clears things up for me!


Glad I could help - it's always interesting to review regex.

This post has been edited by jon.kiparsky: 19 June 2013 - 09:21 AM
