RegExp pattern to escape ALL special characters (but exclude unicode chars)
Hi,
I'd like to write a regexp that converts all special chars to "-".
I've used this pattern
[^a-z0-9]
with ignore case, and it works beautifully.
BUT - I also want to support unicode chars (and not escape them).
I could not find a way to do it, except for listing all special
characters "manually" and escaping them. I'd rather prepare a
"whitelist" of the chars allowed than a "blacklist" of all special
chars.
Any ideas?
Thanx,
Gabi
Gabriela
12/22/2008 3:39:53 PM

Gabriela wrote:
> I'd like to write a regexp that converts all special chars to "-".
> I've used this pattern
> [^a-z0-9]
> with ignore case, and it works beautifully.
> BUT - I also want to support unicode chars (and not escape them).
> I could not find a way to do it, except for listing all special
> characters "manually" and escaping them. I'd rather prepare a
> "whitelist" of the chars allowed than a "blacklist" of all special
> chars.
Well, in a language where a string is a sequence of Unicode characters,
any character is a Unicode character, so I have no idea which kinds of
characters you want to convert and which you do not.
--
Martin Honnen
http://JavaScript.FAQTs.com/
Martin
12/22/2008 3:57:09 PM

On Dec 22, 5:57 pm, Martin Honnen <mahotr...@yahoo.de> wrote:
> Gabriela wrote:
> > I'd like to write a regexp that converts all special chars to "-".
> > I've used this pattern
> > [^a-z0-9]
> > with ignore case, and it works beautifully.
> > BUT - I also want to support unicode chars (and not escape them).
> > I could not find a way to do it, except for listing all special
> > characters "manually" and escaping them. I'd rather prepare a
> > "whitelist" of the chars allowed than a "blacklist" of all special
> > chars.
>
> Well, in a language where a string is a sequence of Unicode characters,
> any character is a Unicode character, so I have no idea which kinds of
> characters you want to convert and which you do not.
>
> --
>
> Martin Honnen
> http://JavaScript.FAQTs.com/
Isn't there a distinction between a special character (!@#$%^&*()_-
+..._) and all alphanumeric/literal characters?
Gabriela
12/22/2008 4:25:47 PM

Gabriela wrote:
> Isn't there a distinction between a special character (!@#$%^&*()_-
> +..._) and all alphanumeric/literal characters?
Maybe you are looking for letters and digits. Unicode defines classes
for that, but the regular expression language in JavaScript/ECMAScript
does not have much support for such constructs.
\d is defined as 0..9, \D as anything which is not in \d.
\w is defined as a..zA..Z0..9_, \W as anything which is not in \w.
Then there is \s for white space characters, and \S for anything that is
not a white space character.
Other than that, you need to define your own ranges of characters.
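
As a rough illustration of defining your own ranges, the sketch below keeps ASCII letters and digits plus a hand-picked block of accented Latin letters, and replaces everything else with "-". The names NOT_ALLOWED and replaceSpecialChars, and the chosen range (U+00C0 to U+024F without the multiplication and division signs), are assumptions made for this example and do not come from the thread; other scripts would need their own ranges added.

```javascript
// Matches every character that is NOT on the whitelist.
// Whitelist: ASCII letters and digits, plus accented Latin letters
// (Latin-1 Supplement through Latin Extended-B). The gaps in the range
// skip U+00D7 (multiplication sign) and U+00F7 (division sign).
// The names and the chosen range are illustrative assumptions.
const NOT_ALLOWED = /[^a-zA-Z0-9\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u024F]/g;

function replaceSpecialChars(text) {
  // Each disallowed character is replaced by one "-".
  return text.replace(NOT_ALLOWED, "-");
}

console.log(replaceSpecialChars("Caf\u00E9 & Bar!")); // Café---Bar-
```

Note that "_" is replaced here as well, because unlike \w the hand-written class does not include it.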
--
Martin Honnen
http://JavaScript.FAQTs.com/
Martin
12/22/2008 4:36:01 PM

Gabriela wrote:
> Hi,
> I'd like to write a regexp that converts all special chars to "-".
> I've used this pattern
> [^a-z0-9]
> with ignore case, and it works beautifully.
> BUT - I also want to support unicode chars (and not escape them).
> I could not find a way to do it, except for listing all special
> characters "manually" and escaping them. I'd rather prepare a
> "whitelist" of the chars allowed than a "blacklist" of all special
> chars.
> Any ideas?
> Thanx,
> Gabi
Just remember, it's better to deny all by default and then specifically
have a list (whitelist) of allowed characters, than to try and
specifically list all invalid characters.
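
One possible way to write such a whitelist without listing ranges by hand is to use Unicode property escapes. These were added in ECMAScript 2018, so they were not available when this thread was written, and they need an engine that supports the "u" flag. The sketch below allows any letter (\p{L}) or number (\p{N}) and replaces everything else; the names NOT_LETTER_OR_NUMBER and toDashes are hypothetical.

```javascript
// Deny by default: anything that is not a Unicode letter or number
// becomes "-". Requires the "u" flag (ECMAScript 2018 or later).
// Note: combining marks (\p{M}) are not whitelisted here, so text in
// decomposed form would have its combining accents replaced;
// add \p{M} to the class if that matters.
const NOT_LETTER_OR_NUMBER = /[^\p{L}\p{N}]/gu;

function toDashes(text) {
  return text.replace(NOT_LETTER_OR_NUMBER, "-");
}

console.log(toDashes("Привет, мир!")); // Привет--мир-
```

With the "u" flag the pattern works on whole code points, so a character outside the Basic Multilingual Plane is replaced by a single "-" rather than two.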
--
Tim Greer, CEO/Founder/CTO, BurlyHost.com, Inc.
Shared Hosting, Reseller Hosting, Dedicated & Semi-Dedicated servers
and Custom Hosting. 24/7 support, 30 day guarantee, secure servers.
Industry's most experienced staff! -- Web Hosting With Muscle!
Tim
12/22/2008 6:04:19 PM