regex with accents

Hi,

I can't get the characters with accents in a regex. This is my code :
 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
var MyText1 = "éléphant1" ;
var MyText2 = "elephant1" ;
var MyReg = /^[\w]+$/ ;

if(MyReg.test(MyText1))
    alert(MyText1 + " is OK") ;
else
    alert(MyText1 + " is not valid") ;


if(MyReg.test(MyText2))
    alert(MyText2 + " is OK") ;
else
    alert(MyText2 + " is not valid") ;
 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Here's what I get :
éléphant1 is not valid
elephant1 is OK

I'd like éléphant1 to be OK, but I can't.
Can you help me ?

Thanks in advance,

Albert 


albert
9/22/2007 1:24:03 PM

albert wrote:
> I can't get the characters with accents in a regex. This is my code :
>  - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> var MyText1 = "éléphant1" ;
> var MyText2 = "elephant1" ;
> var MyReg = /^[\w]+$/ ;

>  - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> Here's what I get :
> éléphant1 is not valid
> elephant1 is OK
> 
> I'd like éléphant1 to be OK, but I can't.
> Can you help me ?

ECMA262 15.10.2.12 defines \w as being equivalent to the character class 
[0-9A-Za-z_]. The w suggests word, but that is deceptive. Support for 
internationalization in JavaScript's RegExp is virtually nonexistent.

You need to define your own character class.
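A minimal sketch of what that definition means in practice, assuming the ASCII-only set of letters, digits, and underscore. The variable names are made up for illustration, and the accented string is written with \u escapes so that the file encoding cannot interfere:

```javascript
// \w covers only ASCII letters, digits, and underscore,
// so it behaves like the explicit class written out below.
var wordEscape = /^\w+$/;
var explicitClass = /^[0-9A-Za-z_]+$/;

// The second sample is "éléphant1" spelled with \u escapes.
var samples = ["elephant1", "\u00E9l\u00E9phant1"];
var results = [];
for (var i = 0; i < samples.length; i++) {
  results.push([wordEscape.test(samples[i]), explicitClass.test(samples[i])]);
}
// results[0] is [true, true]; results[1] is [false, false]
```

Both patterns agree on each sample, which is why the accented word is rejected.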

http://javascript.crockford.com/
Douglas
9/22/2007 1:44:18 PM
> ECMA262 15.10.2.12 defines \w as being equivalent to the character class 
> [0-9A-Za-z_]. The w suggests word, but that is deceptive. Support for 
> internationalization in JavaScript's RegExp is virtually nonexistent.
>
> You need to define your own character class.

How can I do so ?


albert 


albert
9/22/2007 3:54:31 PM
albert wrote on 22 sep 2007 in comp.lang.javascript:

>> ECMA262 15.10.2.12 defines \w as being equivalent to the character
>> class [0-9A-Za-z_]. The w suggests word, but that is deceptive.
>> Support for internationalization in JavaScript's RegExp is virtually
>> nonexistent. 
>>
>> You need to define your own character class.
> 
> How can I do so ?

var MyReg = /^[\w������i�������]+$/i;

Depending on your local requirements.
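As one possible sketch of such a class, assuming the text only needs the accented letters found in the Latin-1 Supplement block (U+00C0 to U+00FF): the name isLatin1Word is hypothetical, and the letters are given as \u escapes so the pattern does not depend on how the script file is encoded.

```javascript
// Hypothetical pattern: \w plus the Latin-1 Supplement letters U+00C0..U+00FF,
// skipping the multiplication sign (U+00D7) and the division sign (U+00F7).
var latin1Word = /^[\w\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u00FF]+$/;

function isLatin1Word(text) {
  return latin1Word.test(text);
}

isLatin1Word("\u00E9l\u00E9phant1");  // "éléphant1": true
isLatin1Word("elephant1");            // true
isLatin1Word("\u00E9l\u00E9phant 1"); // contains a space: false
```

Letters outside that block, such as the French ligature U+0153, would have to be added to the class separately.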



-- 
Evertjan.
The Netherlands.
(Please change the x'es to dots in my emailaddress)
Evertjan
9/22/2007 4:01:44 PM
> var MyReg = /^[\w������i�������]+$/i;
>
> Depending on your local requirements.
>
> -- 
> Evertjan.
> The Netherlands.
> (Please change the x'es to dots in my emailaddress)

I've got french... that's no pb.
But I also have arabic & hebrew, this is more difficult.


albert 


albert
9/22/2007 4:17:15 PM
albert wrote on 22 sep 2007 in comp.lang.javascript:

>> var MyReg = /^[\w������i�������]+$/i;
>>
>> Depending on your local requirements.

[please do not quote signatures on usenet. removed]

> 
> I've got french... that's no pb.

pb? [please no sms-language on usenet]

> But I also have arabic & hebrew, this is more difficult.

Why should it be easy?

Javascript accommodates unicode.
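For readers on a current engine, a hedged alternative: ECMAScript 2018 added Unicode property escapes, which need the "u" flag and were not available when this thread was written. The sketch below assumes such an engine; the name isAnyScriptWord is made up for illustration.

```javascript
// Requires an engine with ES2018 Unicode property escapes and the "u" flag.
// \p{L} matches a letter in any script, \p{N} matches any kind of number.
var anyScriptWord = /^[\p{L}\p{N}_]+$/u;

function isAnyScriptWord(text) {
  return anyScriptWord.test(text);
}

isAnyScriptWord("\u00E9l\u00E9phant1");      // "éléphant1": true
isAnyScriptWord("\u0633\u0644\u0627\u0645"); // an Arabic word: true
isAnyScriptWord("two words");                // contains a space: false
```

On older engines the regex literal itself is a syntax error, so explicit \u ranges remain the portable route.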

-- 
Evertjan.
The Netherlands.
(Please change the x'es to dots in my emailaddress)
Evertjan
9/22/2007 6:01:05 PM
In comp.lang.javascript message
<S_8Ji.28213$eY.19207@newssvr13.news.prodigy.net>, Sat, 22 Sep 2007
13:44:18, Douglas Crockford <nospam@sbcglobal.net> posted:
>
>ECMA262 15.10.2.12 defines \w as being equivalent to the character
>class [0-9A-Za-z_]. The w suggests word, but that is deceptive. Support
>for internationalization in JavaScript's RegExp is virtually
>nonexistent.

<URL:http://www.merlyn.demon.co.uk/humourous.htm#FredHoyle> advises <G>
:-
        Fred Hoyle (1915-2001) :-
        "'Dam’ good idea. Always force foreigner to learn English.'"
        Alexis Ivan Alexandrov, in "The Black Cloud", Chap. 10, para 4.

-- 
 (c) John Stockton, Surrey, UK.  ?@merlyn.demon.co.uk   Turnpike v6.05   MIME.
 Web  <URL:http://www.merlyn.demon.co.uk/> - FAQqish topics, acronyms & links;
  Astro stuff via astron-1.htm, gravity0.htm ; quotings.htm, pascal.htm, etc.
 No Encoding. Quotes before replies. Snip well. Write clearly. Don't Mail News.
Dr
9/22/2007 9:31:02 PM
>> I've got french... that's no pb.
>
> pb? [please no sms-language on usenet]

pb = problem (sorry, I thought it was obvious).

>
>> But I also have arabic & hebrew, this is more difficult.
>
> Why should it be easy?

I've never said it should be easy. Don't waste time to answer here...

>
> Javascript accommodates unicode.
>

Well I tried a simple word in Arabic with the following regex :

^[\w]+$

still, the "test" function always returned false. Do you have any good 
working example about it ?


thx, oops, sorry I meant "Thanks" ;-)


albert 


albert
9/23/2007 7:09:20 AM
albert wrote on 23 sep 2007 in comp.lang.javascript:

>>> I've got french... that's no pb.
>>
>> pb? [please no sms-language on usenet]
> 
> pb = problem (sorry, I thought it was obvious).

Not to me. Usenet has its own limited set of abbreviations.
If anything, Pb would be lead.

>>> But I also have arabic & hebrew, this is more difficult.
>>
>> Why should it be easy?
> 
> I've never said it should be easy. Don't waste time to answer here...

You are the OP, so ...

>> Javascript accommodates unicode.
>>
> 
> Well I tried a simple word in Arabic with the following regex :
> 
> ^[\w]+$

Would you allow for figures 0-9?
Otherwise this is better for simple Latin chars:

/^[a-z]+$/i

> still, the "test" function always returned false. 

I showed you how to do that with accents, 
did you understand the regex?

Why would Arabic characters match 
where accented characters do not?

> Do you have any good 
> working example about it ?

I am not into working examples, but will give you a hint.

Arabic should work the same as accented ones:

/^[a-z\u0600-\u06ff]+$/

[http://unicode.org/charts/PDF/U0600.pdf]

Not knowing Arabic I cannot test that.
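A rough sketch along the lines of that hint, extended with digits and Hebrew. It assumes the Arabic block is U+0600 to U+06FF and the Hebrew block is U+0590 to U+05FF; the name isMultiScriptWord and the sample words are illustrative only.

```javascript
// Assumed Unicode blocks: Hebrew U+0590..U+05FF, Arabic U+0600..U+06FF.
// The "i" flag lets a-z cover uppercase Latin letters as well.
var multiScriptWord = /^[a-z0-9\u0590-\u05FF\u0600-\u06FF]+$/i;

function isMultiScriptWord(text) {
  return multiScriptWord.test(text);
}

var arabicWord = "\u0633\u0644\u0627\u0645"; // "salam" in Arabic letters
var hebrewWord = "\u05E9\u05DC\u05D5\u05DD"; // "shalom" in Hebrew letters

isMultiScriptWord(arabicWord);  // true
isMultiScriptWord(hebrewWord);  // true
isMultiScriptWord("Elephant1"); // true
isMultiScriptWord("two words"); // false, because of the space
```

Whole-block ranges like these also admit the punctuation and digit characters that live in those blocks, so a stricter validator would narrow the ranges to letters only.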
 
> thx, oops, sorry I meant "Thanks" ;-)

-- 
Evertjan.
The Netherlands.
(Please change the x'es to dots in my emailaddress)
Evertjan
9/23/2007 7:30:57 AM
> You are the OP, so ...

Now it's my turn :-)
What does OP mean ?

>>
>> Well I tried a simple word in Arabic with the following regex :
>>
>> ^[\w]+$
>
> Would you allow for figures 0-9?

Yes

> Otherwise this is better for simple Latin chars:
>
> /^[a-z]+$/i
>
>> still, the "test" function always returned false.
>
> I showed you how to do that with accents,
> did you understand the regex?

Yes

>
> Why would Arabic characters match
> where accented characters do not?

You're right.

>
>> Do you have any good
>> working example about it ?
>
> I am not into working examples, but will give you a hint.
>
> Arabic should work the same as accented ones:
>
> /^[a-z\u0600-\u06ff]+$/
>
> [http://unicode.org/charts/PDF/U0600.pdf]
>
> Not knowing Arabic I cannot test that.

I tested. It works :-)

Thank you for your help !


albert 


albert
9/24/2007 1:13:29 PM