Newbie Question: delete all non alphanumeric characters

Hi all,
How can I delete all non-alphanumeric characters in a string? Thanks.

-- 
Posted via http://www.ruby-forum.com/.

Reply theallnighter (1) 7/21/2006 5:53:38 PM

On Jul 21, 2006, at 1:53 PM, Theallnighter Theallnighter wrote:

> Hi all,
> how can i delete all non alphanumeric characters in a string ? thanks
>
> -- 
> Posted via http://www.ruby-forum.com/.
>

string.gsub(/[0-9a-z]+/i, '')


Reply logancapaldo (886) 7/21/2006 6:00:40 PM


Logan Capaldo wrote:
>
> On Jul 21, 2006, at 1:53 PM, Theallnighter Theallnighter wrote:
>
>> Hi all,
>> how can i delete all non alphanumeric characters in a string ? thanks
>>
>> -- 
>> Posted via http://www.ruby-forum.com/.
>>
>
> string.gsub(/[0-9a-z]+/i, '')
>
>
>
That deletes all alphanumeric. To delete all non-alphanumeric:

string.gsub(/[^0-9a-z]/i, '')
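To see the difference side by side, here is a minimal sketch. The sample string is made up for illustration and does not come from the original question.

```ruby
# Hypothetical sample input, chosen only for illustration.
sample = "Hello, World! 123"

# Without the caret the class matches alphanumerics, so those are removed.
only_punctuation = sample.gsub(/[0-9a-z]+/i, '')

# With the caret as the first character inside the brackets the class is
# negated, so everything that is NOT an ASCII letter or digit is removed.
only_alphanumeric = sample.gsub(/[^0-9a-z]/i, '')

puts only_punctuation.inspect   # ", ! "
puts only_alphanumeric.inspect  # "HelloWorld123"
```

Note that gsub returns a new string; use gsub! if the original should be modified in place.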

-- 
Tom Werner
Helmets to Hardhats
Software Developer
tom@helmetstohardhats.org
www.helmetstohardhats.org


Reply tom8029 (33) 7/21/2006 6:05:55 PM

On Jul 21, 2006, at 2:05 PM, Tom Werner wrote:

> Logan Capaldo wrote:
>>
>> On Jul 21, 2006, at 1:53 PM, Theallnighter Theallnighter wrote:
>>
>>> Hi all,
>>> how can i delete all non alphanumeric characters in a string ?  
>>> thanks
>>>
>>> -- 
>>> Posted via http://www.ruby-forum.com/.
>>>
>>
>> string.gsub(/[0-9a-z]+/i, '')
>>
>>
>>
> That deletes all alphanumeric. To delete all non-alphanumeric:
>
> string.gsub(/[^0-9a-z]/i, '')

Doh! I'm obviously not awake yet this ---err-- afternoon.


Reply logancapaldo (886) 7/21/2006 6:29:20 PM

Tom Werner wrote:
> Logan Capaldo wrote:
>> On Jul 21, 2006, at 1:53 PM, Theallnighter Theallnighter wrote:
>>> Hi all,
>>> how can i delete all non alphanumeric characters in a string ? thanks
>>
>> string.gsub(/[0-9a-z]+/i, '')
>>
> That deletes all alphanumeric. To delete all non-alphanumeric:
> 
> string.gsub(/[^0-9a-z]/i, '')

Or string.delete("^0-9a-zA-Z")
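String#delete takes a character-set specification rather than a regexp: a leading ^ negates the set and - denotes a range, much as in String#tr. A small sketch, with a made-up input string:

```ruby
# Hypothetical input, for illustration only.
sample = "user_name-42 (admin)!"

# "^0-9a-zA-Z" means "every character except ASCII digits and letters".
cleaned = sample.delete("^0-9a-zA-Z")

puts cleaned  # username42admin
puts sample   # user_name-42 (admin)!  (delete returns a copy)
```

As with gsub, there is a destructive variant, delete!, that changes the receiver.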


mathew
-- 
      <URL:http://www.pobox.com/~meta/>
My parents went to the lost kingdom of Hyrule
    and all I got was this lousy triforce.
Reply meta (132) 7/21/2006 6:54:57 PM

On 2006-07-21, Theallnighter Theallnighter <theallnighter@gmail.com> wrote:
> Hi all,
> how can i delete all non alphanumeric characters in a string ? thanks
>

I've also just started to learn Ruby, so thought I'd reply for the practice -
Here's one solution:


------------------------------------------------------------------------
#!/usr/bin/ruby

x = "There are 2007 beans and 15234 grains of rice in this bag."
puts x
x.gsub!(/\W/, '')
puts x

------------------------------------------------------------------------

output:

There are 2007 beans and 15234 grains of rice in this bag.
Thereare2007beansand15234grainsofriceinthisbag

-- 

Reply allergic-to-spam (87) 7/21/2006 7:38:22 PM

On Jul 21, 2006, at 3:40 PM, Jim Cochrane wrote:

> On 2006-07-21, Theallnighter Theallnighter <theallnighter@gmail.com> wrote:
>> Hi all,
>> how can i delete all non alphanumeric characters in a string ? thanks
>>
>
> I've also just started to learn Ruby, so thought I'd reply for the practice -
> Here's one solution:
>
>
> ------------------------------------------------------------------------
> #!/usr/bin/ruby
>
> x = "There are 2007 beans and 15234 grains of rice in this bag."
> puts x
> x.gsub!(/\W/, '')
> puts x
>
> ------------------------------------------------------------------------
>
> output:
>
> There are 2007 beans and 15234 grains of rice in this bag.
> Thereare2007beansand15234grainsofriceinthisbag
>
> -- 
>
>

Well the only "problem" with that is

x = '\w includes_under_scores_too'


Reply logancapaldo (886) 7/21/2006 7:48:24 PM

On 2006-07-21, Logan Capaldo <logancapaldo@gmail.com> wrote:
>
> On Jul 21, 2006, at 3:40 PM, Jim Cochrane wrote:
>
>> On 2006-07-21, Theallnighter Theallnighter  
>> <theallnighter@gmail.com> wrote:
>>> Hi all,
>>> how can i delete all non alphanumeric characters in a string ? thanks
>>>
>> ...
>> #!/usr/bin/ruby
>>
>> x = "There are 2007 beans and 15234 grains of rice in this bag."
>> puts x
>> x.gsub!(/\W/, '')
>> puts x
>> ...
>>
>>
>
> Well the only "problem" with that is
>
> x = '\w includes_under_scores_too'
>

Woah!  Thanks for pointing that out.  It looks like
http://www.ruby-doc.org/docs/ruby-doc-bundle/UsersGuide/rg/regexp.html
has a bug:

\w	letter or digit; same as [0-9A-Za-z]

It's missing a _: \w actually matches [0-9A-Za-z_].

Here's a fixed version:


#!/usr/bin/ruby

x = "There are 2007 beans_and 15234 grains of rice in this bag."
puts x
x.gsub!(/\W/, '')
puts x
x.gsub!(/\W|_/, '')
puts "fixed:"
puts x
Reply allergic-to-spam (87) 7/21/2006 8:01:14 PM

for fun, I started irb, then typed

"567576hgjhgjh&**)".gsub(/^[0-9a-z]/i, '')

It returned

67576hgjhgjh&**)

Tom Werner wrote:
> Logan Capaldo wrote:
> >
> > On Jul 21, 2006, at 1:53 PM, Theallnighter Theallnighter wrote:
> >
> >> Hi all,
> >> how can i delete all non alphanumeric characters in a string ? thanks
> >>
> >> --
> >> Posted via http://www.ruby-forum.com/.
> >>
> >
> > string.gsub(/[0-9a-z]+/i, '')
> >
> >
> >
> That deletes all alphanumeric. To delete all non-alphanumeric:
>
> string.gsub(/[^0-9a-z]/i, '')

Reply dominique.plante (7) 7/21/2006 8:10:27 PM

On 2006-07-21, Jim Cochrane <allergic-to-spam@no-spam-allowed.org> wrote:
> On 2006-07-21, Logan Capaldo <logancapaldo@gmail.com> wrote:
>>
>> On Jul 21, 2006, at 3:40 PM, Jim Cochrane wrote:
>>
>>> On 2006-07-21, Theallnighter Theallnighter  
>>> <theallnighter@gmail.com> wrote:
>>>> Hi all,
>>>> how can i delete all non alphanumeric characters in a string ? thanks
>>>>
>>> ...
>>> #!/usr/bin/ruby
>>>
>>> x = "There are 2007 beans and 15234 grains of rice in this bag."
>>> puts x
>>> x.gsub!(/\W/, '')
>>> puts x
>>> ...
>>>
>>>
>>
>> Well the only "problem" with that is
>>
>> x = '\w includes_under_scores_too'
>>
>
> Woah!  Thanks for pointing that out.  It looks like
> http://www.ruby-doc.org/docs/ruby-doc-bundle/UsersGuide/rg/regexp.html
> has a bug:
>
> \w	letter or digit; same as [0-9A-Za-z]
>
> It's missing a _.
>
> Here's a fixed version:
>
>
> #!/usr/bin/ruby
>
> x = "There are 2007 beans_and 15234 grains of rice in this bag."
> puts x
> x.gsub!(/\W/, '')
> puts x
> x.gsub!(/\W|_/, '')
> puts "fixed:"
> puts x

Oops - the above has a bug (although it still "works").  Here's a fixed
version, with an opposite example further demonstrating the bug in the
ruby doc site:


#!/usr/bin/ruby

s = "There are 2007 beans_and 15234 grains of rice in this bag."
x = s.dup
y = s.dup
puts "original:"
puts x
x.gsub!(/\W/, '')
puts "\nbroken:"
puts x
y.gsub!(/\W|_/, '')
puts "\nfixed:"
puts y

puts "\nopposite:"
z = s.dup
z.gsub!(/\w/, '')
puts z

-- 

original:
There are 2007 beans_and 15234 grains of rice in this bag.

broken:
Thereare2007beans_and15234grainsofriceinthisbag

fixed:
Thereare2007beansand15234grainsofriceinthisbag

opposite:
          .
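The alternation /\W|_/ can also be written as a single character class, [\W_]. This is only a sketch of an equivalent spelling, using the same sample string as above:

```ruby
s = "There are 2007 beans_and 15234 grains of rice in this bag."

# \W matches anything outside [a-zA-Z0-9_]; adding _ to the class
# makes the underscore count as a character to remove as well.
stripped = s.gsub(/[\W_]/, '')

puts stripped  # Thereare2007beansand15234grainsofriceinthisbag
```

Using the non-destructive gsub here also avoids the need for s.dup.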

Reply allergic-to-spam (87) 7/21/2006 8:15:10 PM

dominique.plante@gmail.com wrote:
> for fun, I started irb, then typed
>
> "567576hgjhgjh&**)".gsub(/^[0-9a-z]/i, '')
>
> It returned
>
> 67576hgjhgjh&**)
>
>   

The caret goes inside the brackets (it inverts the character class). Outside the brackets, it anchors the match to the start of a line.
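A short sketch of the two placements, using the string from the irb session quoted above:

```ruby
input = "567576hgjhgjh&**)"

# ^ outside the brackets is an anchor: only an alphanumeric at the very
# start of a line can match, so exactly one character is removed here.
anchored = input.gsub(/^[0-9a-z]/i, '')

# ^ as the first character inside the brackets negates the class:
# every character that is not an ASCII letter or digit is removed.
negated = input.gsub(/[^0-9a-z]/i, '')

puts anchored  # 67576hgjhgjh&**)
puts negated   # 567576hgjhgjh
```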

Tom

-- 
Tom Werner
Helmets to Hardhats
Software Developer
tom@helmetstohardhats.org
www.helmetstohardhats.org


Reply tom8029 (33) 7/21/2006 8:19:19 PM

> for fun, I started irb, then typed
>
> "567576hgjhgjh&**)".gsub(/^[0-9a-z]/i, '')
>
> It returned
>
> 67576hgjhgjh&**)

No wonder. Outside the brackets, ^ anchors the match, and there is only one character at the beginning of the string.



Regards,
Rimantas
--
http://rimantas.com/

Reply rimantas (102) 7/21/2006 8:19:26 PM

On 21-Jul-06, at 4:19 PM, Tom Werner wrote:

> dominique.plante@gmail.com wrote:
>> for fun, I started irb, then typed
>>
>> "567576hgjhgjh&**)".gsub(/^[0-9a-z]/i, '')
>>
>> It returned
>>
>> 67576hgjhgjh&**)
>>
>>
>
> The caret goes inside the brackets (it inverts the character class)

And it should look like this:

"567576hgjhgjh&**)".sub(/[^0-9a-zA-Z]+/i, '')

Note the +

> Tom

--
Jeremy Tregunna
jtregunna@blurgle.ca


"One serious obstacle to the adoption of good programming languages  
is the notion that everything has to be sacrificed for speed. In  
computer languages as in life, speed kills." -- Mike Vanier


Reply jtregunna (85) 7/21/2006 8:35:28 PM

Jeremy Tregunna wrote:
>
> And it should look like this:
>
> "567576hgjhgjh&**)".sub(/[^0-9a-zA-Z]+/i, '')
>
> Note the +
>

#sub only does one replacement; adding a + will replace one chunk of 
non-alphas, but not any others in the string.
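To illustrate, a minimal sketch with a made-up string that contains two separate runs of non-alphanumerics:

```ruby
# Hypothetical input with two separate runs of punctuation.
input = "a!!b??c"

# sub replaces only the first match, so the second run survives.
first_only = input.sub(/[^0-9a-z]+/i, '')

# gsub replaces every match.
all_runs = input.gsub(/[^0-9a-z]+/i, '')

puts first_only  # ab??c
puts all_runs    # abc
```

With gsub the + is optional for correctness: it only changes whether each run is replaced in one step or one character at a time.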

Tom

-- 
Tom Werner
Helmets to Hardhats
Software Developer
tom@helmetstohardhats.org
www.helmetstohardhats.org


Reply tom8029 (33) 7/21/2006 8:44:57 PM

On 21-Jul-06, at 4:44 PM, Tom Werner wrote:

> Jeremy Tregunna wrote:
>>
>> And it should look like this:
>>
>> "567576hgjhgjh&**)".sub(/[^0-9a-zA-Z]+/i, '')
>>
>> Note the +
>>
>
> #sub only does one replacement; adding a + will replace one chunk  
> of non-alphas, but not any others in the string.

Typo, sorry. That should have been gsub.

> Tom

--
Jeremy Tregunna
jtregunna@blurgle.ca


"One serious obstacle to the adoption of good programming languages  
is the notion that everything has to be sacrificed for speed. In  
computer languages as in life, speed kills." -- Mike Vanier


Reply jtregunna (85) 7/21/2006 10:15:06 PM

On Jul 21, 2006, at 6:15 PM, Jeremy Tregunna wrote:

>
> On 21-Jul-06, at 4:44 PM, Tom Werner wrote:
>
>> Jeremy Tregunna wrote:
>>>
>>> And it should look like this:
>>>
>>> "567576hgjhgjh&**)".sub(/[^0-9a-zA-Z]+/i, '')
>>>
>>> Note the +
>>>
>>
>> #sub only does one replacement; adding a + will replace one chunk  
>> of non-alphas, but not any others in the string.
>
> typo, sorry.

Speaking of typos, say either a-zA-Z or a-z/i, you don't need both <g>
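A quick check that the two spellings behave the same, as a sketch with a made-up input string:

```ruby
# Hypothetical input, for illustration only.
input = "Ruby-1.8 Rocks!"

# Case-insensitive flag with a lowercase-only range...
with_flag = input.gsub(/[^0-9a-z]/i, '')

# ...versus both ranges spelled out and no flag.
with_ranges = input.gsub(/[^0-9a-zA-Z]/, '')

puts with_flag                 # Ruby18Rocks
puts with_flag == with_ranges  # true
```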



Reply logancapaldo (886) 7/22/2006 4:13:40 PM

On 7/21/06, Theallnighter Theallnighter <theallnighter@gmail.com> wrote:
> Hi all,
> how can i delete all non alphanumeric characters in a string ? thanks
>
> --
> Posted via http://www.ruby-forum.com/.
>
>

TMTOWTDI:

username.delete('^A-Za-z0-9')

...I just thought I'd add a little variety to this collection of
Regexp-centric solutions.
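One caveat that applies to every solution in this thread: the ranges 0-9, a-z and A-Z are ASCII-only, so accented letters are deleted along with the punctuation. Assuming Ruby 1.9 or later and a UTF-8 string (an assumption, since this thread predates 1.9), the POSIX bracket expression [[:alnum:]] also matches non-ASCII letters and digits. A minimal sketch with a made-up input:

```ruby
# "caf\u00e9" is "café"; the escape avoids depending on the source file encoding.
input = "caf\u00e9_42!"

# ASCII ranges: the accented letter is treated as non-alphanumeric.
ascii_only = input.gsub(/[^0-9a-z]/i, '')

# POSIX bracket expression: Unicode-aware for UTF-8 strings in Ruby 1.9+.
unicode_aware = input.gsub(/[^[:alnum:]]/, '')

puts ascii_only     # caf42
puts unicode_aware  # café42
```

Which behavior is right depends on what "alphanumeric" should mean for the data at hand, for example a username restricted to ASCII versus free-form text.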

Reply JoeKarma (7) 7/22/2006 8:52:30 PM

16 Replies
257 Views
