list of regex special characters

I am looking for a list of the special characters in Python regular
expressions that need to be escaped if you want their literal meaning.

I searched and cannot find the list. Any help appreciated.
Reply goldtech 11/28/2010 11:58:19 PM

goldtech <goldtech@worldpost.com> writes:

> I am looking for a list of special character in python regular
> expressions that need to be escaped if you want their literal meaning.

You can avoid caring about that by using ‘re.escape’, which escapes any
characters in its input string that are not alphanumeric.
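
As a rough illustration of that suggestion, here is a minimal sketch that
matches a piece of text literally without knowing which of its characters
are special. The helper name find_literal and the sample text are made up
for this example; only re.escape, re.compile and re.findall come from the
re module.

```python
import re

def find_literal(needle, haystack):
    """Return every non-overlapping literal occurrence of needle in haystack."""
    # re.escape backslash-escapes whatever might be special in needle,
    # so "$" and "." below are matched as plain characters.
    pattern = re.compile(re.escape(needle))
    return pattern.findall(haystack)

# Hypothetical sample text for the demonstration.
text = "Price: $5.00 (was $5x00)"

# Escaped: only the exact text "$5.00" is found.
print(find_literal("$5.00", text))

# Unescaped: the "." matches any character, so "5x00" is found as well.
print(re.findall("5.00", text))
```

The second call shows why escaping matters: without it the dot acts as a
wildcard rather than a literal full stop.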

> I searched and can not find the list. Any help appreciated.

    >>> import re
    >>> help(re)
    …
    DESCRIPTION
    …
        The special characters are: …

-- 
 \     “I got some new underwear the other day. Well, new to me.” —Emo |
  `\                                                           Philips |
_o__)                                                                  |
Ben Finney
Reply Ben 11/29/2010 12:13:14 AM


On 11/28/2010 05:58 PM, goldtech wrote:
> I am looking for a list of special character in python regular
> expressions that need to be escaped if you want their literal meaning.
>
> I searched and can not find the list. Any help appreciated.

Trust the re module to tell you:

  >>> import re
  >>> chars = [chr(i) for i in range(0,256)]
  >>> escaped = [c for c in chars if re.escape(c) != c]
  >>> print len(escaped)
  194
  >>> print escaped
  [...]
  >>> can_use_unescaped = [c for c in chars if re.escape(c) == c]

(adjust "chars" accordingly if you want to check Unicode
characters too).
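
For anyone trying this on Python 3, a possible port of the session above is
sketched below. It assumes the same 256 code points. The count it prints
depends on the interpreter version: since Python 3.7, re.escape escapes only
characters that can have special meaning in a regular expression, so the
number is expected to be far smaller than the 194 shown above.

```python
import re

# Same 256 code points as the original session; in Python 3 these are
# one-character str objects rather than byte strings.
chars = [chr(i) for i in range(256)]

# Characters that re.escape changes, and characters it leaves alone.
escaped = [c for c in chars if re.escape(c) != c]
can_use_unescaped = [c for c in chars if re.escape(c) == c]

print(len(escaped))
# Show only the printable ones so the terminal output stays readable.
print("".join(c for c in escaped if c.isprintable()))
```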

-tkc



Reply Tim 11/29/2010 12:23:46 AM

Tim Chase <python.list@tim.thechases.com> writes:

> On 11/28/2010 05:58 PM, goldtech wrote:
> > I am looking for a list of special character in python regular
> > expressions that need to be escaped if you want their literal
> > meaning.
>
> Trust the re module to tell you:
>
>  >>> import re
>  >>> chars = [chr(i) for i in range(0,256)]
>  >>> escaped = [c for c in chars if re.escape(c) != c]

Note that, according to its docstring, ‘re.escape’ doesn't distinguish
characters that *need to be* escaped for their literal meaning; it
simply escapes any non-alphanumeric character.

>  >>> can_use_unescaped = [c for c in chars if re.escape(c) == c]

Right. There are three classes of character for this purpose:

* those that have a literal meaning *only if* escaped
* those that have literal meaning whether or not they are escaped
* those that have a literal meaning *only if not* escaped

The ‘re.escape’ function, according to its docstring, simply treats
every non-alphanumeric character as belonging to one of the first two
classes. Both classes are safe to escape, so it does not bother to
distinguish between them.

The OP was asking for the first class specifically, but I question
whether that's actually needed for the purpose.
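
If the first class really is wanted, one way to approximate it is to ask
the re module empirically. The sketch below is only an illustration under
stated assumptions: it looks at printable ASCII, at patterns outside a
character class, and with no flags such as re.VERBOSE. The helpers
is_literal and classify are hypothetical names, not part of re, and
"literal" here means "the one-character pattern matches that character and
nothing else in the alphabet".

```python
import re

def is_literal(pattern, char, alphabet):
    """True if pattern compiles and fully matches char but no other
    character in alphabet."""
    try:
        compiled = re.compile(pattern)
    except re.error:
        # e.g. a bare "*" or "(" is not a valid pattern on its own
        return False
    return all((compiled.fullmatch(other) is not None) == (other == char)
               for other in alphabet)

def classify(alphabet):
    """Sort characters into the three classes described above."""
    only_if_escaped, either_way, only_if_unescaped = [], [], []
    for char in alphabet:
        bare = is_literal(char, char, alphabet)
        escaped = is_literal("\\" + char, char, alphabet)
        if escaped and not bare:
            only_if_escaped.append(char)
        elif escaped and bare:
            either_way.append(char)
        elif bare:
            only_if_unescaped.append(char)
    return only_if_escaped, either_way, only_if_unescaped

# Assumption: printable ASCII only (space through tilde).
alphabet = [chr(i) for i in range(32, 127)]
only_if_escaped, either_way, only_if_unescaped = classify(alphabet)

print("literal only if escaped:    ", "".join(only_if_escaped))
print("literal either way:         ", "".join(either_way))
print("literal only if not escaped:", "".join(only_if_unescaped))
```

Inside a character class or under re.VERBOSE the answer differs (for
example "]", "-" and "#" change roles), which is one more reason to let
re.escape handle it.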

-- 
 \           “The cost of education is trivial compared to the cost of |
  `\                                     ignorance.” —Thomas Jefferson |
_o__)                                                                  |
Ben Finney
Reply Ben 11/29/2010 12:49:27 AM
