interval expression in regexp
Hi,
According to the manual:
,-----[ (info "(gawk)Regexp Operators") lines: 2309 - 2323 ]
| Interval expressions were not traditionally available in `awk'.
| They were added as part of the POSIX standard to make `awk' and
| `egrep' consistent with each other.
|
| However, because old programs may use `{' and `}' in regexp
| constants, by default `gawk' does _not_ match interval expressions
| in regexps. If either `--posix' or `--re-interval' are specified
| (*note Options::), then interval expressions are allowed in
| regexps.
|
| For new programs that use `{' and `}' in regexp constants, it is
| good practice to always escape them with a backslash. Then the
| regexp constants are valid and work the way you want them to, using
| any version of `awk'.(2)
`-----
I thought:
gawk '/a\{3\}/'
or
gawk --posix '/a{3}/'
should work, but only the latter does. What is going on?
--
Sebastian P. Luque
Sebastian, 11/16/2005 5:46:34 PM
In the first version the backslashes make the braces literal characters,
so it matches the text "a{3}" rather than three a's. That is how older
versions of awk treat braces, and gawk keeps that as the default for
backward compatibility, as the text you quoted explains. The second works
because --posix enables interval expressions, as would:
gawk --re-interval '/a{3}/'
Note that since gensub() is non-POSIX, that function is not available if
you use --posix, but it is if you use --re-interval, so I'd stick to
--re-interval to avoid losing useful GNU awk functionality just to gain
RE intervals.
Ed.
Ed, 11/16/2005 9:29:54 PM