While playing with grep, I was surprised that grep '*\.c' does not give
an error (* is missing an operand). Arguably * applied to empty
can match empty, but surprisingly enough, Acme's Edit behaves
differently. And even grep is not consistent (grep '*' is different from
grep '', whereas both should be an empty pattern or the first one
should be an error). Another funny one is that Edit gives back
an error complaining of a missing operand to * when the regexp is
empty.
Greps from other systems accept an empty pattern
(and are thus consistent, but they would not have
caught the error that started all this).
cpu% echo hola | grep '*a'
hola
cpu% echo hola | grep '*'
grep: *: syntax error
cpu% echo hola | grep ''
grep: empty pattern
Edit , s/*//
regexp: missing operand for *
Edit: bad regexp in s command
Edit , s/*c//
regexp: missing operand for *
Edit: bad regexp in s command
Edit , s///
regexp: missing operand for *
Edit: bad regexp in s command
G.
paurea (247)
6/14/2012 8:28:09 AM
This is from the man page, but I am not sure what _exactly_ it means, or
whether it applies to your problem:
Care should be taken when using the shell metacharacters
$*[^|()=\ and newline in pattern; it is safest to enclose
the entire expression in single quotes '...'. An expression
starting with '*' will treat the rest of the expression as
literal characters.
More strange behavior:
% echo foo.c | 9 grep '*\.c'
%
% echo foo.c | 9 grep '*.c'
foo.c
% echo fooxc | 9 grep '*.c'
%
% echo fooxc | 9 grep '.*.c'
fooxc
% echo fooxc | 9 grep '.*\.c'
%
% echo foo.c | 9 grep '.*\.c'
foo.c
% echo foo.c | 9 grep '*foo.c'
foo.c
% echo foo.c | 9 grep '*.00.c'
%
Looks like
" An expression
starting with '*' will treat the rest of the expression as
literal characters."
(see above) really applies (for unknown reasons).
However, I am just a 'toy programmer', so you were warned ;-)
Regards,
++pac
On Thu, Jun 14, 2012 at 10:28 AM, Gorka Guardiola <paurea@gmail.com> wrote:
> While playing with grep, I was surprised by grep '*\.c' not giving
> an error (* is missing an operand). Arguably * applied to empty ... [snip]
>
tyapca7 (101)
6/14/2012 9:32:56 AM
% sed -n 20,32p /sys/src/cmd/grep/grep.y
prog:   /* empty */
    {
        yyerror("empty pattern");
    }
|   expr newlines
    {
        $$.beg = ral(Tend);
        $$.end = $$.beg;
        $$ = re2cat(re2star(re2or(re2char(0x00, '\n'-1), re2char('\n'+1, 0xff))), $$);
        $$ = re2cat($1, $$);
        $$ = re2cat(re2star(re2char(0x00, 0xff)), $$);
        topre = $$;
    }
%
The above code sets up the initial state
machine including the pattern passed on
the command line, $1.
This combined with the fact that multiple
"stars" are coalesced causes the weirdness
you're seeing.
Anthony
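For readers who do not speak yacc, here is a rough Python sketch of what that action appears to build: any bytes, then the user's pattern ($1), then any non-newline bytes. It is only an illustration. Treating Tend as the end of the input line is an assumption, Python's re stands in for grep's own matcher, and wrap_like_grammar and line_matches are made-up names.

```python
import re


def wrap_like_grammar(pattern):
    """Surround the user's pattern the way the quoted yacc action seems to.

    re2star(re2char(0x00, 0xff))      -> ".*" (with DOTALL: any byte at all)
    $1                                -> the pattern from the command line
    re2star(anything except newline)  -> "[^\\n]*"
    Tend                              -> assumed to be end of the line
    """
    return r".*(?:" + pattern + r")[^\n]*"


def line_matches(pattern, line):
    # fullmatch plays the role of the assumed Tend: the wrapped
    # expression has to consume the whole line.
    return re.fullmatch(wrap_like_grammar(pattern), line, re.DOTALL) is not None


if __name__ == "__main__":
    for pat, line in [(r"\.c", "foo.c"), (r"\.c", "fooxc"), ("ol", "hola")]:
        print(repr(pat), repr(line), line_matches(pat, line))
```

On a single line without newlines this comes out the same as an unanchored search for the pattern, which is why grep does not need an explicit leading .* to match in the middle of a line.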
ality (59)
6/14/2012 9:42:19 AM
On Thu, Jun 14, 2012 at 11:32 AM, Peter A. Cejchan <tyapca7@gmail.com> wrote:
> This is from manpage, but I not sure what _exactly_ it means, and whether it
> applies to your problem:
>          Care should be taken when using the shell metacharacters
>          $*[^|()=\ and newline in pattern; it is safest to enclose
>          the entire expression in single quotes '...'.  An expression
>          starting with '*' will treat the rest of the expression as
>          literal characters.
>
Everything is enclosed in '', so the shell is not seeing this.
G.
paurea (247)
6/14/2012 9:54:28 AM
> % echo foo.c | 9 grep '*\.c'
correct. match \.c as a literal string. there is no match.
> % echo foo.c | 9 grep '*.c'
> foo.c
correct. match .c as a literal string. there is a match.
> % echo fooxc | 9 grep '*.c'
> %
> % echo fooxc | 9 grep '.*.c'
> fooxc
correct. match 0-n any character then 1 any character then a c. there is a match.
> % echo fooxc | 9 grep '.*\.c'
correct. this time there's no match because '.' is treated as a literal not
a pattern.
> % echo foo.c | 9 grep '.*\.c'
> foo.c
correct. match 0-n any characters, then a literal '.' then literal 'c'. there is a match.
> % echo foo.c | 9 grep '*foo.c'
> foo.c
correct. match the literal string foo.c. there is a match.
remember that the match doesn't have to be anchored by default, so i sometimes
do this
grep $somesym `{find /sys/src|grep '\.[chys]$'}
this is also packaged up in the local version of 'g'; this would be equivalent
g $somesym /sys/src
- erik
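For anyone without rc at hand, the find|grep pipeline above can be approximated in Python roughly as follows. This is a sketch of the idea only and not erik's g script, which is not shown in this thread; grep_tree and source_files are hypothetical names, and Python's re syntax is assumed to be close enough to grep's for simple symbols.

```python
import os
import re

# Same filter as grep '\.[chys]$' applied to the output of find.
SOURCE_SUFFIX = re.compile(r"\.[chys]$")


def source_files(root):
    """Yield every file under root whose name ends in .c, .h, .y or .s."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if SOURCE_SUFFIX.search(name):
                yield os.path.join(dirpath, name)


def grep_tree(pattern, root):
    """Return (path, line number, line) for each line matching pattern."""
    rx = re.compile(pattern)
    hits = []
    for path in source_files(root):
        with open(path, errors="replace") as f:
            for lineno, line in enumerate(f, 1):
                # search, not match: the pattern is unanchored, as in grep
                if rx.search(line):
                    hits.append((path, lineno, line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    for path, lineno, line in grep_tree("topre", "."):
        print(f"{path}:{lineno}: {line}")
```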
quanstro3716 (244)
6/14/2012 1:28:07 PM
> This combined with the fact that multiple
> "stars" are coalesced causes the weirdness
> you're seeing.
there is no case of multiple '*'s in the patterns peter gave.
there is a case of patterns beginning with '*' which treats the
rest of the pattern as a literal, but that's different.
- erik
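The rule described here (a leading '*' turns the rest of the pattern into a literal string) can be checked against the transcripts in this thread with a small model. This is a sketch in Python of the observed behaviour only, not of grep's source: plan9_grep_matches is a made-up name, Python's re stands in for grep's regexp syntax, and the two error cases are modelled as exceptions carrying the messages from the first post.

```python
import re


def plan9_grep_matches(pattern, line):
    """Model of the behaviour reported in the thread (assumption, not grep's code)."""
    if pattern == "":
        raise ValueError("empty pattern")
    if pattern.startswith("*"):
        literal = pattern[1:]
        if literal == "":
            raise ValueError("syntax error")
        # Rest of the pattern is taken literally, matched anywhere in the line.
        return literal in line
    # Otherwise an ordinary unanchored regexp search.
    return re.search(pattern, line) is not None


# (pattern, input line, did grep print the line?) taken from the transcripts
CASES = [
    (r"*a", "hola", True),
    (r"*\.c", "foo.c", False),
    (r"*.c", "foo.c", True),
    (r"*.c", "fooxc", False),
    (r".*.c", "fooxc", True),
    (r".*\.c", "fooxc", False),
    (r".*\.c", "foo.c", True),
    (r"*foo.c", "foo.c", True),
    (r"*.00.c", "foo.c", False),
]


def check_thread_cases():
    """Return the cases where the model disagrees with the transcripts."""
    return [c for c in CASES if plan9_grep_matches(c[0], c[1]) != c[2]]


if __name__ == "__main__":
    print(check_thread_cases())
```

If the model is right, check_thread_cases() returns an empty list, so every result Peter and Gorka posted is consistent with the literal-string rule from the man page.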
quanstro (3877)
6/14/2012 1:29:55 PM