FYI fnmatch("\\\\", "\", 0) in DJGPP returns 1(no match) -- the escaping
functionality seems to be broken.
The same function call returns 0 (match) in Linux (Red Hat 9 kernel
2.4.20) -- what it should be.
Alex
alexfru (352)
11/29/2006 9:28:07 AM

Alexei A. Frounze wrote:
> FYI fnmatch("\\\\", "\", 0) in DJGPP returns 1(no match) -- the
obvious correction (missed one slash):
fnmatch("\\\\", "\\", 0)
> escaping functionality seems to be broken.
> The same function call returns 0 (match) in Linux (Red Hat 9 kernel
> 2.4.20) -- what it should be.
Alex
alexfru (352)
11/29/2006 9:30:06 AM

Alexei A. Frounze <alexfru@chat.ru> wrote:
> Alexei A. Frounze wrote:
>> FYI fnmatch("\\\\", "\", 0) in DJGPP returns 1(no match) -- the
> obvious correction (missed one slash):
> fnmatch("\\\\", "\\", 0)
>> escaping functionality seems to be broken.
>> The same function call returns 0 (match) in Linux (Red Hat 9 kernel
>> 2.4.20) -- what it should be.
Info page says
" #include <fnmatch.h>
int fnmatch(const char *pattern, const char *string, int flags);
Description
-----------
This function indicates if STRING matches the PATTERN. ..."
So DJGPP says that "\" doesn't match "\\" while Linux says it does.
Well, I say DJGPP is right as the pattern says there should be two
backslashes and you only provide one.
Right,
MartinS
ams767 (91)
11/29/2006 3:53:40 PM

> This function indicates if STRING matches the PATTERN. ..."
>
> So DJGPP says that "\" doesn't match "\\" while Linux says it does.
>
> Well, I say DJGPP is right as the pattern says there should be two
> backslashes and you only provide one.
Except that PATTERN is a regex influenced by FNM_NOESCAPE and
FNM_PATHNAME, and STRING isn't. So a pattern of "\\" is a single
escaped backslash, whereas a string of "\" is a single backslash.
They should match.
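
To make the two levels of escaping concrete: the C compiler first turns the literal "\\\\" into two backslash characters, and fnmatch() then reads those two characters as one escaped backslash. The following is only a minimal sketch, assuming a POSIX system that provides <fnmatch.h>; the helper and variable names are made up for illustration and are not part of any library.

```c
#include <assert.h>
#include <fnmatch.h>
#include <string.h>

/* The pattern is the two characters \\ ; the string is the single character \ . */
static const char escaped_backslash_pattern[] = "\\\\";
static const char single_backslash_string[] = "\\";

/* With escaping enabled (flags == 0), \\ in the pattern stands for one
   literal backslash, so a conforming fnmatch() should return 0 (match). */
int match_with_escaping(void)
{
    assert(strlen(escaped_backslash_pattern) == 2);
    assert(strlen(single_backslash_string) == 1);
    return fnmatch(escaped_backslash_pattern, single_backslash_string, 0);
}

/* With FNM_NOESCAPE every backslash is an ordinary character, so the
   two-character pattern cannot match the one-character string. */
int match_without_escaping(void)
{
    return fnmatch(escaped_backslash_pattern, single_backslash_string,
                   FNM_NOESCAPE);
}
```

Calling both helpers on the system under test shows quickly whether its escape handling agrees with the reading given above.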
dj (1321)
11/29/2006 4:34:38 PM

On Wed, 29 Nov 2006 11:34:38 -0500 in comp.os.msdos.djgpp, DJ Delorie
<dj@delorie.com> wrote:
>
>> This function indicates if STRING matches the PATTERN. ..."
>>
>> So DJGPP says that "\" doesn't match "\\" while Linux says it does.
>>
>> Well, I say DJGPP is right as the pattern says there should be two
>> backslashes and you only provide one.
>
>Except that PATTERN is a regex influenced by FNM_NOESCAPE and
>FNM_PATHNAME, and STRING isn't. So a pattern of "\\" is a single
>escaped backslash, whereas a string of "\" is a single backslash.
>They should match.
switch ((c = *pattern++))
{
....
....
....
case '\\':
  /*+++ pattern already post-incremented to point to next char */
  if (!(flags & FNM_NOESCAPE) && pattern[1] && strchr("*?[\\", pattern[1]))
  /*+++ should be:
    if (!(flags & FNM_NOESCAPE) && strchr("*?[\\", *pattern))
   *+++ as end of input pattern will match end char in escapes string */
  {
    /*+++ end of input pattern might be clearer with ! or == '\0' */
    if ((c = *pattern++) == 0)
    {
      c = '\\';
      --pattern;
    }
    if (c != *string++)
      return FNM_NOMATCH;
    break;
  }
--
Thanks. Take care, Brian Inglis Calgary, Alberta, Canada
Brian.Inglis@CSi.com (Brian[dot]Inglis{at}SystematicSW[dot]ab[dot]ca)
fake address use address above to reply
Brian.Inglis (300)
11/30/2006 5:42:21 AM

Brian Inglis wrote:
> On Wed, 29 Nov 2006 11:34:38 -0500 in comp.os.msdos.djgpp, DJ Delorie
> <dj@delorie.com> wrote:
>
>>
>>> This function indicates if STRING matches the PATTERN. ..."
>>>
>>> So DJGPP says that "\" doesn't match "\\" while Linux says it does.
>>>
>>> Well, I say DJGPP is right as the pattern says there should be two
>>> backslashes and you only provide one.
>>
>> Except that PATTERN is a regex influenced by FNM_NOESCAPE and
>> FNM_PATHNAME, and STRING isn't. So a pattern of "\\" is a single
>> escaped backslash, whereas a string of "\" is a single backslash.
>> They should match.
>
> switch ((c = *pattern++))
> {
> ...
> ...
> ...
> case '\\':
> /*+++ pattern already post-incremented to point to next char */
> if (!(flags & FNM_NOESCAPE) && pattern[1] && strchr("*?[\\",
> pattern[1]))
> /*+++ should be:
> if (!(flags & FNM_NOESCAPE) && strchr("*?[\\", *pattern))
> *+++ as end of input pattern will match end char in escapes string */
> {
> /*+++ end of input pattern might be clearer with ! or == '\0' */
> if ((c = *pattern++) == 0)
> {
> c = '\\';
> --pattern;
> }
> if (c != *string++)
> return FNM_NOMATCH;
> break;
> }
I don't think the above is enough. There's another problem. With the above
code you'd never see (c = *pattern++) == 0. My bet is that the intent was to
treat the slash in the last character of pattern as an ordinary character.
That would explain the {c = '\\'; --pattern;} thing along with the
fallthrough behavior. But the code is broken in this place. Dunno if it was
tested against the single unix spec or just a little bit to see that it
seems to work (in some basic cases).
Alex
alexfru (352)
11/30/2006 6:31:24 AM

Alexei A. Frounze wrote:
> Brian Inglis wrote:
>> On Wed, 29 Nov 2006 11:34:38 -0500 in comp.os.msdos.djgpp, DJ
>> Delorie <dj@delorie.com> wrote:
>>
>>>
>>>> This function indicates if STRING matches the PATTERN. ..."
>>>>
>>>> So DJGPP says that "\" doesn't match "\\" while Linux says it does.
>>>>
>>>> Well, I say DJGPP is right as the pattern says there should be two
>>>> backslashes and you only provide one.
>>>
>>> Except that PATTERN is a regex influenced by FNM_NOESCAPE and
>>> FNM_PATHNAME, and STRING isn't. So a pattern of "\\" is a single
>>> escaped backslash, whereas a string of "\" is a single backslash.
>>> They should match.
>>
>> switch ((c = *pattern++))
>> {
>> ...
>> ...
>> ...
>> case '\\':
>> /*+++ pattern already post-incremented to point to next char */
>> if (!(flags & FNM_NOESCAPE) && pattern[1] && strchr("*?[\\",
>> pattern[1]))
>> /*+++ should be:
>> if (!(flags & FNM_NOESCAPE) && strchr("*?[\\", *pattern))
>> *+++ as end of input pattern will match end char in escapes string */
>> {
>> /*+++ end of input pattern might be clearer with ! or == '\0' */
>> if ((c = *pattern++) == 0)
>> {
>> c = '\\';
>> --pattern;
>> }
>> if (c != *string++)
>> return FNM_NOMATCH;
>> break;
>> }
>
> I don't think the above is enough. There's another problem. With the
> above code you'd never see (c = *pattern++) == 0. My bet is that the
> intent was to treat the slash in the last character of pattern as an
> ordinary character. That would explain the {c = '\\'; --pattern;}
> thing along with the fallthrough behavior. But the code is broken in
> this place. Dunno if it was tested against the single unix spec or
> just a little bit to see that it seems to work (in some basic cases).
One more thing to consider, closing bracket as first character in the
list/range inside the bracket expression:
fnmatch("[]]", "]", 0) must return 0, doesn't
fnmatch("[!]]", "]", 0) must return 1, does (by luck)
fnmatch("[!]]", "a", 0) must return 0, doesn't
And one more, which seems to be partially wrong even in that same RH9 linux
distro:
fnmatch("\ab\c","abc",0) must return 0, does in linux, doesn't in DJGPP
fnmatch("\[abc\]","[abc]",0) must return 0, doesn't in both linux and
DJGPP -- the spec doesn't make an exception for the escaped opening bracket,
\[, when it describes the escaping option. Dunno, maybe it would be wise not to
escape the bracket, but other characters should be escapable without a
problem.
Alex
P.S. all the details obtained from fnmatch()'s description in The Single
Unix Specification V3 2004 issue 6.
P.P.S. of course there's a ton of what DJGPP's fnmatch() doesn't support,
but the above things are pretty basic and it would be nice to have them
handled properly, unless I overlook some major DOS-related issue for which
it would be desirable to deviate from the spec.
alexfru (352)
11/30/2006 8:20:00 AM

On Wed, 29 Nov 2006 22:31:24 -0800 in comp.os.msdos.djgpp, "Alexei A.
Frounze" <alexfru@chat.ru> wrote:
>Brian Inglis wrote:
>> On Wed, 29 Nov 2006 11:34:38 -0500 in comp.os.msdos.djgpp, DJ Delorie
>> <dj@delorie.com> wrote:
>>
>>>
>>>> This function indicates if STRING matches the PATTERN. ..."
>>>>
>>>> So DJGPP says that "\" doesn't match "\\" while Linux says it does.
>>>>
>>>> Well, I say DJGPP is right as the pattern says there should be two
>>>> backslashes and you only provide one.
>>>
>>> Except that PATTERN is a regex influenced by FNM_NOESCAPE and
>>> FNM_PATHNAME, and STRING isn't. So a pattern of "\\" is a single
>>> escaped backslash, whereas a string of "\" is a single backslash.
>>> They should match.
>>
>> switch ((c = *pattern++))
>> {
>> ...
>> ...
>> ...
>> case '\\':
>> /*+++ pattern already post-incremented to point to next char */
>> if (!(flags & FNM_NOESCAPE) && pattern[1] && strchr("*?[\\",
>> pattern[1]))
>> /*+++ should be:
>> if (!(flags & FNM_NOESCAPE) && strchr("*?[\\", *pattern))
>> *+++ as end of input pattern will match end char in escapes string */
>> {
>> /*+++ end of input pattern might be clearer with ! or == '\0' */
>> if ((c = *pattern++) == 0)
>> {
>> c = '\\';
>> --pattern;
>> }
>> if (c != *string++)
>> return FNM_NOMATCH;
>> break;
>> }
>
>I don't think the above is enough. There's another problem. With the above
>code you'd never see (c = *pattern++) == 0. My bet is that the intent was to
>treat the slash in the last character of pattern as an ordinary character.
AFAICS it does: strchr matches the nul pattern terminator with the nul
escapes constant terminator and returns a pointer to that character,
so the result is non-zero/true, as noted in the third +++ line.
If "&& *pattern &&" was included, and a nul pattern terminator was
encountered, that if statement would be false and fallthrough to the
following default case, which is clearly not intended by the
subsequent test for the nul pattern terminator, which fixes things up
to treat a terminal \ as an ordinary character.
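
The strchr() behavior relied on here is standard C: the terminating nul counts as part of the string being searched, so looking up '\0' returns a pointer to the terminator rather than NULL. A small sketch to illustrate; in_escape_set() is a hypothetical helper, not a function from the DJGPP sources.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if c is found by strchr() in the set of escapable characters.
   Note that c == '\0' is "found" too, because strchr() treats the
   terminating nul as part of the string it searches. */
int in_escape_set(char c)
{
    static const char escapes[] = "*?[\\";
    const char *hit = strchr(escapes, c);

    if (c == '\0')
        assert(hit == escapes + strlen(escapes)); /* points at the terminator */
    return hit != NULL;
}
```

So a test of the form strchr("*?[\\", *pattern) is also true when *pattern is the end of the pattern, which is why the nul check inside the block can be reached.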
>That would explain the {c = '\\'; --pattern;} thing along with the
>fallthrough behavior.
The break ensures that the fallthrough default case is skipped if an
escape is valid and there is a match, otherwise the previous return
happens.
>But the code is broken in this place. Dunno if it was
>tested against the single unix spec or just a little bit to see that it
>seems to work (in some basic cases).
Handling of non-special escape characters here is punted to the
fallthrough default case and treated the same as FNM_NOESCAPE!
Not sure if that is intentional or just inadequate.
If non-special escape characters should be treated as a literal
character, the strchr() should be eliminated and only the nul
terminator handled specially as it is.
The resulting code would look like:
case '\\':                          /* escape or directory */
  if (!(flags & FNM_NOESCAPE))      /* if escapes allowed */
  {
    if ((c = *pattern++) == '\0')   /* if next char end */
    {
      c = '\\';                     /* trailing \ stays put */
      --pattern;                    /* backup ptr to end */
    }
    if (c != *string++)             /* mismatch */
      return FNM_NOMATCH;           /* quit */
    break;                          /* done */
  }
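
For anyone who wants to try that escape logic in isolation, here is a self-contained sketch that applies it to patterns made only of literals and backslash escapes. It is an illustration under stated assumptions, not the DJGPP code: sketch_match_literal(), its noescape parameter and the SKETCH_* constants are invented here, and wildcards, brackets and path handling are left out on purpose.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MATCH   0
#define SKETCH_NOMATCH 1

/* Hypothetical helper: matches a pattern without wildcards against string,
   honouring backslash escapes unless noescape is non-zero. */
int sketch_match_literal(const char *pattern, const char *string, int noescape)
{
    char c;

    assert(pattern != NULL && string != NULL);
    while ((c = *pattern++) != '\0') {
        if (c == '\\' && !noescape) {
            if ((c = *pattern++) == '\0') {
                c = '\\';   /* trailing \ stays an ordinary character */
                --pattern;  /* back up so the loop sees the terminator */
            }
        }
        if (c != *string++)
            return SKETCH_NOMATCH;
    }
    /* The pattern is exhausted; the string must be exhausted as well. */
    return *string == '\0' ? SKETCH_MATCH : SKETCH_NOMATCH;
}
```

Because there is no strchr() filter, any character after a backslash is taken literally, so \a matches a just as \\ matches a single backslash.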
--
Thanks. Take care, Brian Inglis Calgary, Alberta, Canada
Brian.Inglis@CSi.com (Brian[dot]Inglis{at}SystematicSW[dot]ab[dot]ca)
fake address use address above to reply
Brian.Inglis (300)
12/1/2006 10:49:42 AM

Alexei A. Frounze wrote:
> Alexei A. Frounze wrote:
>> Brian Inglis wrote:
>>> On Wed, 29 Nov 2006 11:34:38 -0500 in comp.os.msdos.djgpp, DJ
>>> Delorie <dj@delorie.com> wrote:
>>>
>>>>
>>>>> This function indicates if STRING matches the PATTERN. ..."
>>>>>
>>>>> So DJGPP says that "\" doesn't match "\\" while Linux says it
>>>>> does. Well, I say DJGPP is right as the pattern says there should be
>>>>> two
>>>>> backslashes and you only provide one.
>>>>
>>>> Except that PATTERN is a regex influenced by FNM_NOESCAPE and
>>>> FNM_PATHNAME, and STRING isn't. So a pattern of "\\" is a single
>>>> escaped backslash, whereas a string of "\" is a single backslash.
>>>> They should match.
>>>
>>> switch ((c = *pattern++))
>>> {
>>> ...
>>> ...
>>> ...
>>> case '\\':
>>> /*+++ pattern already post-incremented to point to next char */
>>> if (!(flags & FNM_NOESCAPE) && pattern[1] && strchr("*?[\\",
>>> pattern[1]))
>>> /*+++ should be:
>>> if (!(flags & FNM_NOESCAPE) && strchr("*?[\\", *pattern))
>>> *+++ as end of input pattern will match end char in escapes string
>>> */ {
>>> /*+++ end of input pattern might be clearer with ! or == '\0' */
>>> if ((c = *pattern++) == 0)
>>> {
>>> c = '\\';
>>> --pattern;
>>> }
>>> if (c != *string++)
>>> return FNM_NOMATCH;
>>> break;
>>> }
>>
>> I don't think the above is enough. There's another problem. With the
>> above code you'd never see (c = *pattern++) == 0. My bet is that the
>> intent was to treat the slash in the last character of pattern as an
>> ordinary character. That would explain the {c = '\\'; --pattern;}
>> thing along with the fallthrough behavior. But the code is broken in
>> this place. Dunno if it was tested against the single unix spec or
>> just a little bit to see that it seems to work (in some basic cases).
>
> One more thing to consider, closing bracket as first character in the
> list/range inside the bracket expression:
> fnmatch("[]]", "]", 0) must return 0, doesn't
> fnmatch("[!]]", "]", 0) must return 1, does (by luck)
> fnmatch("[!]]", "a", 0) must return 0, doesn't
>
> And one more, which seems to be partially wrong even in that same RH9
> linux distro:
> fnmatch("\ab\c","abc",0) must return 0, does in linux, doesn't in
> DJGPP fnmatch("\[abc\]","[abc]",0) must return 0, doesn't in both
> linux and DJGPP -- the spec doesn't make an exception for the escaped
> opening
> bracket, \[ when describes the escaping option. Dunno, maybe it would
> be wise not to escape the bracket, but other characters should be
> made escapable w/o a problem.
Actually, I was wrong about linux failing fnmatch("\[abc\]","[abc]",0)==0 --
I didn't put quotation marks around arguments to fnmatch that were passed to
it from the command line and therefore fnmatch wasn't comparing the same
thing (shell stripped some stuff). So, the above two things are only wrong
in DJGPP.
> Alex
> P.S. all the details obtained from fnmatch()'s description in The
> Single Unix Specification V3 2004 issue 6.
> P.P.S. of course there's a ton of what DJGPP's fnmatch() doesn't
> support, but the above things are pretty basic and it would be nice
> to have them handled properly, unless I overlook some major
> DOS-related issue for which it would be desirable to deviate from the
> spec.
Another couple of examples revealing incorrect behavior of fnmatch() in
DJGPP:
fnmatch("*\a", "a", 0) must return 0, doesn't
fnmatch("\[a]", "[a]", 0) must return 0, doesn't
fnmatch("\[a]", "a", 0) must return 1, does
So, as I understand it, the fnmatch() code flaws are:
1. rangematch() doesn't allow for the following two patterns: []...] and
[!]...] where ] is a valid char in the range
2. asterisk handling doesn't distinguish in the following the various
options for c and what follows c:
  else if (isslash(c) && flags & FNM_PATHNAME)
  {
    if ((string = find_slash(string)) == NULL)
      return FNM_NOMATCH;
    break;
  }
a) c=='/' // forward slash
b) c=='\', (flags & FNM_NOESCAPE)!=0 // back slash
c) c=='\', (flags & FNM_NOESCAPE)==0, isslash(pattern[1])==1 // any escaped
slash
3. '\\' handling is completely broken (wrong indices and logic). If escaping
is on, it must fall through the case to default if '\\' is followed by
anything from "\\?*[" to make sure those aren't interpreted as special chars
again but instead are interpreted as ordinary chars. For all other chars
(including '\0') it's better to break out from the case/switch to treat
those chars as ordinary by the other existing cases.
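
The cases reported so far in this thread can be collected into a small table-driven check and run against whatever fnmatch() the system provides. This is a sketch only, assuming a POSIX <fnmatch.h>; struct fn_case, thread_cases and count_fnmatch_mismatches() are names invented for the example, and the expected values are the ones argued for above from the spec.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>

struct fn_case {
    const char *pattern;  /* as fnmatch() sees it, after C literal decoding */
    const char *string;
    int expected;         /* 0 for match, FNM_NOMATCH otherwise */
};

static const struct fn_case thread_cases[] = {
    { "\\\\",      "\\",    0 },
    { "[]]",       "]",     0 },
    { "[!]]",      "]",     FNM_NOMATCH },
    { "[!]]",      "a",     0 },
    { "\\ab\\c",   "abc",   0 },
    { "\\[abc\\]", "[abc]", 0 },
    { "*\\a",      "a",     0 },
    { "\\[a]",     "[a]",   0 },
    { "\\[a]",     "a",     FNM_NOMATCH },
};

/* Returns how many of the cases disagree with the local fnmatch(). */
int count_fnmatch_mismatches(void)
{
    size_t i;
    int failures = 0;

    for (i = 0; i < sizeof thread_cases / sizeof thread_cases[0]; i++) {
        const struct fn_case *t = &thread_cases[i];

        assert(t->pattern != NULL && t->string != NULL);
        if (fnmatch(t->pattern, t->string, 0) != t->expected)
            failures++;
    }
    return failures;
}
```

A result of zero means the local implementation agrees with every case in the table; anything else is the number of cases to look at. Keeping the patterns in C source also avoids the shell stripping backslashes, which is what caused the mistaken Linux report earlier.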
Alex
alexfru (352)
12/1/2006 11:14:01 AM

On Thu, 30 Nov 2006 00:20:00 -0800 in comp.os.msdos.djgpp, "Alexei A.
Frounze" <alexfru@chat.ru> wrote:
>One more thing to consider, closing bracket as first character in the
>list/range inside the bracket expression:
>fnmatch("[]]", "]", 0) must return 0, doesn't
>fnmatch("[!]]", "]", 0) must return 1, does (by luck)
>fnmatch("[!]]", "a", 0) must return 0, doesn't
The for loop in rangematch() should be a do ... while.
>And one more, which seems to be partially wrong even in that same RH9 linux
>distro:
>fnmatch("\ab\c","abc",0) must return 0, does in linux, doesn't in DJGPP
>fnmatch("\[abc\]","[abc]",0) must return 0, doesn't in both linux and
>DJGPP -- the spec doesn't make an exception for the escaped opening bracket,
>\[ when describes the escaping option. Dunno, maybe it would be wise not to
>escape the bracket, but other characters should be made escapable w/o a
>problem.
Fixing escape handling and range handling should fix those.
>P.S. all the details obtained from fnmatch()'s description in The Single
>Unix Specification V3 2004 issue 6.
>P.P.S. of course there's a ton of what DJGPP's fnmatch() doesn't support,
>but the above things are pretty basic and it would be nice to have them
>handled properly, unless I overlook some major DOS-related issue for which
>it would be desirable to deviate from the spec.
Notice that escapes aren't handled in ranges; should they be?
Should escaped control characters be allowed?
Shouldn't all \ use in paths be conditional on FNM_NOESCAPE?
I'll need to checkout my SUS.
--
Thanks. Take care, Brian Inglis Calgary, Alberta, Canada
Brian.Inglis@CSi.com (Brian[dot]Inglis{at}SystematicSW[dot]ab[dot]ca)
fake address use address above to reply
Brian.Inglis (300)
12/1/2006 11:23:54 AM

Brian Inglis wrote:
> On Thu, 30 Nov 2006 00:20:00 -0800 in comp.os.msdos.djgpp, "Alexei A.
> Frounze" <alexfru@chat.ru> wrote:
>
>> One more thing to consider, closing bracket as first character in the
>> list/range inside the bracket expression:
>> fnmatch("[]]", "]", 0) must return 0, doesn't
>> fnmatch("[!]]", "]", 0) must return 1, does (by luck)
>> fnmatch("[!]]", "a", 0) must return 0, doesn't
>
> The for loop in rangematch() should be a do ... while.
There're many ways to skin the cat. I changed it to for(;;) and added a flag
to look for ] at first execution of the body.
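
Roughly, the for(;;)-plus-flag approach looks like the sketch below. It is not the actual patch and not the DJGPP rangematch(); sketch_bracket_match() and its return convention are assumptions made for illustration, and FNM_PATHNAME handling is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for rangematch(): p points just past the '['.
   Returns 1 if test is matched, 0 if not, and -1 if the bracket expression
   is unterminated. On success *end points just past the closing ']'. */
int sketch_bracket_match(const char *p, char test, const char **end)
{
    int negate = 0, matched = 0, first = 1;
    char c, hi;

    assert(p != NULL && end != NULL);
    if (*p == '!') {
        negate = 1;
        p++;
    }
    for (;;) {
        c = *p++;
        if (c == '\0')
            return -1;          /* no closing bracket */
        if (c == ']' && !first)
            break;              /* a ']' after the first item closes the list */
        first = 0;
        if (*p == '-' && p[1] != ']' && p[1] != '\0') {
            hi = p[1];          /* range such as a-z */
            p += 2;
            if (c <= test && test <= hi)
                matched = 1;
        } else if (c == test) {
            matched = 1;
        }
    }
    *end = p;
    return matched != negate;
}
```

The first flag is what lets ] be an ordinary member when it is the first item after [ or [!. A - that comes first or last in the list falls into the plain comparison branch and is therefore treated as a literal too.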
>> And one more, which seems to be partially wrong even in that same
>> RH9 linux distro:
>> fnmatch("\ab\c","abc",0) must return 0, does in linux, doesn't in
>> DJGPP fnmatch("\[abc\]","[abc]",0) must return 0, doesn't in both
>> linux and
>> DJGPP -- the spec doesn't make an exception for the escaped opening
>> bracket, \[ when describes the escaping option. Dunno, maybe it
>> would be wise not to escape the bracket, but other characters should
>> be made escapable w/o a problem.
>
> Fixing escape handling and range handling should fix those.
Yep.
>> P.S. all the details obtained from fnmatch()'s description in The
>> Single Unix Specification V3 2004 issue 6.
>> P.P.S. of course there's a ton of what DJGPP's fnmatch() doesn't
>> support, but the above things are pretty basic and it would be nice
>> to have them handled properly, unless I overlook some major
>> DOS-related issue for which it would be desirable to deviate from
>> the spec.
>
> Notice that escapes aren't handled in ranges; should they be?
No, they shouldn't. That's what the spec says. Also, if FNM_PATHNAME is set,
slashes in ranges must not match slashes in the name, but this check is
there already.
> Should escaped control characters be allowed?
They shouldn't, because the spec says that when escaping is enabled a
backslash followed by a char is that char. So, if you mean \n, \r, \t with their
C meaning - those aren't possible because another rule already covers
them. However, I don't think there's any harm in treating a single
backslash at the end of the pattern as a nul char when escaping is on. I'd
consider this a programming error anyway.
> Shouldn't all \ use in paths be conditional on FNM_NOESCAPE?
Well, here's a twist... Backslashes (\) aren't normally subdirectory/file
name separators. Normally, the forward slash (/) serves as the
separator and the backslash (\) is used for escaping. That's what SUS
has. In DJGPP, because of DOS conventions, an attempt was made to make it
possible to use both types of slashes as name separators. So in DJGPP the
backslash (\) has 2 functions, separator and escape char, and
those have to be distinguished appropriately. That's why I commented on the
existence of 3 possible options for a slash following an asterisk.
> I'll need to checkout my SUS.
Sure.
Alex
alexfru (352)
12/1/2006 12:02:11 PM

On Fri, 1 Dec 2006 03:14:01 -0800 in comp.os.msdos.djgpp, "Alexei A.
Frounze" <alexfru@chat.ru> wrote:
>Alexei A. Frounze wrote:
>> Alexei A. Frounze wrote:
>>> Brian Inglis wrote:
>>>> On Wed, 29 Nov 2006 11:34:38 -0500 in comp.os.msdos.djgpp, DJ
>>>> Delorie <dj@delorie.com> wrote:
>>>>
>>>>>
>>>>>> This function indicates if STRING matches the PATTERN. ..."
>>>>>>
>>>>>> So DJGPP says that "\" doesn't match "\\" while Linux says it
>>>>>> does. Well, I say DJGPP is right as the pattern says there should be
>>>>>> two
>>>>>> backslashes and you only provide one.
>>>>>
>>>>> Except that PATTERN is a regex influenced by FNM_NOESCAPE and
>>>>> FNM_PATHNAME, and STRING isn't. So a pattern of "\\" is a single
>>>>> escaped backslash, whereas a string of "\" is a single backslash.
>>>>> They should match.
>>>>
>>>> switch ((c = *pattern++))
>>>> {
>>>> ...
>>>> ...
>>>> ...
>>>> case '\\':
>>>> /*+++ pattern already post-incremented to point to next char */
>>>> if (!(flags & FNM_NOESCAPE) && pattern[1] && strchr("*?[\\",
>>>> pattern[1]))
>>>> /*+++ should be:
>>>> if (!(flags & FNM_NOESCAPE) && strchr("*?[\\", *pattern))
>>>> *+++ as end of input pattern will match end char in escapes string
>>>> */ {
>>>> /*+++ end of input pattern might be clearer with ! or == '\0' */
>>>> if ((c = *pattern++) == 0)
>>>> {
>>>> c = '\\';
>>>> --pattern;
>>>> }
>>>> if (c != *string++)
>>>> return FNM_NOMATCH;
>>>> break;
>>>> }
>>>
>>> I don't think the above is enough. There's another problem. With the
>>> above code you'd never see (c = *pattern++) == 0. My bet is that the
>>> intent was to treat the slash in the last character of pattern as an
>>> ordinary character. That would explain the {c = '\\'; --pattern;}
>>> thing along with the fallthrough behavior. But the code is broken in
>>> this place. Dunno if it was tested against the single unix spec or
>>> just a little bit to see that it seems to work (in some basic cases).
>>
>> One more thing to consider, closing bracket as first character in the
>> list/range inside the bracket expression:
>> fnmatch("[]]", "]", 0) must return 0, doesn't
>> fnmatch("[!]]", "]", 0) must return 1, does (by luck)
>> fnmatch("[!]]", "a", 0) must return 0, doesn't
>>
>> And one more, which seems to be partially wrong even in that same RH9
>> linux distro:
>> fnmatch("\ab\c","abc",0) must return 0, does in linux, doesn't in
>> DJGPP fnmatch("\[abc\]","[abc]",0) must return 0, doesn't in both
>> linux and DJGPP -- the spec doesn't make an exception for the escaped
>> opening
>> bracket, \[ when describes the escaping option. Dunno, maybe it would
>> be wise not to escape the bracket, but other characters should be
>> made escapable w/o a problem.
>
>Actually, I was wrong about linux failing fnmatch("\[abc\]","[abc]",0)==0 --
>I didn't put quotation marks around arguments to fnmatch that were passed to
>it from the command line and therefore fnmatch wasn't comparing the same
>thing (shell stripped some stuff). So, the above two things are only wrong
>in DJGPP.
>
>> Alex
>> P.S. all the details obtained from fnmatch()'s description in The
>> Single Unix Specification V3 2004 issue 6.
N.B. details from SUSV3 http://unix.org
"2.13.3 Patterns Used for Filename Expansion
The rules described so far in Patterns Matching a Single Character and
Patterns Matching Multiple Characters are qualified by the following
rules that apply when pattern matching notation is used for filename
expansion:
1. The slash character in a pathname shall be explicitly matched by
using one or more slashes in the pattern; it shall neither be matched
by the asterisk or question-mark special characters nor by a bracket
expression. Slashes in the pattern shall be identified before bracket
expressions; thus, a slash cannot be included in a pattern bracket
expression used for filename expansion. If a slash character is found
following an unescaped open square bracket character before a
corresponding closing square bracket is found, the open bracket shall
be treated as an ordinary character. For example, the pattern
"a[b/c]d" does not match such pathnames as abd or a/d. It only matches
a pathname of literally a[b/c]d.
2. If a filename begins with a period ( '.' ), the period shall be
explicitly matched by using a period as the first character of the
pattern or immediately following a slash character. The leading period
shall not be matched by:
* The asterisk or question-mark special characters
* A bracket expression containing a non-matching list, such as
"[!a]", a range expression, such as "[%-0]", or a character class
expression, such as "[[:punct:]]"
It is unspecified whether an explicit period in a bracket
expression matching list, such as "[.abc]", can match a leading period
in a filename."
"The flags argument shall modify the interpretation of pattern and
string. It is the bitwise-inclusive OR of zero or more of the flags
defined in <fnmatch.h>. If the FNM_PATHNAME flag is set in flags, then
a slash character ( '/' ) in string shall be explicitly matched by a
slash in pattern; it shall not be matched by either the asterisk or
question-mark special characters, nor by a bracket expression. If the
FNM_PATHNAME flag is not set, the slash character shall be treated as
an ordinary character.
If FNM_NOESCAPE is not set in flags, a backslash character ( '\' ) in
pattern followed by any other character shall match that second
character in string. In particular, "\\" shall match a backslash in
string. If FNM_NOESCAPE is set, a backslash character shall be treated
as an ordinary character.
If FNM_PERIOD is set in flags, then a leading period ( '.' ) in string
shall match a period in pattern; as described by rule 2 in the Shell
and Utilities volume of IEEE Std 1003.1-2001, Section 2.13.3, Patterns
Used for Filename Expansion where the location of "leading" is
indicated by the value of FNM_PATHNAME:
* If FNM_PATHNAME is set, a period is "leading" if it is the first
character in string or if it immediately follows a slash.
* If FNM_PATHNAME is not set, a period is "leading" only if it is the
first character of string.
If FNM_PERIOD is not set, then no special restrictions are placed on
matching a period."
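
The flag rules quoted above can be exercised with a few calls. The sketch below assumes a POSIX <fnmatch.h>; matches() and flags_behave_as_quoted() are hypothetical helpers written for this example, and the file names are arbitrary.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>

/* Thin wrapper so each call reads as "does string match pattern?". */
int matches(const char *pattern, const char *string, int flags)
{
    assert(pattern != NULL && string != NULL);
    return fnmatch(pattern, string, flags) == 0;
}

/* Returns 1 if the local fnmatch() behaves as the quoted text describes. */
int flags_behave_as_quoted(void)
{
    return  matches("*", "a/b", 0)                 /* slash is ordinary */
        && !matches("*", "a/b", FNM_PATHNAME)      /* '*' must not match '/' */
        &&  matches("*/*", "a/b", FNM_PATHNAME)    /* explicit slash in pattern */
        &&  matches("*", ".profile", 0)            /* no period restriction */
        && !matches("*", ".profile", FNM_PERIOD)   /* leading '.' needs explicit '.' */
        &&  matches(".*", ".profile", FNM_PERIOD)
        && !matches("*/*", "a/.b", FNM_PATHNAME | FNM_PERIOD);
}
```

The last case shows the interaction of the two flags: with FNM_PATHNAME set, a period right after a slash counts as leading, so the second * is not allowed to match it.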
>> P.P.S. of course there's a ton of what DJGPP's fnmatch() doesn't
>> support, but the above things are pretty basic and it would be nice
>> to have them handled properly, unless I overlook some major
>> DOS-related issue for which it would be desirable to deviate from the
>> spec.
>
>Another couple of examples revealing incorrect behavior of fnmatch() in
>DJGPP:
>fnmatch("*\a", "a", 0) must return 0, doesn't
>fnmatch("\[a]", "[a]", 0) must return 0, doesn't
>fnmatch("\[a]", "a", 0) must return 1, does
>
>So, as I understand it, the fnmatch() code flaws are:
>1. rangematch() doesn't allow for the following two patterns: []...] and
>[!]...] where ] is a valid char in the range
ISTM that the [- and [!- cases where - is treated as a literal
character aren't handled either.
>2. asterisk handling doesn't distinguish in the following the various
>options for c and what follows c:
> else if (isslash(c) && flags & FNM_PATHNAME)
> {
> if ((string = find_slash(string)) == NULL)
> return FNM_NOMATCH;
> break;
> }
>a) c=='/' // forward slash
>b) c=='\', (flags & FNM_NOESCAPE)!=0 // back slash
>c) c=='\', (flags & FNM_NOESCAPE)==0, isslash(pattern[1])==1 // any escaped
>slash
>3. '\\' handling is completely broken (wrong indices and logic). If escaping
>is on, it must fall through the case to default if '\\' is followed by
>anything from "\\?*[" to make sure those aren't interpreted as special chars
>again but instead are interpreted as ordinary chars. For all other chars
>(including '\0') it's better to break out from the case/switch to treat
>those chars as ordinary by the other existing cases.
Multiple slashes should also be treated the same as a single slash.
Leading period handling does not seem to be dealt with either, nor the
DOS _ equivalent: perhaps this should be handled similar to slash and
backslash, only treated the same as period when FNM_NOESCAPE and
FNM_PERIOD are specified.
--
Thanks. Take care, Brian Inglis Calgary, Alberta, Canada
Brian.Inglis@CSi.com (Brian[dot]Inglis{at}SystematicSW[dot]ab[dot]ca)
fake address use address above to reply
Brian.Inglis (300)
12/2/2006 8:33:03 AM