Problem with "parse arg"?


The following does not seem to work correctly. The second invocation of 
"argtest.rex" shown in the console log says "0 parms processed", whereas I 
believe that it should say "3 parms processed". Comments?  (Note: I have used 
both single quote and double quote in "argtest.rex", and you might need to use 
a monopitch font in order to distinguish them.)

-- from CyberSimian in the UK

-------------- console log ---------------------

E:\test\rexx>type argtest.rex
/* argtest.rex */
parse arg string
say 'parm list:'string
do i=1 by 1 while string\=''
  string=strip(string)
  if left(string,1)=='"'
  then parse var string '"' parm '"' string
  else parse var string parm string
  say 'parm="'parm'"'
end i
i=i-1
say i' parms processed.'

E:\test\rexx>argtest "abc" ""
parm list:"abc" ""
parm="abc"
parm=""
2 parms processed.

E:\test\rexx>argtest "" "abc" ""
parm list:
0 parms processed.

E:\test\rexx>rexxtry
REXX-ooRexx_3.2.0(MT) 6.02 30 Oct 2007
  rexxtry.rex lets you interactively try REXX statements.
    Each string is executed when you hit Enter.
    Enter 'call tell' for a description of the features.
  Go on - try a few...            Enter 'exit' to end.
parse version v
  ........................................... rexxtry.rex on WindowsNT
say v
REXX-ooRexx_3.2.0(MT) 6.02 30 Oct 2007
  ........................................... rexxtry.rex on WindowsNT
exit


Reply CyberSimian3 (13) 7/29/2008 8:21:07 AM

Greetings,

I get different results running your test program on Linux, ooRexx 3.2 as well.

mdlueck@aleks:~$ chmod 0700 argtest.rex
mdlueck@aleks:~$ ./argtest.rex "test" ""
parm list:test
parm="test"
1 parms processed.
mdlueck@aleks:~$ ./argtest.rex "" "test"
parm list:test
parm="test"
1 parms processed.
mdlueck@aleks:~$ rexx -v
Open Object Rexx Interpreter Version 3.2.0 for LINUX
Build date: Oct 30 2007
Copyright (c) IBM Corporation 1995, 2004.
Copyright (c) RexxLA 2005-2007.
All Rights Reserved.
This program and the accompanying materials
are made available under the terms of the Common Public License v1.0
which accompanies this distribution.
http://www.oorexx.org/license.html

Syntax is "rexx [-v] filename [arguments]"
or        "rexx [-e] program_string [arguments]".
mdlueck@aleks:~$

I do remember that arguments to programs at the command line are handled differently between Windows and Linux. I guess this is one such example.

-- 
Michael Lueck
Lueck Data Systems
http://www.lueckdatasystems.com/
Reply Michael 7/29/2008 10:58:54 AM


In Message-ID:<_JmdnTn3wJm5ThPVRVnyvQA@bt.com>,
"CyberSimian" <CyberSimian3@BeeTeeInternet.com> wrote: 

>The following does not seem to work correctly. The second invocation of 
>"argtest.rex" shown in the console log says "0 parms processed", whereas I 
>believe that it should say "3 parms processed". Comments?  (Note: I have used 
>both single quote and double quote in "argtest.rex", and you might need to use 
>a monopitch font in order to distinguish them.)

     I'm using Regina, but it still may shed some light.

     The author of Regina made a conscious decision to make the
Windows version run like the Linux version, even though Windows
could be "better".  It had to do with processing quotes on the
command line.

     My understanding is that double quotes are lost before being
passed as arguments.  This is unavoidable under Linux and by
choice under Regina.  This explains the behavior Michael noticed
under Linux.

     ooRexx obviously does see the double quotes.  It might be
worthwhile seeing what happens if you try it with single quotes
and this slight modification of your code:

/* argtest.rex */
parse arg string
say 'parm list:'string
do i=1 by 1 while string\=''
  string=strip(string)
  quote = left(string,1)
  if quote =='"' | quote == "'"
  then parse var string (quote) parm (quote) string
  else parse var string parm string
  say 'parm=<'parm'>'
end i
i=i-1
say i' parms processed.'


-- 
Arthur T. - ar23hur "at" intergate "dot" com
Looking for a z/OS (IBM mainframe) systems programmer position
Reply Arthur 7/29/2008 6:39:49 PM

Michael and Arthur -- thank you for interesting posts.  There are several 
points here:


(1)  I tried Arthur's modification to use the single quote, and of course that 
works.  But users invoking a command with a REXX front-end would not expect to 
enclose filespecs (or other parameters) containing blanks in single quotes, so 
I do not really see that as the right solution.

(2)  On Windows, the command (using double quotes):

      argtest "abc" ""

is processed correctly, but the command

     argtest "" "abc" ""

is not. This seems inconsistent.  If ooRexx can process the null string 
parameter correctly in the first case, why cannot ooRexx process it correctly 
in the second case? Looks like an implementation bug.

(3)  I am currently thinking about porting a DOS/OS2/Windows command-line app 
written in C to Linux, and have encountered the "vanishing double quotes 
problem" myself.  I use the Open Watcom C compiler, which has a bug in this 
area. Here is an interesting test to see if ooRexx on Linux and Regina have 
the same bug.  Try this command (using double quotes):

    argtest "a" "b    c" "d"

If ooRexx on Linux and Regina are working optimally, you should get "3 parms 
processed". If you get "4 parms processed", ooRexx and Regina have the same 
bug as Open Watcom.

And this is the explanation: the second parameter has SIGNIFICANT blanks. To 
preserve them, the program (Open Watcom/ooRexx/Regina) needs to check the 
parameter value and if it contains one or more blanks, the parameter needs to 
be enclosed in double quotes when reconstituting the original parameter list 
for "parse arg".  So, on Linux I would expect ooRexx (if working correctly) to 
generate this for the "parse arg" instruction:

    a "b    c" d

that is, the quotes around "a" and "d" have vanished, but the quotes around 
"b    c" are preserved.

Perhaps Michael/Arthur you could try this case on Linux and Regina 
respectively to see what they do? Thanks.

-- from CyberSimian in the UK 


Reply CyberSimian 7/30/2008 8:09:38 AM

In <_JmdnTn3wJm5ThPVRVnyvQA@bt.com>, on 07/29/2008
   at 09:21 AM, "CyberSimian" <CyberSimian3@BeeTeeInternet.com> said:

>E:\test\rexx>argtest "" "abc" ""
>parm list:
>0 parms processed.

On OS/2 I get

 [H:\]\temp\argtest "abc" ""
parm list:"abc" ""
parm="abc"
parm=""
2 parms processed.

[H:\]\temp\argtest argtest "" "abc" ""
 parm list:argtest "" "abc" ""
 parm="argtest"
 parm=""
 parm="abc"
 parm=""
 4 parms processed.

Have you tried slipping in a trace ?i and following the execution? How
about starting with the line

say 'argument string:' arg(1)

-- 
Shmuel (Seymour J.) Metz, SysProg and JOAT  <http://patriot.net/~shmuel>

Unsolicited bulk E-mail subject to legal action.  I reserve the
right to publicly post or ridicule any abusive E-mail.  Reply to
domain Patriot dot net user shmuel+news to contact me.  Do not
reply to spamtrap@library.lspace.org

Reply Shmuel 7/30/2008 1:09:17 PM

Arthur T. wrote:
> My understanding is that double quotes are lost before being passed
> as arguments.  This is unavoidable under Linux and by choice under
> Regina.  This explains the behavior Michael noticed under Linux.

How I long for a mechanism (perhaps some sort of system call) that would 
give my REXX access to the command *as typed by the user* before the 
system started making assumptions what the user really wanted.

I have a "select" REXX program for issuing ad-hoc database requests:

select * from table where year=2008 /* Windows OK, Linux not */
select * from table where year>1999 /* doesn't work anywhere */

A bit of history: Before Rexx started development inside IBM there were 
two versions of EXEC. They both worked in an environment where all 
commands were broken up into words with a maximum length of 8 
characters, and translated to upper case. EXEC2 had a function call 
which allowed you to get at the actual command line issued by the users. 
This allowed you to handle stuff that was impossible before.

Of course, REXX handled long strings, and lower/mixed case from the 
outset, but I still find myself longing for that function that told me 
precisely what the user typed.

Of course, that won't really help me. My output will still get 
redirected into a file named 1999 in the second example above.
Perhaps I could find a way to write to the console even when STDOUT is 
redirected to a file?

-- 
Steve Swift
http://www.swiftys.org.uk/swifty.html
http://www.ringers.org.uk
Reply Steve 7/31/2008 8:15:06 AM

On 30 juil, 10:09, "CyberSimian" <CyberSimi...@BeeTeeInternet.com>
wrote:
> (2)  On Windows, the command (using double quotes):
>
>       argtest "abc" ""
>
> is processed correctly, but the command
>
>      argtest "" "abc" ""
>
> is not. This seems inconsistent.  If ooRexx can process the null string
> parameter correctly in the first case, why cannot ooRexx process it correctly
> in the second case? Looks like an implementation bug.

Indeed, it's a bug... The Rexx function arg() returns 0 instead of 1.
I had a look at the source files:
ooRexx for Windows calls the Win32 function GetCommandLine(), which returns
the command-line string for the current process. For example:
c:\myexepath\rexx.exe c:\myscriptpath\argtest.rex "" "abc" ""
This string is parsed by the function getArguments() to build the argument
string that will be passed to the Rexx script.
After nextArgument() has been called twice to skip the executable and the
Rexx script, it is called again to test whether there are script arguments.
When the first argument is "", nextArgument() returns null because the
argument's length is 0, and the argument counter remains at zero. Not good...
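
A minimal sketch of the distinction that seems to be missing, offered as an illustration only. It is not the ooRexx source: next_token and count_tokens are hypothetical names, and the quoting rules are deliberately simplified (no escapes). The point is that the scanner reports "token found" separately from the token's length, so an empty "" still counts as an argument.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not the ooRexx source: scans the next
 * blank-delimited or double-quoted token starting at *cursor.
 * Returns 1 and sets *start and *len if a token was found (len may be 0
 * for ""), or 0 when only blanks remain. */
int next_token(const char **cursor, const char **start, size_t *len)
{
    const char *p;
    assert(cursor != NULL && *cursor != NULL);
    p = *cursor;
    while (*p == ' ')               /* skip leading blanks */
        p++;
    if (*p == '\0') {               /* nothing left: no token */
        *cursor = p;
        return 0;
    }
    if (*p == '"') {                /* quoted token */
        p++;
        *start = p;
        while (*p != '\0' && *p != '"')
            p++;
        *len = (size_t)(p - *start);
        if (*p == '"')              /* step over the closing quote */
            p++;
    } else {                        /* unquoted token */
        *start = p;
        while (*p != '\0' && *p != ' ')
            p++;
        *len = (size_t)(p - *start);
    }
    *cursor = p;
    return 1;
}

/* Counts tokens, including empty quoted ones. */
int count_tokens(const char *line)
{
    const char *start;
    size_t len;
    int n = 0;
    while (next_token(&line, &start, &len))
        n++;
    return n;
}
```

With this shape, a caller that only wants to know whether any script arguments exist tests the return value, not the length.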

Under Unix, the argument string is built by iterating over argv[] and
concatenating the values. The split made by the C runtime in argv[] is lost
because of the concatenation:
        if (arg_buffer[0] != '\0')     /* not the first one?                */
          strcat(arg_buffer, " ");     /* add a blank                       */
        strcat(arg_buffer, argv[i]);   /* add this to the argument string   */
Since there are no surrounding quotes around each argument, I don't see how a
Rexx script can rebuild the array from the string...
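
To make the loss concrete, here is a small self-contained sketch that joins parameters in the same manner as the fragment above. join_args is a hypothetical name, and the buffer-size checks are my own addition. The one parameter "a b" and the two parameters "a" and "b" produce the same string, and a leading null-string parameter disappears altogether.

```c
#include <assert.h>
#include <string.h>

/* Joins argv[first..argc-1] with single blanks, in the manner of the
 * fragment quoted above. join_args is a hypothetical name. The caller
 * must supply a buffer large enough for the result. */
void join_args(char *out, size_t out_size, int argc, char *argv[], int first)
{
    int i;
    size_t used = 0;
    assert(out != NULL && out_size > 0);
    out[0] = '\0';
    for (i = first; i < argc; i++) {
        size_t need = strlen(argv[i]) + (used > 0 ? 1 : 0);
        assert(used + need < out_size);   /* refuse to overflow */
        if (used > 0)                     /* buffer not empty? */
            out[used++] = ' ';            /* add a separating blank */
        strcpy(out + used, argv[i]);      /* add this parameter */
        used += strlen(argv[i]);
    }
}
```

This would be consistent with the Linux log earlier in the thread, where "" "test" showed up as just test.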

The bug under Windows can be fixed.
We should have the same behavior under Unix, but I don't know if a function
similar to GetCommandLine() exists under Unix...

And to be complete, the feature request 1167002 "Extend Rexx command line
argument access." should be implemented. Summary: The Parse instruction and
the Arg() builtin function should be extended to allow access to the "raw"
command arguments.

Jean-Louis
Reply Jean 7/31/2008 9:47:03 AM

On Thu, 31 Jul 2008 08:15:06 UTC, Steve Swift 
<Steve.J.Swift@gmail.com> wrote:

> Arthur T. wrote:
> > My understanding is that double quotes are lost before being passed
> > as arguments.  This is unavoidable under Linux and by choice under
> > Regina.  This explains the behavior Michael noticed under Linux.
> 
> How I long for a mechanism (perhaps some sort of system call) that would 
> give my REXX access to the command *as typed by the user* before the 
> system started making assumptions what the user really wanted.
> 
> I have a "select" REXX program for issuing ad-hoc database requests:
> 
> select * from table where year=2008 /* Windows OK, Linux not */
> select * from table where year>1999 /* doesn't work anywhere */
> 
> Of course, that won't really help me. My output will still get 
> redirected into a file named 1999 in the second example above.

Have you tried quoting the SQL string? Perhaps this would prevent the 
redirection.

> Perhaps I could find a way to write to the console even when STDOUT is 
> redirected to a file?

Have you tried writing to STDERR?

 


-- 

John Small

Reply John 7/31/2008 9:57:22 AM

Jean-Louis F. wrote:
> Under Unix, the arguments string is built by iterating over argv[] and
> concatenating the values.

In the interests of fixing these problems, I hope that the regular REXX 
followers will permit a bit of C...

If REXX.EXE gets invoked with the normal parameters for a C main program, you 
will have this as the declaration:

  int main(int argc, char* argv[]) {etc...}

argc specifies the number of parameters passed to the C program;
argv is an array of pointers to parameter values.

argv[0] is the filespec of the program invoked (i.e. the REXX interpreter from 
the point of view of the OS).

argv[1] is the name of the user-written REXX program to be interpreted.

So the parameters to the user's REXX program begin with argv[2].

If the Linux shell is working as I expect, this is the C code that you need in 
order to preserve significant blanks and significant double quotes:

#define BLANK_STR " "   /* single blank as null-term string */
#define BLANK_CHR ' '   /* single blank char */
#define QUOTE_STR "\""  /* double quote as null-term string */
arg_buffer[0]=0;
for (i=2; i<argc; i++)  /* argv[argc] is NULL, so stop before it */
{
  if ( argv[i][0]==0 /* null string? */
    || strchr(argv[i],BLANK_CHR)!=NULL ) /* or blank present? */
  {
    strcat(arg_buffer,QUOTE_STR); /* open quote */
    strcat(arg_buffer,argv[i]); /* parameter value */
    strcat(arg_buffer,QUOTE_STR); /* close quote */
    if (i<argc-1) /* not the last parameter? */
      strcat(arg_buffer,BLANK_STR); /* add separating blank */
  }
}

-- from CyberSimian in the UK 


Reply CyberSimian 7/31/2008 4:55:16 PM

Addendum:

As well as checking for the presence of a blank in the parameter value, there
are other characters that should probably be checked for too:

(1)  Asterisk.
If the user specifies an unquoted asterisk on the invocation of the REXX
program, the Linux shell will replace it by a list of all files and
directories in the current directory. If the user encloses the asterisk in
double quotes, the asterisk is passed to the REXX program unchanged (but with
the double quotes removed, of course).  I would expect that if the user
specified ABC*, the Linux shell would replace that by a list of files and
directories with names beginning with ABC (I have not actually tried that, so I may be
wrong).  Therefore, if the REXX program receives a parameter that contains an
asterisk, there must have been double quotes around it that should be
re-instated when "parse arg" is processed.

(2)  Redirection symbols "<" and ">".
On Windows, the redirection symbols are interpreted as simple data characters
if they are in quoted strings; I would assume that the same applies on Linux
(but I have never tried that, so I may be wrong).  So again, if the REXX
program receives a parameter containing either < or >, there must have been
double quotes around it originally, and they should be re-instated.

(3)  Other characters.
Same processing for other special characters (although none come to mind at
the moment).
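
The check described in this addendum could be sketched in C as follows. needs_quotes is a hypothetical name, and the character list is only the one discussed above (blank, asterisk, "<" and ">"), plus the null-string case from the earlier post. It rests on the assumption that such a character can only have reached the program through double quotes, which is not guaranteed.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the value is empty or contains a
 * character from the list discussed above (blank, asterisk, < or >),
 * otherwise 0. */
int needs_quotes(const char *value)
{
    assert(value != NULL);
    if (value[0] == '\0')
        return 1;                        /* null string must be shown as "" */
    return strpbrk(value, " *<>") != NULL;
}
```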

-- from CyberSimian in the UK




Reply CyberSimian 7/31/2008 5:16:47 PM

CyberSimian wrote:
> Jean-Louis F. wrote:
>> Under Unix, the arguments string is built by iterating over argv[] and
>> concatenating the values.
> 
> In the interests of fixing these problems, I hope that the regular REXX 
> followers will permit a bit of C...
> 
> If REXX.EXE gets invoked with the normal parameters for a C main program, you 
> will have this as the declaration:
> 
>   int main(int argc, char* argv[]) {etc...}
> 
> argc specifies the number of parameters passed to the C program;
> argv is an array of pointers to parameter values.
> 
> argv[0] is the filespec of the program invoked (i.e the REXX interpreter from 
> the point of view of the OS).
> 
> argv[1] is the name of the user-written REXX program to be interpreted.
> 
> So the parameters to the user's REXX program begin with argv[2].
> 
> If the Linux shell is working as I expect, this is the C code that you need in 
> order to preserve significant blanks and significant double quotes:
> 
> #define BLANK_STR " "   /* single blank as null-term string */
> #define BLANK_CHR ' '   /* single blank char */
> #define QUOTE_STR "\""  /* double quote as null-term string */
> arg_buffer[0]=0;
> for (i=2; i<argc; i++)  /* argv[argc] is NULL, so stop before it */
> {
>   if ( argv[i][0]==0 /* null string? */
>     || strchr(argv[i],BLANK_CHR)!=NULL ) /* or blank present? */
>   {
>     strcat(arg_buffer,QUOTE_STR); /* open quote */
>     strcat(arg_buffer,argv[i]); /* parameter value */
>     strcat(arg_buffer,QUOTE_STR); /* close quote */
>     if (i<argc-1) /* not the last parameter? */
>       strcat(arg_buffer,BLANK_STR); /* add separating blank */
>   }
> }
> 
> -- from CyberSimian in the UK 
> 
> 
There's one additional condition to worry about, and that's an argument 
that might have included an escaped quote (and a blank) originally. 
Just slapping a pair of quotes around the argument would not be entirely 
correct.

Rick
Reply Rick 7/31/2008 5:19:30 PM

Addendum 2:

On reviewing the suggested C code, I see that I did not get it quite right, as 
it does not handle "normal" parameters (i.e. ones that are not null and which 
do not contain a blank).  Correcting the code is left as an exercise for the 
reader...
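
For anyone who wants a starting point for that exercise, here is one possible sketch. rebuild_args is a hypothetical name, the size checks are my own addition, and embedded double quotes are still not handled. Parameters that are null or contain a blank are wrapped in double quotes; all others are copied unchanged, and a separating blank is added before every parameter except the first.

```c
#include <assert.h>
#include <string.h>

/* Rebuilds a single argument string from argv[first..argc-1].
 * Null-string parameters and parameters containing a blank are wrapped in
 * double quotes; all other parameters are copied unchanged.
 * rebuild_args is a hypothetical name; embedded quotes are not escaped. */
void rebuild_args(char *out, size_t out_size, int argc, char *argv[], int first)
{
    int i;
    assert(out != NULL && out_size > 0);
    out[0] = '\0';
    for (i = first; i < argc; i++) {
        int quote = (argv[i][0] == '\0') || (strchr(argv[i], ' ') != NULL);
        /* worst case for this parameter: blank + 2 quotes + value */
        assert(strlen(out) + strlen(argv[i]) + 3 < out_size);
        if (i > first)
            strcat(out, " ");       /* separating blank */
        if (quote)
            strcat(out, "\"");      /* open quote */
        strcat(out, argv[i]);       /* parameter value */
        if (quote)
            strcat(out, "\"");      /* close quote */
    }
}
```

Called with first=2, the parameters a, "b    c" and d come out as a "b    c" d, which the argtest program would then split into 3 parms.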

-- from CyberSimian in the UK 


Reply CyberSimian 7/31/2008 5:25:38 PM

Shmuel (Seymour J.) Metz wrote:
> [H:\]\temp\argtest argtest "" "abc" ""
> parm list:argtest "" "abc" ""
> parm="argtest"
> parm=""
> parm="abc"
> parm=""
> 4 parms processed.

You slipped in an extra "argtest" here, which is why it says "4 parms
processed".  However, your post prompted me to boot up my OS/2 Warp 4.0
system and try this on Classic REXX ("parse version" gives "REXXSAA 4.00 24 
Aug 1996").  On OS/2 Classic REXX the "argtest" program gives the expected 
result (3 parms processed) for the case that ooRexx gets wrong, i.e.:

argtest "" "abc" ""

-- from CyberSimian in the UK 


Reply CyberSimian 7/31/2008 6:19:44 PM

In <Hc6dnWKrdpfanw_VRVnyhgA@bt.com>, on 07/31/2008
   at 07:19 PM, "CyberSimian" <CyberSimian3@BeeTeeInternet.com> said:

>On OS/2 Classic REXX the "argtest" program gives the expected  result (3
>parms processed) for the case that ooRexx gets wrong, i.e.:

>argtest "" "abc" ""

My test, which gave a correct value, was using OREXX. What do you get if
you add a

say 'arg:' arg(1)

to argtest.cmd?

-- 
Shmuel (Seymour J.) Metz, SysProg and JOAT  <http://patriot.net/~shmuel>

Unsolicited bulk E-mail subject to legal action.  I reserve the
right to publicly post or ridicule any abusive E-mail.  Reply to
domain Patriot dot net user shmuel+news to contact me.  Do not
reply to spamtrap@library.lspace.org

Reply Shmuel 7/31/2008 7:04:03 PM

Rick McGuire wrote
> There's one additional condition to worry about, and that's an argument that 
> might have included an escaped quote (and a blank) originally. Just slapping 
> a pair of quotes around the argument would not be entirely correct.

Good point.  But how does one escape a double quote on the operating-system's 
or shell's command line?

Within REXX, one uses the doubling-up method to specify a double quote within 
a string delimited by double quotes (not sure who did that first -- PL/I?). 
On Windows, ooRexx gets the command line exactly as typed by the user, and 
apparently Windows does no escape processing:

E:\test\rexx>type escape.rex
/* escape.rex */
parse arg parms
say 'parms=>>>'parms'<<<'

E:\test\rexx>escape "abc"
parms=>>>"abc"<<<

E:\test\rexx>escape "a"c"
parms=>>>"a"c"<<<

E:\test\rexx>escape "a""c"
parms=>>>"a""c"<<<

E:\test\rexx>escape "a\"c"
parms=>>>"a\"c"<<<

I don't have ooRexx installed on my Open Suse 10.3 system (I'm a Linux tyro!), 
but given its "techie" origins, I would surmise that the Linux shells use the 
C backslash method to escape a double quote; i.e. if you ran "escape.rex" in 
Linux you would get this:

escape "a\"c"
parms=>>>a"c<<<

If it really does behave that way, the parameter received by the user's REXX 
program is different to the one received on Windows. And what do the Linux 
shells do with two consecutive double quotes within a parameter delimited by 
double quotes?  Does that become two separate parameters?

escape "a""c"
parms=>>>a c<<<

In the former case ooRexx could recognise that an escape sequence had been 
specified, and fix-up the parameter list appropriately, but it could not do 
that in the latter case.  So a solution that is watertight on all operating 
systems may not be possible in the absence of an OS or shell call that returns 
an exact copy of the original parameter list.

Be that as it may, I think that ooRexx could handle some of these cases 
correctly (e.g. parameter containing a blank), even if it cannot handle all of 
them correctly.
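
As a sketch of how the escaped-quote case might be fixed up when re-quoting, assuming the backslash convention surmised above is acceptable to the Rexx program reading the string. quote_with_escapes is a hypothetical name, and note that the argtest.rex parser earlier in this thread does not understand backslash escapes, so this only illustrates the idea. For what it is worth, in POSIX shells such as bash, "a\"c" does arrive as a"c, while "a""c" arrives as the single parameter ac (adjacent quoted strings are concatenated), so the second case cannot be told apart from a plain ac.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: writes value to out wrapped in double quotes,
 * putting a backslash before each embedded double quote or backslash.
 * Returns the number of characters written, excluding the terminator. */
size_t quote_with_escapes(char *out, size_t out_size, const char *value)
{
    size_t n = 0;
    const char *p;
    assert(out != NULL && value != NULL);
    /* worst case: every character escaped, plus two quotes and '\0' */
    assert(out_size >= 2 * strlen(value) + 3);
    out[n++] = '"';                    /* open quote */
    for (p = value; *p != '\0'; p++) {
        if (*p == '"' || *p == '\\')
            out[n++] = '\\';           /* escape marker */
        out[n++] = *p;
    }
    out[n++] = '"';                    /* close quote */
    out[n] = '\0';
    return n;
}
```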

-- from CyberSimian in the UK 


Reply CyberSimian 8/1/2008 9:07:03 AM

John Small wrote:
> Have you tried quoting the SQL string? Perhaps this would prevent the 
> redirection.

My brain knows how to do that, but the message rarely reaches my 
fingers. :-)
In other cases where I'm highly likely to enter redirection or wildcard 
characters, I resort to ignoring the command line and prompting for the 
command to run. I'd like to avoid that here.

>> Perhaps I could find a way to write to the console even when STDOUT is 
>> redirected to a file?
> Have you tried writing to STDERR?

Yes, that would bypass the immediate problem. I'm a bit reluctant to do 
this though, as I've had bad experiences with other programs which 
reversed the normal usage of STDOUT/STDERR.  It would certainly help me 
out when I accidentally entered a ">" in my command. Of course, if the 
">" got interpreted as redirection then my SQL would fail because the 
syntax "where colname >" is invalid.

I think it's time to bite the bullet and teach my fingers to handle 
correctly those characters liable to cause problems.

-- 
Steve Swift
http://www.swiftys.org.uk/swifty.html
http://www.ringers.org.uk
Reply Steve 8/1/2008 10:49:27 AM
