Running awk from Matlab system command

Hi,
I can run the following awk command from within Matlab (it counts how many times each distinct value occurs in column 2, skipping the header row):

system(['C:\cygwin\bin\gawk -F, ''NR!=1  { exchNo[$2]+=1 } END { for ( id in exchNo ) print id,exchNo[id] } ''  '  Filename ]);

However, when I add an if statement containing > or <, the Matlab system command appears to strip it out, and the command fails. For example, the following fails in Matlab (but works directly within Cygwin):

system(['C:\cygwin\bin\gawk -F, ''NR!=1  { if ($9<10) exchNo[$2]+=1 } END { for ( id in exchNo ) print id,exchNo[id] } ''  '  Filename ]);

Is there a way of stopping Matlab from removing > or <?

Thanks
Reply Bruce 2/6/2011 10:17:03 PM

Well, I don't believe you.
I use system all the time with similarly complicated commands, and
Matlab simply doesn't misbehave in the fashion you assert.
May I suggest that you do the following:
comm = ['C:\cygwin\bin\gawk -F, ''NR!=1  { if ($9<10) exchNo[$2]+=1 } END { for ( id in exchNo ) print id,exchNo[id] } ''  '  Filename ]
and check that what is printed out is exactly what you expect.
Then you can do this:
system(comm)
Reply TideMan 2/6/2011 10:48:53 PM


Hi,
Thanks for the suggestion. Actually, this is exactly what I do, but I get a syntax error:

gawk: NR!=1 { if ($9 exchNo[$2]+=1 }END { for ( id in exchNo ) print id,exchNo[id] } 
gawk:                          ^ syntax error

As you can see, the < and the text following it, up to and including the closing parenthesis, have been removed, which causes the error. If I eliminate the if statement, it works, and if I replace < with =, it also works. So it appears that the system command is removing the <.
Reply Bruce 2/7/2011 12:03:05 AM

When I use sed, I use a double quote, char(34), rather than a doubled
apostrophe, char(39) char(39).
Maybe Matlab is getting confused by a string within a string.
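
A minimal sketch of that suggestion, building the command with char(34) so that no apostrophes need doubling inside the Matlab string. The file name data.csv is a placeholder, not something from this thread. The reason it helps is an assumption rather than something confirmed here: on Windows, system hands the command to the command interpreter, which does not treat apostrophes as quotes and so may read an unquoted < or > as redirection.

```matlab
% Placeholder input file (hypothetical); substitute your own CSV path.
Filename = 'data.csv';

dq = char(34);   % the double-quote character "

% awk program: skip the header row, count column-2 values where column 9 < 10.
prog = 'NR!=1 { if ($9<10) exchNo[$2]+=1 } END { for ( id in exchNo ) print id,exchNo[id] }';

% Wrap the program in double quotes so that < stays inside a quoted argument.
comm = ['C:\cygwin\bin\gawk -F, ' dq prog dq ' ' Filename];

disp(comm)       % inspect the exact string before passing it to system
```

Keeping the awk program in its own variable also makes it easier to check the printed command against what works in the Cygwin shell.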
Reply TideMan 2/7/2011 12:19:35 AM

Hi,
Thanks, TideMan, you were correct. I was doubling the apostrophe ('') to embed a literal ' in the Matlab string, and this had been working fine for my applications until I added the conditional if statement, at which point the awk command no longer worked. Replacing each pair of apostrophes '' (i.e. char(39) char(39)) with a double quote " (i.e. char(34)) makes the awk command work.

In my search for ways to use awk in Matlab, a number of sites recommended the two apostrophes, but the example above suggests it is better to use double quotes.

So, for completeness, here is the correct code to count how many times each distinct value occurs in column 2 of a csv file (Filename), restricted to rows where column 9 is less than 10:

comm= ['C:\cygwin\bin\gawk -F, "NR!=1  { if ($9<10) exchNo[$2]+=1 } END { for ( id in exchNo ) print id,exchNo[id] } "  '  Filename ];

output=system(comm)
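
Note that with a single output argument, system returns the exit status rather than the text the command printed. A hedged sketch of capturing both with the two-output form follows; echo is a stand-in so the snippet runs without Cygwin, and the variable names are illustrative only.

```matlab
% Stand-in command so the example runs anywhere; replace with the gawk command string.
comm = 'echo hello';

% First output is the exit status (0 on success); second is the captured text.
[status, result] = system(comm);

if status ~= 0
    error('Command failed with status %d: %s', status, result);
end

disp(strtrim(result))
```

With the gawk command in place of the stand-in, result would hold the id and count pairs as text, ready for further parsing.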

Again, thanks for your help.
Reply Bruce 2/7/2011 4:06:04 AM
