How to process a line without losing it?

Hi all.

I am processing a file line by line. One of the transformations depends
on whether a word occurs in the next line. If I fetch the next line
with getline, I lose that line (it will not be processed
in the pattern loop), so I am currently doing something like:

   "sed -n " (FNR+1)"p" | getline

Is there any better way?

(the code is in
http://github.com/pmarin/tcl2doxygen/blob/master/tcl2doxygen
line 47)

Cheers
pmarin.
Reply pmarin 3/20/2010 9:34:33 PM

It surely depends on what you're trying to do, but for starters keep in mind 
that you can do "getline var" and the next line will be read into "var", 
leaving $0 with the value of the current line.
Then, again depending on your exact goal, it may be possible that the above 
isn't appropriate, or also that you don't even need getline. 
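
For what it's worth, here is a minimal sketch of the "getline var" idea, wrapped in a shell command so it can be pasted and run. The three-line input and the names peek and f are made up for illustration; they are not taken from the original script.

```shell
# Hypothetical input; peek and f are illustrative names.
out=$(printf 'class create bar {\n    superclass foo\n}\n' | awk '
$1 == "class" {
    name = $3
    # Read the next record into peek; $0 and the fields keep the class line.
    if ((getline peek) > 0) {
        split(peek, f)              # f[1], f[2], ... stand in for $1, $2, ...
        if (f[1] == "superclass")
            print "class " name " : public " f[2] " {"
        else
            print "class " name " {"
    }
    next
}
{ print }')
printf '%s\n' "$out"
```

Note that the peeked line is still consumed: NR and FNR advance, and the main rules never see that line, whatever it contained.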

Reply pk 3/20/2010 9:34:10 PM


It doesn't work. Note that I need to process the line, so it needs to be
$0. The problem is that if the next line is not what I am looking for,
I will lose it in the main loop. The code is:

function class ( class_name, super){
   class_name = $3
   "sed -n " (FNR+1)"p" | getline
   if ($1 == "superclass") {
      super = " : public " $2
   }
   print "class " class_name super " {"
   print "public:"
}
Reply pmarin 3/20/2010 9:53:54 PM

You don't call sed from awk!  Awk is not a shell.

Rather than look ahead to next line, remember the current line.

So first line you store it, rest of lines you have stored previous line 
and current line to work with.  Last thing end of line processing loop 
is last_line = current_line.

Avoid getline.
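
A rough sketch of that one-line delay, assuming a made-up three-line input and a made-up word ("gamma") to look for in the following line. The variable name prev is only illustrative.

```shell
out=$(printf 'alpha\nbeta\ngamma\n' | awk '
# Handle the line stored on the previous cycle, now that its successor is known.
NR > 1 { print prev (($0 ~ /gamma/) ? "  <- next line has gamma" : "") }
{ prev = $0 }                    # last step of every cycle: remember this line
END { if (NR > 0) print prev }   # the final line has no successor
')
printf '%s\n' "$out"
```

Every line still passes through the main loop exactly once, so nothing is lost; each line is simply handled one cycle late.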

Grant.
Reply Grant 3/20/2010 9:55:10 PM

It doesn't work well in my script.

The ideal solution would be:

 readline
 if ($1 == "superclass") {
    super = " : public " $2
 } else {
   FNR--
 }
 Of course I can't do FNR--
Reply pmarin 3/20/2010 10:06:42 PM

pmarin wrote:

> I doesn work.

That doesn't mean much.

> Note that I need to process the line so It needs to be $0.

No it doesn't. You can use an arbitrary variable and call split() on it.

> The problem is that if the next line is not what I am looking for
> I will lost it in the main Loop. The code is:
> 
> function class ( class_name, super){
>    class_name = $3
>    "sed -n " (FNR+1)"p" | getline
>    if ($1 == "superclass") {
>    super = " : public " $2
>    }
>    print "class " class_name super " {"
>    print "public:"
> }

It would be so much easier if you explained what you're trying to do, and 
what your code is supposed to do in simple terms of "given this input, I 
need to do this and that to get this output".
Reply pk 3/20/2010 10:06:55 PM

OK, I will explain what I want to do.
I want to transform Tcl source code into something similar to C++
that can be processed with doxygen. Usually there is a comment section
followed by the object to document, which can be a class, a function, a
global variable, etc.

The documentation of a class is in this style:

##
 # This is a class
 #
 #
 class create foo {

}

##
 # this is a subclass of foo
 #
 class create bar {
    superclass foo
}

And I want to convert it into:

/**
 * this is a class
 *
 */
class foo {
  public:

}

/**
 * this is a subclass of foo
 */
 class bar : public foo {
  public:
}

My code, which already works, is:
http://github.com/pmarin/tcl2doxygen/blob/master/tcl2doxygen

I want to know if there is an idiom in Awk that lets me do what I am
doing in the function class (line 47).
Reply pmarin 3/20/2010 10:30:20 PM

Keep in mind that parsing programming languages based on regular expressions 
is seldom effective, and cannot generally be relied upon (in other words, 
you'd need a real parser that knows the grammar etc.). That said, if the 
source is written in a consistent style, or has been run through an indenter 
or beautifier that makes its format reasonably predictable, it's then possible 
to do some simple processing based on regular expressions. I'm going to 
assume that this is the case with your code.

Essentially, you need to call getline to read the next line and see if this 
class has a superclass.

The function class() is called with $0 being the class declaration line.

So all you need to do what you want is this:

function class(   class_name, super, var, temp) {
  class_name = $3

  # read the next line into var; $0 still holds the class line
  getline var
  if (var ~ /superclass/) {
    # extract the superclass
    split(var, temp)
    super = " : public " temp[2]
  }
  print "class " class_name super " {"
  print "public:"
}

Alternatively, you could save the information you need from the first $0, 
and call getline:

function class(   class_name, super) {
  class_name = $3

  # plain getline overwrites $0 with the next line
  getline
  if ($1 == "superclass") {
    # extract the superclass
    super = " : public " $2
  }
  print "class " class_name super " {"
  print "public:"

}

I really don't see why you need to call sed, given that getline already 
reads the next line. Or there's something else you didn't tell?

getline changes FNR. Ok. Where else are you depending on FNR in your script? 
I don't see it anywhere.
Reply pk 3/20/2010 10:43:32 PM

If I do getline var, FNR is still incremented and the line read into var
never reaches the main loop, so it doesn't work in my script:

  >echo "1\n2\n3" | awk ' { getline var } {print} '
  >1
  >3
Reply pmarin 3/20/2010 10:43:53 PM

pmarin wrote:

>> getline changes FNR. Ok. Where else are you depending on FNR in your
>> script? I don't see it anywhere.
> 
> If FNR is incremented, the next line seen by the main loop (the pattern
> loop) will be the one after the line read by getline. For example, in
> 
> ##
>  # a class
>  #
>  class create foo {
>     ##
>     # A method
>     #
>     method bar {
>     }
> }
> 
> the wrong result will be:
> 
> /**
>  * a class
>  */
>  class foo {
>  }
> 
> The method's documentation is not generated because with getline you
> obtain "##" instead of superclass. The main loop looks for lines that
> begin with "##" (line 71), but since FNR has been incremented it will
> obtain "#", so the documentation of the method is skipped.

Ok, I see now (you could have included that sample in your earlier example).
Given that the code is already hackish, I'd say that to keep the same style 
you can do this and make the main loop happy:

function class(   class_name, super, old) {
  class_name = $3
  old = $0

  getline
  if ($1 == "superclass") {
    # extract the superclass
    super = " : public " $2
  } else {
    $0 = old        # restore $0
  }
  print "class " class_name super " {"
  print "public:"
}

If I understand correctly, if the next line is a superclass declaration, it 
should be consumed, otherwise it should not. If I'm wrong, just move the 
assignment $0 = old at the end of the function.
Reply pk 3/20/2010 11:04:56 PM

If FNR is incremented, the next line seen by the main loop (the pattern
loop) will be the one after the line read by getline. For example, in

##
 # a class
 #
 class create foo {
    ##
    # A method
    #
    method bar {
    }
}

the wrong result will be:

/**
 * a class
 */
 class foo {
 }

The method's documentation is not generated because with getline you
obtain "##" instead of superclass. The main loop looks for lines that
begin with "##" (line 71), but since FNR has been incremented it will
obtain "#", so the documentation of the method is skipped.
Reply pmarin 3/20/2010 11:07:14 PM

pk wrote:

> function class(   class_name, superclass, old) {
>   class_name = $3
>   old = $0
> 
>   getline
>   if ($1 == "superclass") {
>     # extract the superclass
>     super = ": public " $2
>   } else {
>     $0 = old        # restore $0
>   }
>   print "class " class_name super " {"
>   print "public:"
> }
> 
> If I understand correctly, if the next line is a superclass declaration,
> it should be consumed, otherwise it should not. If I'm wrong, just move
> the assignment $0 = old at the end of the function.

Ok that is not going to work as the line is lost anyway when the function 
returns, I hadn't paid too much attention to the main loop (I thought it was 
using getline all the time).

I think the code should be restructured to either not use getline at all, or 
to use it consistently. In other words, either use awk's implicit loop, or 
use getline only.
Reply pk 3/20/2010 11:15:40 PM

On Sat, 20 Mar 2010 15:06:42 -0700, pmarin wrote:

> It doesn't work well in my script.
> 
> The ideal solution should be:
> 
>  readline
>  if ($1 == "superclass") {
>     super = " : public " $2
>  } else {
>    FNR--
>  }
>  Of course I can't do FNR--

No, the ideal script is one that exploits the capabilities of the 
language.  If you insist on doing it some way that doesn't work within 
the language, then you need to use a different language, even if you have 
to write your own.

One way to do it in awk is illustrated in this test script:

{
	if( $1 ~ /foo/ ) {
		print "Matching line: "$0
		ThisLine = $0
		$0 = PrevLine
		print "$2 of previous line: "$2
		$0 = ThisLine
	}
	print "Previous Line: " PrevLine
	PrevLine = $0
	print "This line: " $0
}

Screen dump:

[tdavis@localhost ~]$ awk -f awktest.awk
xxxx yyyy   
Previous Line: 
This line: xxxx yyyy
foo bar
Matching line: foo bar
$2 of previous line: yyyy
Previous Line: xxxx yyyy
This line: foo bar
^C
[tdavis@localhost ~]$ 

-- 
Ted Davis (tdavis@mst.edu)
Reply Ted 3/20/2010 11:20:21 PM

It doesn't work because "##" is consumed by getline, so in the next cycle
$0 will be initialized with the following line, "#". For example:

echo "1\n2\n3" | awk '{ old = $0; getline; $0 = old} {print}'
>1
>3

Line 2 is not processed in the main loop.
Reply pmarin 3/20/2010 11:26:30 PM

pmarin wrote:


> It doesn't work because "##" is consumed by getline, so in the next cycle
> $0 will be initialized with the following line, "#". For example:
> 
> echo "1\n2\n3" | awk '{ old = $0; getline; $0 = old} {print}'
>>1
>>3
> 
> The line 2 is not processed in the main loop.

Right, I noticed that after sending the previous message. As I said, you're 
hackishly mixing awk's implicit loop and getline, which leads to trouble. 
Add that you're trying to do a job that should be done by a parser, like 
managing nested comments and code block, and you have a nice picture.

What you'd need here is a sort of "ungetline", which does not exist in awk, 
or (better) restructure your code, or even better use a proper parser.
If your current hack of calling sed to read the next line without changing 
awk's state is good enough for you, I'd say stick to that.
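
For illustration only, an "ungetline" can be emulated if the script gives up the implicit loop and reads everything through a wrapper. This is a sketch under that assumption: readline() and unreadline() are hypothetical helpers, not awk built-ins, and the four-line input is made up.

```shell
out=$(printf 'class create foo {\n##\n# A method\n}\n' | awk '
# readline() and unreadline() are hypothetical helpers, not awk built-ins.
function readline() {
    # Serve the pushed-back line first, if there is one.
    if (have_saved) { $0 = saved; have_saved = 0; return 1 }
    return ((getline) > 0)
}
function unreadline() { saved = $0; have_saved = 1 }

BEGIN {
    # All input is read here, so no pattern rules are used at all.
    while (readline()) {
        if ($1 == "class") {
            name = $3; super = ""
            if (readline()) {
                if ($1 == "superclass") {
                    super = " : public " $2
                } else {
                    unreadline()     # not a superclass line: hand it back
                }
            }
            print "class " name super " {"
            print "public:"
        } else {
            print
        }
    }
}')
printf '%s\n' "$out"
```

The "##" line that follows the class header is pushed back and then handled by the next pass of the loop, so the method comment is not skipped. The price is that every rule of the original script would have to move inside that while loop.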
Reply pk 3/20/2010 11:26:35 PM

Note that my script works perfectly (with a standard coding style)
using the sed pipeline
"sed -n " (FNR+1)"p" | getline
I just wanted to know if there is an Awk idiom for this.

Thanks all.
Reply pmarin 3/20/2010 11:35:05 PM

To give you some hints, a technique to tackle problems where we depend on 
information that we don't yet know for the current line is to accumulate 
blocks of text (also collecting information as you go along), and when 
you've reached the end, and have all the information you need, format the 
block how it should be and finally print it at once.

For example, simplifying, you have either

class bar {
   superclass foo
}

that should become

class bar : public foo {
  public:
}

or you can also have

class bar {

}

and that should become

class bar {
  public:
}

We can read the whole block and save the relevant information (ie class name 
here - there can be more), and at the same time note whether there's a 
superclass line or not in the block. When we reach the end, we print the 
right output based on the complete information we have gathered.

Something like this:

awk '/^class/{super=""; class_name=$2; next}
     $1 == "superclass" {super = $2; next}
     /^}$/{
       # add the inheritance clause only if a superclass line was seen
       print "class " class_name (super != "" ? " : public " super : "") " {"
       print "  public:"
       print "}"
     }'

However, you immediately see that the above code makes a number of very 
fragile assumptions, first and foremost about where a block begins and ends. 
A misplaced brace, and it fails.

In your case, it's all made more complicated by the fact that you can have 
different types of objects (not only classes), you have comments and you can 
have nested blocks, which makes it even harder.

With a lot of effort and a whole load of assumptions you could probably 
still write something that mostly does the job, but if you did it correctly 
and robustly, you'd find that what you wrote would not look very different 
from a real parser.
Reply pk 3/20/2010 11:57:45 PM

"pmarin" <pacogeek@gmail.com> wrote in message 
news:2655baef-8a77-4470-b6bc-7d62948a20f9@k17g2000yqb.googlegroups.com...
> The documentation of a class is in this style:
>
> ##
>  # This is a class
>  #
>  #
>  class create foo {
>
> }
>
> ##
>  # this is a subclass of foo
>  #
>  class create bar {
>     superclass foo
> }
>
> And I want to convert it into:
>
> /**
>  * this is a class
>  *
>  */
> class foo {
>   public:
>
> }
>
> /**
>  * this is a subclass of foo
>  */
>  class bar : public foo {
>   public:
> }

I haven't looked at your original script, but just based on this:

----------------------------

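# Turn each "#" comment line into its own C comment.
# match() sets RSTART/RLENGTH; the ~ operator alone does not.
$1 ~ /^#+$/ && match($0, /#+[ \t]*/) { $0 = "/* " substr($0, RSTART+RLENGTH) " */" }

$1 == "class" { name = $3; next }

name {
    if ( $1 == "superclass" ) {
        print "class " name " : public " $2 " {"
        $0 = "  public:"      # the superclass line is consumed
    } else {
        print "class " name " {"
        print "  public:"     # $0 is kept, so the current line is still printed
    }
    name = ""
}

{ print }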

-----------------------

I suppose there's a lot more to it, and this makes every "#" line into its 
own comment (I didn't want to fiddle with watching for a "#/not#" change, 
and compilers shouldn't mind).
But it pretty much does what you outline here, I think (untested).

- Anton Treuenfels 

Reply Anton 3/21/2010 1:51:04 AM

pmarin wrote:
> On Mar 20, 11:06 pm, pk <p...@pk.invalid> wrote:
>> pmarin wrote:
>>> On Mar 20, 10:34 pm, pk <p...@pk.invalid> wrote:
>>>> pmarin wrote:
>>>>> Hi all.
>>>>> I am processing a file line by line. One of the transformation depends
>>>>> of the existence of a word in the next line. If I obtain the next line
>>>>> with getline I will going to lost the line (it will not be processed
>>>>> in the pattern loop) so I am currently doing something like:
>>>>> "sed -n " (FNR+1)"p" | getline
>>>>> Is there any better way?
>>>> It surely depends on what you're trying to do, but for starters keep in
>>>> mind that you can do "getline var" and the next line will be read into
>>>> "var", leaving $0 with the value of the current line.
>>>> Then, again depending on your exact goal, it may be possible that the
>>>> above isn't appropriate, or also that you don't even need getline.
>>> I doesn work.
>> That doesn't mean much.
>>
>>> Note that I need to process the line so It needs to be $0.
>> No it doesn't. You can use an arbitrary variable and call split() on it.
>>
>>> The problem is that if the next line is not what I am looking for
>>> I will lost it in the main Loop. The code is:
>>> function class ( class_name, super){
>>>    class_name = $3
>>>    "sed -n " (FNR+1)"p" | getline
>>>    if ($1 == "superclass") {
>>>    super = " : public " $2
>>>    }
>>>    print "class " class_name super " {"
>>>    print "public:"
>>> }
>> It would be so much easier if you explained what you're trying to do, and
>> what your code is supposed to do in simple terms of "given this input, I
>> need to do this and that to get this output".
> 
> OK, I will explain what I want to do.
> I want to transform Tcl source code into something similar to C++
> that can be processed with doxygen. Usually there is a comment section
> and then the object to document, which can be a class, a function, a
> global variable, etc.
> 
> The documentation of a class, in doxygen style, looks like this:
> 
> ##
>  # This is a class
>  #
>  #
>  class create foo {
> 
> }
> 
> ##
>  # this is a subclass of foo
>  #
>  class create bar {
>     superclass foo
> }
> 
> And I want to convert it into:
> 
> /**
>  * this is a class
>  *
>  */
> class foo {
>   public:
> 
> }
> 
> /**
>  * this is a subclass of foo
>  */
>  class bar : public foo {
>   public:
> }
> 
> My code that already works is:
> http://github.com/pmarin/tcl2doxygen/blob/master/tcl2doxygen
> 
> I want to know if there is an idiom in Awk that lets me do what I am
> doing in the function class (line 47)

What you want is delayed processing of class headers.

 $1 ~ /^class$/ { delayed = 1 ; line = $0 ; next }

 delayed && $1 !~ /^superclass$/ { delayed = 0 ; do_1(line,$0) ; next }

 delayed && $1 ~ /^superclass$/ { delayed = 0 ; do_2(line,$0) ; next }

 { here goes the non-specific handling of other lines }

I've used do_1() and do_2() to mark the locations where you'll do your
processing; do_2() will construct the  class : public superclass  header
from the stored 'line' and the /superclass/ information in $0, while do_1()
will create a non-derived class header from the variable 'line' and do a
separate processing of the current line, which is available in $0.

Janis
Reply Janis 3/21/2010 8:13:59 AM
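
The delayed-processing rules in the post above can be tried out with a sketch like the following. It is only an illustration: the bodies of do_1() and do_2(), the wrapper name delayed_class, and the sample input are assumptions, not code from the tcl2doxygen script. A class line is held back in 'line' and the header is only printed once the following line has been seen.

```shell
# Sketch only: delayed_class, the do_1/do_2 bodies and the sample input
# are illustrative assumptions, not part of the original script.
delayed_class() {
    awk '
    # Hypothetical helper: plain class header, then pass the current line on.
    function do_1(hdr, cur,    f) {
        split(hdr, f, " ")
        print "class " f[3] " {"
        print "  public:"
        print cur
    }
    # Hypothetical helper: derived class header built from both lines.
    function do_2(hdr, cur,    f, g) {
        split(hdr, f, " ")
        split(cur, g, " ")
        print "class " f[3] " : public " g[2] " {"
        print "  public:"
    }
    # Hold the class line back until the next line has been read.
    $1 == "class" { delayed = 1; line = $0; next }
    delayed && $1 != "superclass" { delayed = 0; do_1(line, $0); next }
    delayed && $1 == "superclass" { delayed = 0; do_2(line, $0); next }
    { print }
    '
}

printf '%s\n' 'class create foo {' '}' 'class create bar {' '    superclass foo' '}' | delayed_class
```

One limitation of this sketch: a class line that is the very last line of the input is never flushed, so a real script would probably want an END rule for that case.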

Sat, 20 Mar 2010 15:06:42 -0700, pmarin did cat :

> On Mar 20, 10:55 pm, Grant <o...@grrr.id.au> wrote:
>> On Sat, 20 Mar 2010 14:34:33 -0700 (PDT), pmarin <pacog...@gmail.com>
>> wrote:
>> >Hi all.
>>
>> >I am processing a file line by line. One of the transformation depends
>> >of the existence of a word in the next line. If I obtain the next line
>> >with getline I will going to lost the line (it will not be processed
>> >in the pattern loop) so I am currently doing something like:
>>
>> >   "sed -n " (FNR+1)"p" | getline
>>
>> >Is there any better way?
>>
>> You don't call sed from awk!  Awk is not a shell.
>>
>> Rather than look ahead to next line, remember the current line.
>>
>> So first line you store it, rest of lines you have stored previous line
>> and current line to work with.  Last thing end of line processing loop
>> is last_line = current_line.
>>
>> Avoid getline.
>>
>> Grant.
>>
>>
>>
>> >(the code is in
>> >http://github.com/pmarin/tcl2doxygen/blob/master/tcl2doxygen line 47)
>>
>> >Cheers
>> >pmarin.
>>
>>
>>
> It doesn't work well in my script.
> 
> The ideal solution should be:
> 
>  readline
>  if ($1 == "superclass") {
>     super = " : public " $2
>  } else {
>    FNR--
>  }
>  Of course I can't do FNR--

Here's the most "awkish" and clean way I can think of, starting from the
assumption that the code you have with the getline works for you
(I mention this because it doesn't for me; maybe just a cut-and-paste problem? *)
So, on that assumption, I'd use a pre-gobbler. Here's the diff
against your code found at:
http://github.com/pmarin/tcl2doxygen/blob/master/tcl2doxygen

The 'tee' feeds the input to the awk code twice. The first pass
is simply "gobbled" into the 'mem' array (indexed at FNR-1, a constant offset);
the next pass is your original code, except that the sed|getline is replaced by
a lookup of the prefetched array element:
-----------
$ diff -U 1 tc2do.sh.orig tc2do.sh
--- tc2do.sh.orig       2010-03-21 10:36:11.682902586 +0100
+++ tc2do.sh    2010-03-21 10:36:35.394150204 +0100
@@ -11,4 +11,4 @@
 print
-} ' | awk '
- 
+} ' | tee - | awk '
+NR==FNR{mem[FNR-1]=$0}
 function arg_list ( i, arg, val, args) {
@@ -46,3 +46,3 @@
        class_name = $3
-       "sed -n " (FNR+1)"p" | getline
+       $0=mem[FNR]
        if ($1 == "superclass") {
@@ -99,3 +99,3 @@
 if ($2 == ";")
-print "};" }'
+print "};" }' - -
-----------

(*) Just an aside, since your code works for you :-)
but here your original code, run on your posted example, gives only the
partial output of the comments and no "do code"!, i.e.:
-----------
/**
 * This is a class
 *
 *
*/

/**
 * this is a a subclass of foo
 *
*/
-----------
Reply Loki 3/21/2010 9:47:24 AM


Janis's solution works; I have removed the second next:

 $1 == "class" { delayed = 1 ; line = $0 ; next }

 delayed && $1 != "superclass" { delayed = 0 ; do_1(line,$0)}

 delayed && $1 == "superclass" { delayed = 0 ; do_2(line,$0); next}

 { here goes the non-specific handling of other lines }

Thank you.
Reply pmarin 3/21/2010 10:56:32 AM


Loki: Sorry, it's "oo::class" instead of "class".
Reply pmarin 3/21/2010 11:06:33 AM

Sun, 21 Mar 2010 04:06:33 -0700, pmarin did cat :

> Loki: Sorry, Its "oo::class" instead of "class".

OK. Anyway, the output of the 'tee' idea seems to be seen as one flat file,
so it wouldn't work without temp FIFOs or files. Here's a
quick (tested, working) version with temp files:
------------
#! /bin/sh
 
# Author: Francisco José Marín Pérez (pmarin.mail at gmail.com)
# See LICENSE for copyright and license details.
# Main webpage: http://github.com/pmarin/tcl2doxygen
touch /dev/shm/ff1 /dev/shm/ff2 && rm /dev/shm/ff1 /dev/shm/ff2 && touch /dev/shm/ff1 /dev/shm/ff2 || exit
cat $1 | awk '
/^$/ { next }
{
        gsub("{", " { ")
        gsub("}", " } ")
        print
} ' | tee  /dev/shm/ff1 > /dev/shm/ff2
awk '
NR==FNR{mem[FNR-1]=$0;next}
function arg_list ( i, arg, val, args) {
i = 4
while (i <= (NF -2)){
if ($i != "{") {
args = args "type " $i
} else {
arg = $++i
val = $++i
if (val != "}") {
args = args " optional " arg " = " val
i++
} else {
args = args " optional " arg
}
}
if (i <= (NF -3)){
args = args ","
}
i++
}
return args
print "the arguments are: " args
}
 
function proc ( args, progname){
proc_name = $2
args = arg_list()
print "string " proc_name " (" args "){"
print "\n}"
}
 
function class ( class_name, super){
        class_name = $3
        $0=mem[FNR]
        if ($1 == "superclass") {
                super = " : public " $2
        }
        print "class " class_name super " {"
        print "public:"
}
 
function method ( args, progname){
proc_name = $2
args = arg_list()
print "string " proc_name " (" args ");"
}
 
function constructor ( args, progname){
proc_name = $1
args = arg_list()
print proc_name " (" args ");"
}
 
function set ( ) {
print "type " $2 " = " $3 ";"
}
 
$1 == "##" {
print "/**"
getline
while ($1 == "#") {
$1 = " *"
print
getline
}
print "*/"
if ($1 == "proc") {
proc()
} else if ($1 == "set") {
set()
} else if ($1 == "oo::class") {
class();
} else if ($1 == "method") {
method()
} else if ($1 == "constructor") {
constructor()
} else if ($1 == "destructor") {
print "destructor();"
} else if ($1 == "namespace") {
print "namespace " $3 " {"
}
print ""
}
 
$1 == "}" {
        if ($2 == ";")
print "};" }' /dev/shm/ff1 /dev/shm/ff2
rm /dev/shm/ff1 /dev/shm/ff2
 
# BUGS

------------

Reply Loki 3/21/2010 11:38:09 AM
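
The two-pass lookahead used in the script above can be reduced to a small sketch. This is an illustration under stated assumptions: the wrapper name lookahead_class and the sample input are made up here, and the rule that drops "superclass" lines is a simplification rather than something the original script does. The same regular file is named twice on the command line, so the first pass can store every line under the number of the line before it.

```shell
# Sketch only: lookahead_class and the sample file are illustrative assumptions.
lookahead_class() {
    # $1 must be a regular file; it is named twice so awk reads it twice.
    awk '
    # First pass: store each line under the number of the line before it,
    # so that mem[n] holds line n+1.
    NR == FNR { mem[FNR - 1] = $0; next }
    # Second pass: peek at the following line without consuming it.
    $1 == "class" {
        super = ""
        if (split(mem[FNR], nxt, " ") > 0 && nxt[1] == "superclass")
            super = " : public " nxt[2]
        print "class " $3 super " {"
        next
    }
    # Simplification: the superclass line was already folded into the header.
    $1 == "superclass" { next }
    { print }
    ' "$1" "$1"
}

tmp=$(mktemp)
printf '%s\n' 'class create bar {' '    superclass foo' '}' > "$tmp"
lookahead_class "$tmp"
rm -f "$tmp"
```

Because the next line is looked up in 'mem' rather than read with getline, the main loop still sees it on its own turn. The price is that the whole input is held in memory and must come from a file that can be read twice, not from a pipe.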

Here is my solution: it is neither elegant nor the shortest, but
it is straightforward and (hopefully) easy to understand. My
strategy: when the line contains "class create", do not process it right
away. Instead, save the name of the class so we can process it when the
next line comes.

# ----------------------------------------------------------------------
# Pattern: Beginning of the class comment
# Action: Save "/**" as the first comment line, then skip all further
# processing via the "next" statement.
/^## *$/ {
    count = 0;
    comments[count++] = "/**"
    next
}

# Pattern: A space and comment char
# Action: Convert the " # blah..." to " * blah..." and save it in the
# comments[] array, then skip all further processing
/^ #/ {
    sub(/^ #/, " *")
    comments[count++] = $0
    next
}

# Pattern: Line which contains the class declaration
# Action: Print out all comments so far and save the name of the class,
# then skip all further processing
/class create/ {
    for (i = 0; i < count; i++) {
        print comments[i]
    }
    print " */"

    # Don't print out the class declaration yet. Defer it until the next
    # line is processed. Just save the class name for now.
    className = $3;
    next
}

# Pattern: "superclass"
# Action: Print the class and its super class, reset the class name,
# then skip all further processing
$1 == "superclass" {
    print "class", className, ": public", $2, "{"
    className = ""
    next
}

# Pattern: The line after the "class" line, but does not contain
# superclass
# Action: print out the class declaration
className != "" {
    print "class", className, "{"
    className = ""
    next
}

# Pattern: Any other lines
# Action: print
1

# ----------------------------------------------------------------------
Reply Hai 4/27/2010 3:04:34 AM

A couple of years ago, I was looking for a documentation solution for my
Tcl scripts & libraries, and I faced the same problem. Then I ran
across tcl-dox. Besides being unable to handle class declarations,
tcl-dox generates documents which show function prototypes in C++
instead of in Tcl. At that point, I dropped the idea of using tcl-dox
altogether, went with Natural Docs (http://naturaldocs.org/), and
lived happily ever after.
Reply Hai 4/27/2010 3:13:13 AM
