AWK internal design
I think that AWK, being one of the fastest (if not THE fastest) generic
text-processing tools on earth, gives an example of how good
engineering plus careful programming and profiling can produce
outstanding results.
I am interested in efficient text processing (gigabytes of text),
mostly indexing and matching.
I was wondering if I can learn the efficient processing techniques from
the awk experience. It is possible (but hard) to do it from the source
code, but I was wondering if there are any documents (articles, blogs,
group posts) which explain how AWK is designed internally.
Do you know of any such documents?
Thanks.
Denis
denis.mikhalkin
6/16/2006 5:07:10 AM

denis.mikhalkin@csiro.au wrote:
> I think that AWK, being one of the fastest (if not THE fastest) generic
> text-processing tools on earth, gives an example of how good
> engineering plus careful programming and profiling can produce
> outstanding results.
There are many different implementations and
these implementations differ significantly in
their efficiency.
> I was wondering if I can learn the efficient processing techniques from
> the awk experience. It is possible (but hard) to do it from the source
> code, but I was wondering if there are any documents (articles, blogs,
> group posts) which explain how AWK is designed internally.
>
> Do you know of any such documents?
The only doc I know is Arnold Robbins' GAWK manual.
But the chapter on internals is rather short:
http://www.gnu.org/software/gawk/manual/html_node/Internals.html#Internals
There are two projects trying to re-implement AWK's
execution model in Java. Maybe you can learn more
from their source code documentation:
http://jawk.sourceforge.net/
http://jakarta.apache.org/oro/api/org/apache/oro/text/awk/AwkCompiler.html
Jürgen Kahrs
6/16/2006 6:47:09 PM

Jürgen Kahrs <Juergen.KahrsDELETETHIS@vr-web.de> wrote:
> denis.mikhalkin@csiro.au wrote:
>
>> I think that AWK, being one of the fastest (if not THE fastest) generic
>> text-processing tools on earth, gives an example of how good
>> engineering plus careful programming and profiling can produce
>> outstanding results.
>
> There are many different implementations and
> these implementations differ significantly in
> their efficiency.
What is the fastest version of awk extant?
Thanks,
Dave Feustel
--
Using OpenBSD with or without X & KDE?
See Dave's OpenBSD | X | KDE corner at
http://dfeustel.home.mindspring.com !!!
dfeustel
6/16/2006 8:32:32 PM

dfeustel@mindspring.com wrote:
> Jürgen Kahrs <Juergen.KahrsDELETETHIS@vr-web.de> wrote:
> > denis.mikhalkin@csiro.au wrote:
> >
> >> I think that AWK, being one of the fastest (if not THE fastest) generic
> >> text-processing tools on earth, gives an example of how good
> >> engineering plus careful programming and profiling can produce
> >> outstanding results.
> >
> > There are many different implementations and
> > these implementations differ significantly in
> > their efficiency.
>
> What is the fastest version of awk extant?
>
> Thanks,
> Dave Feustel
> --
> Using OpenBSD with or without X & KDE?
> See Dave's OpenBSD | X | KDE corner at
> http://dfeustel.home.mindspring.com !!!
Besides tawk (which is not free), mawk is the fastest.
William
6/16/2006 8:51:26 PM

William James <w_a_x_man@yahoo.com> wrote:
>
> dfeustel@mindspring.com wrote:
>> Jürgen Kahrs <Juergen.KahrsDELETETHIS@vr-web.de> wrote:
>> > denis.mikhalkin@csiro.au wrote:
>> >
>> >> I think that AWK, being one of the fastest (if not THE fastest) generic
>> >> text-processing tools on earth, gives an example of how good
>> >> engineering plus careful programming and profiling can produce
>> >> outstanding results.
>> >
>> > There are many different implementations and
>> > these implementations differ significantly in
>> > their efficiency.
>>
>> What is the fastest version of awk extant?
>>
>> Thanks,
>> Dave Feustel
>> --
>> Using OpenBSD with or without X & KDE?
>> See Dave's OpenBSD | X | KDE corner at
>> http://dfeustel.home.mindspring.com !!!
>
> Besides tawk (which is not free), mawk is the fastest.
>
Mawk is on my system. Thanks!
--
Using OpenBSD with or without X & KDE?
See Dave's OpenBSD | X | KDE corner at
http://dfeustel.home.mindspring.com !!!
dfeustel
6/16/2006 10:27:15 PM

Jürgen Kahrs wrote:
> denis.mikhalkin@csiro.au wrote:
>
> > I think that AWK, being one of the fastest (if not THE fastest) generic
> > text-processing tools on earth, gives an example of how good
> > engineering plus careful programming and profiling can produce
> > outstanding results.
>
> There are many different implementations and
> these implementations differ significantly in
> their efficiency.
>
> > I was wondering if I can learn the efficient processing techniques from
> > the awk experience. It is possible (but hard) to do it from the source
> > code, but I was wondering if there are any documents (articles, blogs,
> > group posts) which explain how AWK is designed internally.
> >
> > Do you know of any such documents?
>
> The only doc I know is Arnold Robbins' GAWK manual.
> But the chapter on internals is rather short:
>
> http://www.gnu.org/software/gawk/manual/html_node/Internals.html#Internals
>
> There are two projects trying to re-implement AWK's
> execution model in Java. Maybe you can learn more
> from their source code documentation:
>
> http://jawk.sourceforge.net/
> http://jakarta.apache.org/oro/api/org/apache/oro/text/awk/AwkCompiler.html
Alright, thank you for the pointers.
Denis
denis
6/19/2006 10:00:27 AM