awk 'BEGIN { print "hello, word" }'
I am pleased to announce the availability of a test version of gawk.
This version uses a byte-code execution engine, and most importantly,
it includes a debugger that works at the level of awk statements! The
distribution is available at
http://www.skeeve.com/gawk/gawk-3.1.7-bc-d.tar.gz
This version is the same as 3.1.7, but with a new execution engine and
a debugging version of gawk named, rather imaginatively, "dgawk".
There is a story here. Circa 2003, a gentleman by the name of John Haque
developed the byte-code execution engine and debugger, in the context
of a development gawk version, somewhere between 3.1.3 and 3.1.4.
I never integrated the changes as they were massive and I was busy,
and I wasn't able to review them.
The changes languished, and John disappeared.
Last fall, Stephen Davies, one of my portability team members, agreed to
take on the task of bringing the code into the present. With modest help
from me, he succeeded. We then went through additional work to get this
version portable to some of the more esoteric systems that gawk supports
(64-bit Linux, z/OS and VMS).
I thought it was ready for release at the end of December, until another
one of my testers found a severe memory leak in the byte code version.
It was a bear to track down, and once again Stephen came through.
The debugger uses the readline library, and it is purposely similar
to GDB. There is only minimal documentation on the debugger; I'd love
to have someone volunteer to write a chapter for the gawk manual that
explains it fully.
In addition, this version supports a file-inclusion mechanism, but it
is quite likely to change once I review the awk.info poll results and
make a final decision (within a week, I hope!).
Per the gawk roadmap I announced a while back, the plan is as follows:
1. Make a 3.1.8 release based on the original execution engine so that
OS distributions have a stable version to ship.
2. Merge the 3.1.8 changes into the 'gawk-devel' tree.
3. Merge the byte-code changes into 3.1.8 to give us an up-to-date
set of byte-code changes.
4. Merge the byte-code changes into the devel tree. Development will
then continue in the devel tree towards a 4.0 release.
PLEASE try this out on your programs. Please also play with the
debugger.
If you think this is a major step forward, send Stephen a note thanking
him - his email address is in the distribution. (I am leaving it out of
the news article so that he won't get spam.)
If you are interested in contributing, please contact me directly, not
by posting in the news group.
Thanks!
Arnold
--
Aharon (Arnold) Robbins arnold AT skeeve DOT com
P.O. Box 354 Home Phone: +972 8 979-0381
Nof Ayalon Cell Phone: +972 50 729-7545
D.N. Shimshon 99785 ISRAEL
Posted by arnold, 2/4/2010 6:45:16 AM

On Feb 4, 1:45 am, arn...@skeeve.com (Aharon Robbins) wrote:
>         awk 'BEGIN { print "hello, word" }'
>
> I am pleased to announce the availability of a test version of gawk.
> This version uses a byte-code execution engine, and most importantly,
> it includes a debugger that works at the level of awk statements!
great news. very exciting!
any comments on the performance hit/improvement of running as byte
codes?
t
Posted by Tim, 2/4/2010 1:32:58 PM

Aharon Robbins <arnold@skeeve.com> wrote:
: [...]
: PLEASE try this out on your programs. Please also play with the
: debugger.
: [...]
Hmmm...
so I built gawk-3.1.7-bc-d on FreeBSD6.0/i386, and after two abortive
tries, both of which had me quite close to sending error reports (would
it hurt to make --enable-switch the default by now?), it seems to work.
Nicely done.
Since the code I tested this on with some frustration turned out to be
fine in the end, I thought I'd share:
-----isbn.awk
# isbn.awk -- ISBN checksum and conversion functions
#
# Sebastian F. Mix, charon@furry.de, Public Domain
# November 2009
BEGIN{
    isbn_errbase=10
}
#strip unnecessary syntactic sugar and whitespace before computation
function isbn_sanitize(s){
    gsub("-","",s);gsub(" ","",s);gsub("\t","",s);gsub("x","X",s)
    return s
}
#compute ISBN-10 checksum
#/^[0-9]{9}/ --> /^[0-9X]$/
function isbn_chksum10(s, i, cs, wgt){
    wgt=10
    for(i=1;i<10;i++)
        cs+=strtonum(substr(s,i,1))*wgt--
    cs=11-cs%11
    return (cs==10)?"X":((cs==11)?"0":cs "")
}
#compute ISBN-13 checksum
#/^[0-9]{12}/ --> /^[0-9]$/
function isbn_chksum13(s, i, cs){
    for(i=1;i<13;i++)
        cs+=strtonum(substr(s,i,1))*(i%2?1:3)
    cs=10-cs%10
    #an ISBN-13 check digit is always 0-9; a raw value of 10 means 0
    return (cs==10)?"0":cs ""
}
#convert an ISBN13 with 978 prefix to an ISBN10
#preserve original formatting (via "-")
function isbn_13to10(s, origs){
    if((s=isbn_sanitize(origs=s)) !~ /^[0-9]+[0-9X]$/){
        print "Junk characters in ISBN." > "/dev/stderr"
        exit(isbn_errbase+0)
    }
    if(length(s)!=13){
        print "The ISBN doesn't contain 13 characters." > "/dev/stderr"
        exit(isbn_errbase+1)
    }
    if(s !~ /^978/){
        print "Only a 978-prefix ISBN-13 can be converted to ISBN10." > "/dev/stderr"
        exit(isbn_errbase+2)
    }
    if(!isbn_check(s)){
        print "Invalid checksum." > "/dev/stderr"
        exit(isbn_errbase+3)
    }
    return substr(origs,4+(origs ~ /^978-/),length(origs)-4-(origs ~ /-.$/)) ((origs ~ /-.$/)?"":"-") isbn_chksum10(substr(s,4,9))
}
#convert an ISBN10 to a 978-prefix ISBN13
#preserve original formatting (via "-")
function isbn_10to13(s, origs){
    if((s=isbn_sanitize(origs=s)) !~ /^[0-9]+[0-9X]$/){
        print "Junk characters in ISBN." > "/dev/stderr"
        exit(isbn_errbase+0)
    }
    if(length(s)!=10){
        print "The ISBN doesn't contain 10 characters." > "/dev/stderr"
        exit(isbn_errbase+1)
    }
    if(!isbn_check(s)){
        print "Invalid checksum." > "/dev/stderr"
        exit(isbn_errbase+3)
    }
    return "978-" substr(origs,1,length(origs)-1) ((origs ~ /-.$/)?"":"-") isbn_chksum13("978" substr(s,1,9))
}
#what kind of ISBN is this?
#/^[0-9Xx -]{9,?}/ --> 10,13,0(error)
function isbn_id(s){
    if((s=isbn_sanitize(s)) !~ /^[0-9]+[0-9X]$/)
        return 0
    switch(length(s)){
        case 10: return 10
        case 13: return 13
        default: return 0
    }
}
#check an ISBN(10 or 13) for validity. whitespace and "-" get ignored.
#/^[0-9Xx -]{13,?}/ --> boolean (1 if valid, 0 otherwise)
function isbn_check(s){
    if((s=isbn_sanitize(s)) !~ /^[0-9]+[0-9X]$/)
        return 0
    switch(length(s)){
        case 10: return isbn_chksum10(s)==substr(s,10,1)
        case 13: return isbn_chksum13(s)==substr(s,13,1)
        default: return 0
    }
}
-----isbn.awk
and
-----isbnconv.awk
#!/usr/home/charon/usr/local/bin/gawk-3.1.7-bc-d -f
@sourcefile "isbn.awk"
BEGIN{
    print "Just enter either isbn10 or isbn13, as many as you want to,"
    print "one per line. End on EOF, end, exit or quit."
}
/^#/{next}
/^[QqEe][UuNnXx][IiDd]/{exit(0)}
isbn_id($0)==10{print isbn_10to13($0);next}
isbn_id($0)==13{print isbn_13to10($0);next}
{print "Not a valid ISBN." > "/dev/stderr"}
-----isbnconv.awk
Example:
-----
charon@achernar:~/src/awklib> ./isbnconv.awk
Just enter either isbn10 or isbn13, as many as you want to,
one per line. End on EOF, end, exit or quit.
123456789x
978-123456789-7
978-1-59582-204-8
1-59582-204-6
quit
-----
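The check digits in the session above can be recomputed by hand. What follows is only a minimal sketch, not part of isbn.awk: the shell function names isbn10_digit and isbn13_digit are made up for illustration, and the awk inside sticks to portable features (no strtonum, no switch) so it should run on any awk, not just the test gawk. It assumes the argument is already stripped of "-" and blanks.

```sh
# Hypothetical helpers (not from isbn.awk); portable awk only.

# Print the ISBN-10 check digit for the first 9 digits of $1.
isbn10_digit() {
    awk -v s="$1" 'BEGIN {
        # weights run 10, 9, ..., 2 over the nine digits
        for (i = 1; i <= 9; i++)
            sum += substr(s, i, 1) * (11 - i)
        d = (11 - sum % 11) % 11        # d is in 0..10
        print (d == 10 ? "X" : d)       # 10 is written as X
    }'
}

# Print the ISBN-13 check digit for the first 12 digits of $1.
isbn13_digit() {
    awk -v s="$1" 'BEGIN {
        # weights alternate 1, 3, 1, 3, ... over the twelve digits
        for (i = 1; i <= 12; i++)
            sum += substr(s, i, 1) * (i % 2 ? 1 : 3)
        d = (10 - sum % 10) % 10        # d is in 0..9, there is no X here
        print d
    }'
}

isbn10_digit 123456789       # prints X
isbn13_digit 978123456789    # prints 7
isbn10_digit 159582204       # prints 6
```

The first two calls correspond to the 123456789x line of the session and the last one to 1-59582-204-6. The trailing modulo in each function folds the "raw result equals the modulus" case back to 0.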
HTH
- --------------------------------chelImQo'----------------------------------- -
Sebastian F. Mix, Irenenstrasse 21a, D-10317 Berlin, Tel: ++4930 521 1034 /(a\
charon@cs.tu-berlin.de <-no NeXTmail GCode3.12 GCS/S d?- s+:- a E--- C+(+) \p)/
USX+ P- L- W++ N+++ w--- M- !V PS+++ Y+ PGP+ 5+ X++ R-- b++(+) e+ h+ r-- y*
Posted by charon, 2/4/2010 1:51:07 PM

In article <3d0b51b1-6c7e-4128-8817-48d7bdc198dc@m4g2000vbn.googlegroups.com>,
Tim Menzies <menzies.tim@gmail.com> wrote:
>On Feb 4, 1:45 am, arn...@skeeve.com (Aharon Robbins) wrote:
>>         awk 'BEGIN { print "hello, word" }'
>>
>> I am pleased to announce the availability of a test version of gawk.
>> This version uses a byte-code execution engine, and most importantly,
>> it includes a debugger that works at the level of awk statements!
>
>great news. very exciting!
>
>any comments on the performance hit/improvement of running as byte
>codes?
No comments - some things seem faster, some slower. It seems to be a wash
overall, performance wise. But the gain of a debugger makes it worthwhile.
Posted by arnold, 2/4/2010 3:56:03 PM

In article <7t01mbFlfaU1@mid.dfncis.de>, <charon@cs.tu-berlin.de> wrote:
>(would it hurt to make --enable-switch the default by now?)
It will be the default for the next major version.
But it should work in the experimental version if you enable it.
Thanks,
Arnold
Posted by arnold, 2/4/2010 3:57:18 PM

In article <hkeqmi$ahl$1@news.bytemine.net>,
Aharon Robbins <arnold@skeeve.com> wrote:
....
>>great news. very exciting!
>>
>>any comments on the performance hit/improvement of running as byte
>>codes?
>
>No comments - some things seem faster, some slower. It seems to be a wash
>overall, performance wise. But the gain of a debugger makes it worthwhile.
For the benefit of those of us in the cheap seats, could you say a bit
about what this "byte code interpreter" is? I guess most of us just
assumed it was a speedup (more efficient way of doing things), but this
seems not to be the case. What you say above implies that the BCI is a
necessary change in order for the debugger concept to work. Is that
correct?
Posted by gazelle, 2/4/2010 4:08:45 PM

In article <hkered$87o$1@news.xmission.com>,
Kenny McCormack <gazelle@shell.xmission.com> wrote:
>In article <hkeqmi$ahl$1@news.bytemine.net>,
>Aharon Robbins <arnold@skeeve.com> wrote:
>...
>>>great news. very exciting!
>>>
>>>any comments on the performance hit/improvement of running as byte
>>>codes?
>>
>>No comments - some things seem faster, some slower. It seems to be a wash
>>overall, performance wise. But the gain of a debugger makes it worthwhile.
>
>For the benefit of those of us in the cheap seats, could you say a bit
>about what this "byte code interpreter" is? I guess most of us just
>assumed it was a speedup (more efficient way of doing things), but this
>seems not to be the case. What you say above implies that the BCI is a
>necessary change in order for the debugger concept to work. Is that
>correct?
Very good questions.
Production gawk builds a parse tree at parse time and then recursively
evaluates it to execute the program. This is fine, but there are lots
of function calls.
The byte code version generates a linked list of "instructions" that
is iterated over in a loop for execution. Byte code interpreters in
general can be faster than recursive evaluators when done really well;
this is the general case for mawk. (Not surprising, either; Mike Brennan
is a brilliant programmer.)
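To make the difference concrete, here is a toy sketch of the two execution styles, written in portable awk and wrapped in a shell function. It is an illustration only and makes no claim about gawk's real data structures or instruction set: the node layout (op, lhs, rhs, val), the instruction names (push, add, mul) and the function names are all invented for the example. Both halves evaluate the expression (2 + 3) * 4.

```sh
# Toy illustration only; every name here is hypothetical, none is gawk's.
eval_both_ways() {
    awk 'BEGIN {
        # Parse tree for (2 + 3) * 4, stored in parallel arrays.
        op[1] = "*";   lhs[1] = 2; rhs[1] = 3    # node 1 = node 2 * node 3
        op[2] = "+";   lhs[2] = 4; rhs[2] = 5    # node 2 = node 4 + node 5
        op[3] = "num"; val[3] = 4
        op[4] = "num"; val[4] = 2
        op[5] = "num"; val[5] = 3

        # The same expression as a flat list of stack instructions.
        n = split("push 2,push 3,add,push 4,mul", prog, ",")

        print walk(1), run(prog, n)
    }

    # Recursive evaluation: one function call per tree node.
    function walk(node) {
        if (op[node] == "num") return val[node]
        if (op[node] == "+")   return walk(lhs[node]) + walk(rhs[node])
        if (op[node] == "*")   return walk(lhs[node]) * walk(rhs[node])
    }

    # Linear evaluation: a single loop over the instructions plus a stack.
    function run(prog, n,    pc, sp, stack, f) {
        for (pc = 1; pc <= n; pc++) {
            split(prog[pc], f, " ")
            if (f[1] == "push")     stack[++sp] = f[2] + 0
            else if (f[1] == "add") { sp--; stack[sp] = stack[sp] + stack[sp + 1] }
            else if (f[1] == "mul") { sp--; stack[sp] = stack[sp] * stack[sp + 1] }
        }
        return stack[sp]
    }'
}

eval_both_ways    # prints 20 20
```

The tree walker makes one call per node (five here), while the loop version makes none beyond the single call to run, which is where the "lots of function calls" cost goes away. A flat list also gives a natural place to stop between instructions, though as noted that is a convenience and not a requirement for a debugger.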
The person who did the byte code engine at the same time integrated the
hooks and wrote a parser and executor for an awk-level debugger.
There is no theoretical requirement to have a byte code engine in order
to provide a debugger, it is just that all the code came together as
a package.
We have not investigated performance in the byte code version *at all*.
All our efforts up to this point have been focused on getting the code
integrated into the current code base such that it passes the test suite
and ports to all the systems regular gawk ports to.
Two significant bugs have been reported to me in the past 24 hours where
gawk-bc differs from regular gawk. I'm working on them.
Thanks,
Arnold
Posted by arnold, 2/5/2010 9:20:37 AM