f



Writing a compiler compiler

Hello,

Firstly, let it be known that I am quite new to the subject of parser
generators, and having written only simple parsers for frivoulous
grammars by hand have little experience.

My objective is thus - to create a parser generator that will
eventually generate a parser for a dynamic language. However, my
undestanding of different parser types and distinctions between is
minimal; but as the web is a good enough resourse please assume my
knowledge of those areas.

The target for this parser generator would be a language similar to
C#, and I have found Jay (a yacc clone to C# and Java). The initial
implementation language is to be C++.

So here goes my question - how do you start writing a parser generator?
I am probably considering a LALR parser, but what are the main
differences between the different types? Specifically, why is an LR
parser unable to parse Python or C++? [source: Wikipedia]

Thank you for your time,
Vladimir Lushnikov
[These topics are all covered in compiler textbooks, many of which are
listed in the FAQ.  Assuming by "dyamic language" you mean that the
syntax can change on the fly, you should learn about Early and Tomita
parsers, and pay particular attention to the reasons that even though
they're technically perfectly sound, few people use them.  As to why
an LR parser can't parse Python or C++, LR parsers only handle a
subset of BNF.  In particular they can't handle the ambiguity of C++
declarations.  -John
0
4/17/2006 5:36:56 AM
comp.compilers 3310 articles. 1 followers. Post Follow

3 Replies
1486 Views

Similar Articles

[PageSpeed] 34

 John Levine wrote:

> As to why an LR parser can't parse Python or C++, LR parsers only
> handle a subset of BNF.  In particular they can't handle the
> ambiguity of C++ declarations.

There appears to be techniques around this for C++, getting even down to
LALR(1):
  http://www.parashift.com/c++-faq-lite/compiler-dependencies.html#faq-38.11

--
  Hans Aberg

0
haberg
4/17/2006 2:49:23 PM
> So here goes my question - how do you start writing a parser generator?
> I am probably considering a LALR parser, but what are the main
> differences between the different types? Specifically, why is an LR
> parser unable to parse Python or C++? [source: Wikipedia]

I think you start writing a parser generator by writing a parser.
Once you have the parsing details down, writing the traditional
yacc interface on top is pretty simple.  The hard part is getting
the parsing engine running in the first place.

Even if you need something more powerful than LALR, I'd start
with LR(0) and SLR just to cement your understanding of
whta's going on.  I recently implemented a GLR parser.
It was only a reasonable amount of work because I first
made sure I understood LR(0) and then SLR and LR(1) well.

Although few people do use Earley and Tomita parsers in
practice now, I think general approaches, especially GLR,
are gaining ground.

Some references I have found useful:

John Aycock and Nigel Horspool,
Directly-Executable Earley Parsing, CC 2001

John Aycock and Nigel Horspool,
Faster Generalized LR Parsing, CC 1999.

Bryan Ford,
Parsing Expression Grammars: A Recognition-Based Syntactic Foundation,
POPL 2004

Scott McPeak and George C. Necula,
Elkhound: a Fast, Practical GLR Parser Generator,
CC 2004

Scott McPeak,
Elkhound: a Fast, Practical GLR Parser Generator,
UC Berkeley Tech Report UCB/CSD-2-1214, December 2002
(a longer more detailed version of the CC paper)

Russ
0
Russ
4/17/2006 2:49:40 PM
Vladamir Lushnikov said:

> My objective is thus - to create a parser generator that will
> eventually generate a parser for a dynamic language. However, my
> undestanding of different parser types and distinctions between is
> minimal; but as the web is a good enough resourse please assume my
> knowledge of those areas.

> The target for this parser generator would be a language similar
> to C#,...

and our moderator noted:

> [These topics are all covered in compiler textbooks, many of which are
> listed in the FAQ.  Assuming by "dyamic language" you mean that the
> syntax can change on the fly, you should learn about Early and Tomita
> parsers, and pay particular attention to the reasons that even though
> they're technically perfectly sound, few people use them.  As to why
> an LR parser can't parse Python or C++, LR parsers only handle a
> subset of BNF.  In particular they can't handle the ambiguity of C++
> declarations.]  -John

Chapter 9 (section 9.3.3) of Adapting to Babel (see sig line below for link)
covers the parsing of Perl and C++ using adaptive grammars, and deals with
C++ declarations, including the one case in C++ where declarations can
follow the first use of a call (that is, member functions called by member
functions in inline member functions in a class declaration). The C++
grammar used in the experiments has some 292+ productions, but it handles
quite a lot of C++'s constructions, and does so without any disambiguating
code. In a sense, a language like C++ might be said to be "mildly
dynamic" -- although this might be debated. (For instance, i++ is
syntactically legal, but semantically so only if the type of i has the ++
operator defined.)

Anyone interested in dynamic languages might, therefore, consider ATB as a
newly available look at such parsers.

Part of what makes writing grammars for such languages more manageable is
the fact that they can be developed in an IDE, much like programming
languages, and "debugged" in minute detail (and profiled) against test
cases. To date, the engine in its current incarnation has shown no
real-world language parses in O(n^3) or greater time, and in many cases, on
real-world languages, expensive productions are mitigated by their
locality -- that is, the most expensive productions tend to fire only
rarely, and thus not slow down the whole. (Of course, in theory-space, such
productions would predominate the big-O, but in practice-space, there is no
such thing as an infinitely long input.)

Perl in particular has some constructions that make adaptive grammars very
useful. The C# grammar put together in this formalism didn't require a
single adaptive production (and Vladamir mentions his goal is a language
similar to C#), so it is likely that any adaptive productions would be
minimal, and thus mitigated.

Cheers.

--
Quinn Tyler Jackson FRSA MBCS
http://members.shaw.ca/qtj/

Author of Adapting to Babel:
http://members.shaw.ca/qtj/writing/cs/AdaptingToBabel.html
0
Quinn
4/18/2006 3:42:57 AM
Reply:

Similar Artilces:

Compile or not compile?
There's something that sometimes in while it bores me a little bit. How do I know if I need to compile a function/procedure or it will be able to compile "on fly". At the beginning I thought this was related to the paths idl "knew", but this happens for two functions/routines in the same directory. Can someone tell when a function can be runned without be compile? Nuno Oliveira wrote: > There's something that sometimes in while it bores me a little bit. How > do I know if I need to compile a function/procedure or it will be able > to compile &quo...

To compile or not to compile
Is there some benefit in compiling the apache source yourself? -- //Points ------------------------------------------------------------ http://underthebed.homeip.net ...

Compiled or not compiled
I need to distinguish between two situations: one when a Matlab function is run from Matlab command window and the other one when it is run in compiled mode. Any ideas how to do this? Thanks Tomy Duby "Tomy Duby" <tomy.duby@agilent.com> wrote in message news:idq8q8$85k$1@fred.mathworks.com... > I need to distinguish between two situations: one when a Matlab function > is run from Matlab command window and the other one when it is run in > compiled mode. > Any ideas how to do this? HELP ISDEPLOYED. -- Steve Lord slord@mathworks.com comp.s...

Compiling Compiler
Hi there, I need some advice/opinion of the experts out there regarding compilers. I have sun cc compiler that supports 64 bit environment. Now, I compile ACK using this compiler.This should give a compiler that supports 64bit environment. Since the output files are in Solaris format, I compile the ACK source *once again* using the just compiled ACK compiler so that I get Minix object and binary files. Now my questions is that does this 2nd version of ACK be able to produce true 64bit code? Sanky wrote: > Hi there, > > I need some advice/opinion of the experts out there regarding...

To compile or not to compile
Is there some benefit in compiling the apache source yourself? -- //Points ------------------------------------------------------------ http://underthebed.homeip.net ...

which compiler compiled?
Is there a way I can tell which compiler compiled an executable? I looked at the file with a hex editor and didn't see anything obvious. Lawrence "Lawrence" <just4me@nowhere.com> writes: > Is there a way I can tell which compiler compiled an executable? Yes with some compilers and some executable formats. E.g. on HP-UX for PA-RISC: $ aCC leak.C $ odump -compunit a.out ... 6 0 ANSI C++ leak.C /tmp ctcom options = -inst compiletime -diags 523 -inline_power 1 -longbranch 2 -unique_strings on -cachesize 256 B...

compiling Qt .... compiling Qt ....compiling Qt.... compiling Qt
I got myself new kernel from kernel.org... make xconfig needs Qt (new system). So I got new Qt source this afternoon. It is still compiling.... Who said MS is bloat has not done this. What does Qt do? Dunno, how about xforms for xconfig? On Thu, 02 Feb 2006 19:18:09 +0000, Jan Panteltje wrote: > Dunno, how about xforms for xconfig? Why not curses based? menuform -- Hilsen/Regards Michael Rasmussen http://keyserver.veridis.com:11371/pks/lookup?op=get&search=0xE3E80917 On a sunny day (Thu, 02 Feb 2006 21:55:33 +0100) it happened Michael Rasmussen <mir@miras.or...

program that compiles in C compiler but not in C++ compiler
Hi, I need a small program that compiles in C compiler but not in C++ compiler. Thx in advans, Karthik Balaguru KBG <karthik.balaguru@lntinfotech.com> wrote: > I need a small program that compiles in C compiler but not in C++ > compiler. No problem, just send $10 to paypal@zevv.nl and I'll do your homework for you. -- :wq ^X^Cy^K^X^C^C^C^C KBG said: > Hi, > > I need a small program that compiles in C compiler but not in C++ > compiler. Can you think of any syntactic differences between C and C++? For example, what about keywords? They are very, very sen...

Compiling with the eclipse compiler
Hey all,I have a weird problem (at least it's weird to me).I have a big project that I need to compile through ant with eclipse'scompiler.In the build.xml my compile target looks like this : <target name="compile" depends="release-settings, init" description="compiles everything from source"> <echo message="Compiling with debug = ${build.debug}"/> <depend srcdir="${src.dir}" destdir="${class.dir}" cache="${dependencies.dir}" closure="true"/> <javac srcdir=...

Meta4... a compiler-compiler
in the csa2 gmail account... description is very interesting. in the Pidgin for Apple II/Merlin/6502 message Rich ...

compiling on different compilers
I am working on a set of functions that involve dynamic memory allocation. I started working with gcc and then moved to Pacific C. The reason was that my only access to gcc was through an ssh connection that I did not have access to all the time, so I got Pacific so I could work on my home windows computer. Then I moved back to gcc, which saved me from some major mistakes. Somehow Pacific C was letting me get away with stuff that was completely wrong. gcc immediately gave me segmentation errors until I got every last thing correct. Is gcc the best for catching mistakes? If not, which on...

which compiler was used to compile?
Is there a slick way to invent a constant in a program that tells you which compiler you used. I distribute both JET and JavaC versions of my code, and it helps to know on any error dump which version they were using. Right now I manually change a static final just before compiling. I thought there ought to be something less error prone. -- Canadian Mind Products, Roedy Green. Coaching, problem solving, economical contract programming. See http://mindprod.com/jgloss/jgloss.html for The Java Glossary. Roedy Green wrote: > Is there a slick way to invent a constant in a program that tell...

Problem to compile with g77 a program which was normally compiled with Absoft compiler
Hello, I get the following error messages when I try to compile a program with g77 somefile.f:(.text+0x93): undefined reference to `for_open' somefile.f:(.text+0xf4): undefined reference to `for_write_seq_fmt' somefile.f:(.text+0x128): undefined reference to `for_write_seq_fmt_xmit' somefile.f:(.text+0x454): undefined reference to `for_read_seq' Could you help me to fix this? I have posted more details about my actions on http://stackoverflow.com/questions/3365742/f77-problem-to-compile-with-g77-a-program-which-was-normally-compiled-with-absof Thank you On 30/07/2010 10:2...

Compiler mystery: Compiles classes that don't compile
Can someone please try and explain how this could even happen. I'm about to start fixing some bugs in a program for a client. They haven't really been that organised and just gave me some source code from somewhere, which may or may not be the version of the source code that was used to compile the version of the app that I am supposed to fix bugs in. Now to the problem. The code doesn't even compile. There's a file Container.java, which is missing heaps of methods (private ones I guess as they are only called from inside Container). I decompiled the actual application and whe...

Web resources about - Writing a compiler compiler - comp.compilers

Compiler - Wikipedia, the free encyclopedia
... , or external linking . The most common reason for wanting to transform source code is to create an executable program. The name "compiler" ...

Compiler - Wikipedia, the free encyclopedia
"Compile" and "compiling" redirect here. For the software company, see Compile (publisher) . For other uses, see Compilation . This article has ...

Facebook Open-Sources HipHop PHP Compiler Software
Earlier this morning, Facebook officially made their new PHP “compiler,” called HipHop, available as open source software. In the blog post by ...

Art in the Age of Matter Compilers
jurvetson posted a photo: Sheba may be the harbinger of art in the digital age — a mathematical sculptor with digital matter. She manipulates ...

Interpreters and Compilers (Bits and Bytes, Episode 6) - YouTube
This animation explains the difference between interpreters and compilers. It is from Episode 6 of the classic 1983 television series, Bits and ...

Typesafe cofounder forking Scala compiler
The main contributor to the Scala compiler, Paul Phillips, has announced on GitHub that he is forking the compiler to “fix some of the innumerable ...

Does Apple's new developer agreement ban Adobe's Flash-to-iPhone compiler?
Given that any kind of formal truce between Apple and Adobe was essentially blown out of the water by Steve Job's very public slating of Flash ...

Apple seeds devs with Safari 5.2 for Lion, Xcode 4.4 with new LLVM compiler
... to the general public this summer. Among the new features: According to Apple, Xcode 4.4 includes an editor for Collada 3D files, compiler support ...

NVIDIA and Continuum Analytics Announce NumbaPro, A Python CUDA Compiler
... are announcing that they are bringing Python support to CUDA. Specifically, Continuum Analytics’ will be introducing a new Python CUDA compiler, ...

IntelliJ Releases IDEA 12, Brings Improved UI, New Compiler Mode, Android UI Designer, And More
I'm not going to pretend to be a developer here, and I'll openly admit that the bulk of what IDEA 12 does is over my head. However, I do understand ...

Resources last updated: 1/24/2016 8:08:35 AM