problems@gmail schrieb:
> I've just looked at awk's docos for the first time and it seems
> suitable for implememting an augumented-finite-state-machine.
Right.
> At each/most node there's an action to be performed.
> Some nodes are not terminals/leaves and just lead to
> another nested syntax-diagram via push & return via pop.
What you describe is a recursive-descent compiler.
Terminals are consumed by the scanner.
Non-terminals are handled by a dedicated function.
> I've built several p-code compilers like this.
Me too, with AWK.
> From what I see from awks docos, it's very suitable for
> doing the [token,action] task, as the source-code is
> scanned into tokens and passed to awk.
Right, but remember that the traditional distinction
between a low-level "scanner" (with regular expressions)
and a high-level "parser" (doing recursive-descent parsing)
should be made.
> As I remember, at the branch-points a one token look ahead
> only was needed for the pascal-like syntax.
Yes, it was called the LL(1). Wirth always designed
his languages so that they can be parsed with a
lookahead of one symbol.
> Can awk handle this ?
>
> How would it handle the branching, eg. the sub-diagram for
> statement would be like:
> Case TOKEN
> 'IF': next state := IF-node;
> 'WHILE: next state := WHILE-node;
> ...
I have implemented a compiler for PL/0 and I
did it this way:
http://en.wikipedia.org/wiki/PL/0
function Statement( m, n) {
if (EatSym("ident", -1)) {
n = FindIdent(tokenprev, "variable")
EatSym(":=", 14)
Expression()
Encode(0xb3, n, 2)
} else if (EatSym("CALL", -1)) {
EatSym("ident", 12)
Encode(0xb8, FindIdent(tokenprev, "procedure"), 2)
} else if (EatSym("BEGIN", -1)) {
do { Statement() } while (EatSym(";", -1))
EatSym("END", 9)
} else if (EatSym("IF", -1)) {
Condition()
EatSym("THEN", 10)
n = pc
Encode(0xa0, 0, 2)
Statement()
PatchAddr(n, pc)
} else if (EatSym("WHILE", -1)) {
n = pc
Condition()
EatSym("DO", 1)
m = pc
Encode(0xa0, 0, 2)
Statement()
Encode(0xa7, n-pc, 2) # goto
PatchAddr(m, pc)
}
}
|