From start on antlr grammar for E:

Here is a still incomplete but reasonable initial cut at the E grammar, redone for Antlr (I should probably change the production order for clarity). There is a grammar (and lexer) for E, and a grammar and lexer for the nested quasiparser language. The real thing should have a similar nested grammar for doc comments (which is the canonical example of nested grammars for Antlr). Mostly you want to look at e.g and quasi.g. Many tree-building actions are filled in, though some are still wrong or missing. The lexers are far from complete, since they were not the primary point. They should be straightforward to enhance to cover E.

The grammar is built (using Antlr) with something like:

    C:\tools\jdk1.4\bin\javaw.exe -classpath rt.jar;antlr.jar antlr.Tool e.g
    C:\tools\jdk1.4\bin\javaw.exe -classpath rt.jar;antlr.jar antlr.Tool quasi.g

eMain.java takes my sequence of examples from example.e:

    def foo {}
    def bar { to baz(a :int) {^a} }
    def func(a,b) {a | b | false}
    !3
    def z
    def x := 5; y
    3>4;;
    { 67
      x *= 6 }
    [3,4,]
    if(33){43};
    a := 4**5 * 3;
    a.fry(4, b*4);
    x(45,"hello");
    a[4]:=5;
    33
    33+43+4 ;
    `identest$id`
    `etest${345}a`
    `hello`
    e`go`

and produces:

    C:\tools\jdk1.4\bin\javaw.exe -classpath rt.jar;antlr.jar eMain
    [,<SeqExpr>]
    [def,<ObjectExpr>] [,<FinalPattern>] [foo,<IDENT>] [{,<EScript>]
    [def,<ObjectExpr>] [,<FinalPattern>] [bar,<IDENT>] [{,<EScript>] [to,<EMethod>] [baz,<IDENT>] [,<List>] [,<FinalPattern>] [a,<IDENT>] [int,<IDENT>] [^,<ReturnExpr>] [a,<IDENT>]
    [def,<ObjectExpr>] [,<FinalPattern>] [func,<IDENT>] [,<List>] [,<FinalPattern>] [a,<IDENT>] [,<FinalPattern>] [b,<IDENT>] [,<CallExpr>] [,<CallExpr>] [a,<IDENT>] | [b,<IDENT>] | [false,<IDENT>]
    [,<CallExpr>] [3,<INT>] !
    [def,<DefineExpr>] [z,<IDENT>]
    [def,<DefineExpr>] [,<FinalPattern>] [x,<IDENT>] [5,<INT>]
    [y,<IDENT>]
    [,<CallExpr>] [3,<INT>] > [4,<INT>]
    [,<HideExpr>] [67,<INT>] [,<AssignExpr>] [x,<IDENT>] *= [6,<INT>]
    [[,<TupleExpr>] [3,<INT>] [4,<INT>]
    [if,<IfExpr>] [33,<INT>] [43,<INT>]
    [:=,<AssignExpr>] [a,<IDENT>] [,<CallExpr>] [,<CallExpr>] [4,<INT>] ** [5,<INT>] * [3,<INT>]
    [.,<CallExpr>] [a,<IDENT>] [fry,<IDENT>] [4,<INT>] [,<CallExpr>] [b,<IDENT>] * [4,<INT>]
    [run,<CallExpr>] [x,<IDENT>] [run,<STRING>] [45,<INT>] ["hello",<STRING>]
    [:=,<AssignExpr>] [get,<CallExpr>] [a,<IDENT>] [get,<STRING>] [4,<INT>] [5,<INT>]
    [33,<INT>]
    [,<CallExpr>] [,<CallExpr>] [33,<INT>] + [43,<INT>] + [4,<INT>]
    [simple,<QuasiLiteralExpr>] [simple,<STRING>] [,<QuasiContent>] [identest,<QUASIBODY>] [id,<QIDENT>]
    [simple,<QuasiLiteralExpr>] [simple,<STRING>] [,<QuasiContent>] [etest,<QUASIBODY>] [345,<INT>] [a,<QUASIBODY>]
    [simple,<QuasiLiteralExpr>] [simple,<STRING>] [,<QuasiContent>] [hello,<QUASIBODY>]
    [,<QuasiLiteralExpr>] [e,<IDENT>] [,<QuasiContent>] [go,<QUASIBODY>]

The one important Antlr syntax I forgot to put in comments is "(a b) => x y", which looks ahead and, if it finds a followed by b, commits the parser to the alternative x y. This allows disambiguation in a few key productions.
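To make the "(a b) => x y" lookahead concrete, here is a minimal hand-written Java sketch of the idea: peek at the upcoming tokens without consuming them, and only then commit to an alternative. It is not the code Antlr generates and not taken from e.g. The class name PredicateSketch, the pre-split string tokens, and the choice of predicate (def, an identifier, then "{" selects the object form, anything else the define form) are all assumptions made for illustration; the real grammar's predicates may look quite different.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of a syntactic predicate; not generated by Antlr.
final class PredicateSketch {
    private final List<String> tokens;
    private int pos = 0;

    PredicateSketch(List<String> tokens) { this.tokens = tokens; }

    // True if the upcoming tokens match 'expected'; consumes nothing.
    // The pseudo-token "IDENT" matches any identifier-like token.
    private boolean lookahead(String... expected) {
        for (int i = 0; i < expected.length; i++) {
            int at = pos + i;
            if (at >= tokens.size()) return false;
            String tok = tokens.get(at);
            boolean ok = expected[i].equals("IDENT") ? isIdent(tok) : expected[i].equals(tok);
            if (!ok) return false;
        }
        return true;
    }

    private static boolean isIdent(String tok) {
        return !tok.isEmpty()
            && Character.isJavaIdentifierStart(tok.charAt(0))
            && !tok.equals("def");
    }

    private String next() { return tokens.get(pos++); }

    // Roughly: ("def" IDENT "{") => objectExpr | defineExpr
    String parseDef() {
        if (lookahead("def", "IDENT", "{")) {
            return objectExpr();   // predicate matched: commit to this alternative
        }
        return defineExpr();       // otherwise fall through to the other one
    }

    // Assumes well-formed input with an unnested body.
    private String objectExpr() {
        next();                    // "def"
        String name = next();      // object name
        next();                    // "{"
        while (!tokens.get(pos).equals("}")) pos++;  // skip the body
        next();                    // "}"
        return "ObjectExpr(" + name + ")";
    }

    private String defineExpr() {
        next();                    // "def"
        String name = next();      // variable name
        if (pos < tokens.size() && tokens.get(pos).equals(":=")) {
            next();                // ":="
            next();                // value (a single token in this sketch)
        }
        return "DefineExpr(" + name + ")";
    }

    static String classify(String... toks) {
        return new PredicateSketch(Arrays.asList(toks)).parseDef();
    }

    public static void main(String[] args) {
        System.out.println(classify("def", "foo", "{", "}"));
        System.out.println(classify("def", "z"));
        System.out.println(classify("def", "x", ":=", "5"));
    }
}
```

The point of the sketch is that lookahead() leaves pos untouched, so whichever alternative is chosen starts parsing from the same token the predicate started at. In an Antlr grammar the same shape is written as a parenthesized lookahead before "=>" on the alternative it guards.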
Unless stated otherwise, all text on this page which is either unattributed or by Mark S. Miller is hereby placed in the public domain.