Expression Grammar

/ elang / grammar

Expression
Grammar

The expression subset of the E grammar is presented here in precedence order, meaning that later constructs bind tighter than earlier constructs. For example, expr "+" expr is presented before expr "*" expr, so "*" binds tighter than "+". Therefore, "a + b * c + d" is equivalent to "a + (b * c) + d". All the constructs in presented in the same box have the same precedence.

The title of each box states the associativity of the box, which states how members of the box are grouped when they are composed with each other. The possibilities are:

Left associative. E.g., "a - b - c" groups as "(a - b) - c".
Right associative. E.g., "a := b := c" groups as "a := (b := c)".
Don't care associative. E.g., it doesn't matter how "a && b && c" is grouped.
Non associative. E.g., "a == b == c" is a syntax error.

As is familiar, parentheses can be used to obtain any desired grouping, and should be used even, if they aren't strictly necessary, for those cases where the meaning might otherwise be unclear to a reader.

The E language as seen by the programmer has the rich set of syntactic conveniences expected of a modern scripting language. However, to be secure, E must have a simple analyzable semantics. We reconcile these by defining a subset of the full language called Kernel E, and only this subset need be given a rigorous semantics. The rest of E is defined by syntactic expansion to this subset. In the expansion column for each construct below, either an expansion into more primitive constructs (closer to Kernel E) is shown, or the word "kernel" appears as a link to the corresponding place in the Kernel-E Reference Manual. When an expansion is shown, "l" stands for the left-hand expression, and "r" for the right-hand expression.

Sequence (don't care associative)

Grammar	Meaning	Expansion
"\n"*	nothing	null
expr "\n"* expr	Do is order. Value is last value	kernel

This is the familiar sequence of expressions, one per line. C, C++, and Java programmers should notice that newlines are normally used, rather than semicolons, to terminate E expressions. (Python and Tcl programmers will find this familiar.) As with Tcl, this enables E to be used as a pleasant command line language as well.

Whereas C and most of its descendents are statement languages, like Smalltalk, Lisp and others, E is an expression language. In a statements language, there are many statements (like C's "if" statement) that cannot yield a value. These languages also have expressions (like "i + j") that do yield a value. In an expression language, everything that would have been a statement is defined to yield a value, and is therefore allowed to nest freely within other expressions.

E's sequence expression is the first example of this. Not only are the component expressions evaluated in order, but the value yielded by the sequence expression as a whole is the value yielded by its last component expression.

In-Line Sequence (don't care associative)

Grammar	Meaning	Expansion
expr ";"	"`;`" is harmless	l
expr ";" expr	Do is order. Value is last value	l "\n" r

You can also write a sequence on one line by separating them with semicolons. Terminating an expression with a semicolon is harmless.

Assignment and Definition (right associative)

Grammar	Meaning	Expansion
varName ":=" expr	assignment	kernel
expr "." get(...) ":=" expr expr "." getName(...) := expr expr "." run(...) := expr	table update setting a bean property replacing a run	l.put(..., r); r l.setName(..., r); r l.setRun(..., r); r
lval "\|=" expr lval "&=" expr lval "^=" expr lval "<<=" expr lval "+=" expr lval "-=" expr lval "=" expr* lval "/=" expr lval "//=" expr lval "%=" expr lval "%%=" expr lval "*=" expr*	update, ie, assign a value derived from the old value	l := l \| r l := l & r l := l ^ r l := l << r l := l + r l := l - r l := l * r l := l / r l := l // r l := l % r l := l %% r l := l ** r
lval ">>=" expr	update by right shift	l <<= -r
"def" patt ":=" expr "def" patt "exit" expr ":=" expr	define variables by pattern matching	kernel(*)

As in Pascal, the E assignment operator is ":=", rather than "=". "=" was originally the equality symbol from mathematics, which has a much longer and more honorable history than Fortran or C, so it would be offensive for E to use "=" for assignment, but it would be confusing for E to use "=" for equality. Instead, E uses ":=" for assignment, "==" to test whether two things are the same, and "<=>" to test whether two things are as big as each other (same magnitude).

An lval is first parsed and expanded as an expression. After this lval-expansion, the lval must either be a name or one of the three kinds of call expression shown above. When these occur, they are rewritten as shown, except that r is evaluated only once. The lval-expansions coupled with the above assignment-expansions have the following pleasant effects:

Example	lval-Expansion	Assignment Expansion
table[key] := val	table.get(key) := val	table.put(key, val); val
jframe::border := b	jframe.getBorder() := b	jframe.setBorder(b); b
flist(3,5) := subl	flist.run(3,5) := subl	flist.setRun(3,5, subl); subl

As is familiar from C and its descendents, "a <op>= b", where <op> is a binary operator, means the same thing as "a := a <op> b", except that "a" is rewritten if necessary so that it's only executed once.

Since, as seen below, right shift expands to a left shift with a negated right-hand side, we similarly expand right-shift-assign to left-shift-assign with a negated right-hand side.

The only defining occurrences of variables occur in patterns, and these variable only come to exist -- and be bound to initial values -- when these patterns are matched against some value (referred to as the specimen). The define expression is the main form for explicitly using pattern matching to define new variables. Hence its name. "def" is just shorthand for "define".

Conditional-Or (don't care associative)

Grammar	Meaning	Expansion
expr "\|\|" expr	left-to-right or	See text

As is familiar, this or operator evaluates its operand expressions left-to-right, but stops on the first true.

Conditional-And (don't case associative)

Grammar	Meaning	Expansion
expr "&&" expr	left-to-right and	See text

As is familiar, this and operator evaluates its operand expressions left-to-right, but stops on the first false.

Comparisons and Bitwise/Logical Operators (non-associative)

Grammar

Meaning

Expansion

expr "==" expr
expr "!=" expr

are they the same?
are they different?

__equalizer.sameEver(l, r)
!(l == r)

expr "&" expr
expr "|" expr
expr "^" expr

meaning dependent on type

l.and(r)
l.or (r)
l.xor(r)

expr "=~" patt
expr "!~" patt

match
mismatch

kernel
!(l =~ r)

"==" and "!=" tests whether two values are the same, which is E's generalization of the traditional identity check (EQ to Lispers, "==" to C, C++, and Java'ers).

There's no general statement one can make about "&", "|", and "^", except that they expand to "and", "or", and "xor", so they mean whatever these messages mean on the objects in question. For the types built into E, the meanings are:

On integers, they are bitwise operators, as in C, C++, and Java.
On booleans, they are logical operators, but of course evaluate both sides.
On maps, "&" and "|" are intersection and union of the keys (domains), with the left-hand side dominating the values (range).
Similarly, on regions, "&" and "|" are the intersection and union of the positions in these regions.

Of course, you are free to make your objects respond to the "and", "or", and "xor" messages, in which case you can use "&", "|", and "^" to operate on them as well. This principle applies everywhere operators expand to messages.

We've borrowed from Perl "=~" and "!~", the pattern match operators. These attempt to match the value yielded by the expression of the left with the pattern on the right. "=~" says whether the match succeeded, whereas "!~" says whether it failed. Only if a "=~" match succeeds are the resulting variable bindings availble in the succeeding scope. Otherwise, the names are still defined (since this is a static property), but they are bound to broken references (as slots), which will throw an exception under normal variable use.

Partial Ordering (non associative)

Grammar

Meaning

Expansion

expr  "<"  expr
expr  "<=" expr
expr  ">=" expr
expr  ">"  expr
expr "<=>" expr

less than
less or equal
greater or equal
greater than
as big as

l.compareTo(r).belowZero()
l.compareTo(r).atMostZero()
l.compareTo(r).atLeastZero()
l.compareTo(r).aboveZero()
l.compareTo(r).isZero()

The four ordering operators all expand into "compareTo", followed by testing the result. "compareTo" is supposed to implement a partial ordering by responding with

a negative number if l is less than r.
zero if l is the same magnitude as r (equal to it in the ordering).
a positive number if l is greater than r.
NaN if l and r are incomparable.

Since a NaN is neither belowZero, atMostZero, atLeastZero, aboveZero, nor isZero, if l and r are incomparable, then all five comparisons will yield false. Otherwise, they will do as expected. The built in E types for which an ordering is defined are:

integers are fully ordered as you'd expect.
float64s are fully ordered as you'd expect, except NaN, which is incomparable to everything else. To comply with the IEEE spec, NaN is also incomparable with itself (causing compareTo to not be reflexive).
ConstLists compare lexicographically, meaning they are compared pairwise, and their ordering is according to their first unequal pair. Strings are a kind of ConstList, and this rule yields "alphabetical ordering" of Strings ordered by the unicode encoding of their characters.
ConstMaps test whether the keys (domain) of one is a subset of the keys (domain) of the other. If "a <= b", than "a"'s domain is a subset of "b"'s domain. If neither is a subset of the other, they are incomparable.
Regions test whether one is enclosed by the other, ie, whether the positions in one are a subset of the positions of the other.
Similarly, Types test whether one Type is a subtype of the other. That is, if "a <= b" then all objects which are members of the "a" type are also members of the "b" type.

There is an important difference between, for example, the seemingly identical tests "a <= b" and "b >= a". Although these both ask the same question, the first test asks "a"'s opinion while the second asks "b"'s opinion. Depending on your trust relationship to "a" and "b", you may care who you ask for an answer.

The traditional language specification would state "a's behavior and b's behavior must agree" (see for example the specification of "equals" in Java). However, when code is integrated from multiple sources, such specifications are incoherent. If "a" and "b" don't agree, who's at fault? In Java, once someone adds the following code to a system,

/**
 * A "more equal" class of objects
 */
public class OrwellPig {
    public boolean equals(Object other) {
        return true;
    }
    public int hashCode() {
        return 0;
    }
}

it is equally valid, from the Java spec, to say OrwellPig is buggy as it is to say OrwellPig is correct and the equals/hashCode behavior of all other classes is buggy. In a programming model intended to support the interaction of mutually suspicious code (which Java claims to be), such diffusion of responsibility is unacceptable.

Interval (non associative)

Grammar	Meaning	Expansion
expr ".." expr expr "..!" expr	inclusive-inclusive inclusive-exclusive	__makeOrderedSpace.op__thru(l, r) __makeOrderedSpace.op__till(l, r)

An interval expression is used to form a region that represents all the positions between two boundaries. 1..3 means all the integers from 1 inclusive to 3 inclusive. 1..!3 means all the integers from 1 inclusive to 3 exclusive, ie, "from 1 to3, but not 3", and so is equivalent to 1..2. Even though they are equivalent, the "..!" form is preferred, as E encourages thinking in terms of open-closed, or inclusive-exclusive intervals.

Currently such intervals are only supported for integers, float64s, and chars.

Shift (left associative)

Grammar	Meaning	Expansion
expr "<<" expr expr ">>" expr	left shift right shift	l.shiftLeft(r) l.shiftLeft(-r)

Among E's built in data types, this is only defined on integers, and has the traditional meaning but with no precision limit.

Additive (left associative)

Grammar	Meaning	Expansion
expr "+" expr expr "-" expr	addition, concatenation subtraction, difference	l.add(r) l.subtract(r)

On integers, float64s, and char/integer pairs, this has the traditional meaning.

On ELists, and therefore on Strings, "+" appends two lists to get another. This is a generalization of Java's use of "+" on Strings.

On EMaps, and therefore on sets, "-" takes the set difference of the domain, with the left-hand operand providing the range. In other words, "a - b" is that subset of "a" whose keys are not keys of "b"

Multiplicative (left associative)

Grammar

Meaning

Expansion

expr  "*" expr
expr  "/" expr
expr "//" expr
expr  "%" expr
expr "%%" expr

times
floating divide
integer divide
remainder
modulo

l.multiply(r)
l.approxDivide(r)
l.floorDivide(r)
l.remainder(r)
l.modulo(r)

expr ** expr %% expr

modular exponentiation

l.modPow(e, m)

Mostly, these apply to integers and float64s. See their documentation for details.

"*" also applies to an EList-integer pair, and has the traditional meaning of repeated addition. But remember that addition of lists has the non-traditional meaning of appending the lists together. Therefore, "a * n" will append "a" to itself "n" times. For example, "[1, 2] * 3" yields "[1, 2, 1, 2, 1, 2]".

"base ** exponent %% modulus" yields a result that's mathematically equivalent to "(base ** exponent) %% modulus", but the grammar does not expand the first to the second. Rather, it expands the first directly to a call involving all three operands since, at least for integers, this can be calculated vastly more efficiently than composing the two separate calculations. ***bug: in the current parser, even it you put in the above parenthesis, it will still expand directly to the trinary operation.

Exponentiation (non associative)

Grammar	Meaning	Expansion
expr "*" expr*	raised to the power of	l.pow(r)

Among E's built-in types, this applies only to integers and float64s in the expected way.

Unary Prefix (extremely non associative)

Grammar	Meaning	Expansion
"!" prim "~" prim "-" prim	not one's complement zero - expr	r.not() r.complement() r.negate()
"&" varName	slot of variable	kernel

Among E's built in types, "!" applies to booleans, "~" to integers, and "-" to integers and float64s, all in the standard way. On integers, unary "~" and "-" yield the precision-unlimited one's and two's complement, respectively.

Unary "&" exposes a deeper theory of scoping than the casual E programmer would normally ever have to be aware of. Where the expression "foo" yield the value of the variable named "foo", the expression "&foo" yields the Slot object holding the value of the variable "foo".

We say that unary prefix expressions are extremely non-associative because their component expressions can only be primitive expressions. Not only do they not associate with themselves, they don't associate with the unary postfix or call expressions.

Unary Postfix (left associative)

Grammar

Meaning

Expansion

expr "[" expr "]"
expr "::" propName
expr "::" propName"("expr,...")"

indexing
property access
indexed prop access

l.get(r)
l.getPropName()
l.getPropName(expr,...)

expr "<-" verb "(" expr,... ")"
expr "<-" verb
expr "<-"      "(" expr,... ")"

send, do eventually
simple send
functional send

E.send(expr, "verb", [expr,...])
l <- verb()
l <- run(expr,...)

"meta" "." verb "()"

static reflection

kernel

Unary Postfix and Call are documented together below.

Call (mostly left associative)

Grammar	Meaning	Expansion
postfix "." verb "(" expr,... ")" expr "(" expr,... ")"	call, do now function call	kernel l.run(expr,...)
"meta" "." verb "(" expr,... ")"	static reflection	kernel

(XXX need to add verb-curry syntax to these tables)

The grammar for synchronous (do it now) calls and asynchronous (do it eventally) sends is split between the Unary Postfix box and the Call box in order to resolve a parsing ambiguity. To do so, we violate our convention of presenting the grammar as a strict precedence hierarchy, in that the first grammar construct below uses postfix, meaning the first box, as part of its definition.

The postfix box contains three of the shorthand forms for synchronous calls. The first form shows that the familiar array indexing notation is just syntactic sugar for a one-argument "get" message.

The second form makes it straightforward to deal with JavaBeans properties. A JavaBean property "foo" is defined by a pair of a getFoo method for accessing it, and a setFoo method for setting it. The first letter of the property name is upper-cased according to JavaBeans property naming rules. Rather than writing "x.getFoo()", the E programmer may say "x::foo". To change a property, rather than say "x.setFoo(newFoo)", by virtue of the expansion of assignment, the E programmer may say "x::foo := newFoo". The third form enables one to likewise deal with JavaBeans indexed properties.

The fourth shorthand form is shown in the Call box. As long as an argument list is explicitly provided (even if it's empty), you may leave out the verb, which will default to "run". Such an expression is referred to as a function call.

Asynchronous calls are identified by the "<-" operator, which is read "eventually". For example, "counter <- incr()" is a request that the counter eventually increment itself. As with calls, the verb "run" may be left out.

The "meta" forms are an escape for allowing an object's program to make reflective queries about itself. These reflective queries must be carefully defined so they don't violate any security properties, and the easiest way to ensure this is to provide only conveniences for what the object could have acheived more awkwardly otherwise. Therefore, they could be defined by a source-to-source transformation, although the current E definition doesn't do so.

The meta forms are all introduced by the keyword "meta" so they will be apparent to static analysis. The only meta forms currently accepted are those that give an object easy access to information about itself. Classically, an object consists of state and behavior, so

"meta.getState"()
"scope" is not a keyword, but must be the verb that appears after meta. This expression yields the scope object that represents the bindings in-scope at that point in the program. These bindings only include variables defined outside the closest enclosing object if the code of this object uses this variable.

"meta.sourceTree(name)"
Similarly, "sourceTree" isn't a keyword, but but must be the provided verb. name must be the definition-name of an enclosing object expression. This meta form yield the parse tree of this object expression as canonically expanded to Kernel E. (XXX this has been replaced by meta.context(); explain)

These are used most often to create transparent objects -- object that auditably reveal their state and behavior to their clients. By auditably, we mean that the client does not need to trust the object's claim that it is reporting its internals accurately. Rather, it can trust an independent auditor that's in a position to check.

Primitive (no parse ambiguity to be resolved)

The last step of our precedence hierarchy are the primitive expressions, whose grammar is unambiguous, so they can be the "atomic" expressions all other all built from. They are documented on a separate page.

Unless stated otherwise, all text on this page which is either unattributed or by Mark S. Miller is hereby placed in the public domain.

/ elang / grammar

ELib E Language Smart Contracts Related

Download FAQ API Mail Archive Donate

report bug (including invalid html)