The expression subset of the E grammar is presented here in precedence order, meaning that later constructs bind tighter than earlier constructs. For example, expr "+" expr is presented before expr "*" expr, so "*" binds tighter than "+". Therefore, "a + b * c + d" is equivalent to "a + (b * c) + d". All the constructs presented in the same box have the same precedence. The title of each box states the associativity of the box, which determines how members of the box are grouped when they are composed with each other. The possibilities are:
As is familiar, parentheses can be used to obtain any desired grouping, and should be used, even if they aren't strictly necessary, in those cases where the meaning might otherwise be unclear to a reader.
The E language as seen by the programmer has the rich set of syntactic conveniences expected of a modern scripting language. However, to be secure, E must have a simple, analyzable semantics. We reconcile these by defining a subset of the full language called Kernel E, and only this subset need be given a rigorous semantics. The rest of E is defined by syntactic expansion to this subset. In the expansion column for each construct below, either an expansion into more primitive constructs (closer to Kernel E) is shown, or the word "kernel" appears as a link to the corresponding place in the Kernel-E Reference Manual. When an expansion is shown, "l" stands for the left-hand expression, and "r" for the right-hand expression.
Sequence (don't care associative)
This is the familiar sequence of expressions, one per line. C, C++, and Java programmers should notice that newlines, rather than semicolons, are normally used to terminate E expressions. (Python and Tcl programmers will find this familiar.) As with Tcl, this enables E to be used as a pleasant command line language as well.
Whereas C and most of its descendants are statement languages, E, like Smalltalk, Lisp, and others, is an expression language. In a statement language, there are many statements (like C's "if" statement) that cannot yield a value. These languages also have expressions (like "i + j") that do yield a value. In an expression language, everything that would have been a statement is defined to yield a value, and is therefore allowed to nest freely within other expressions. E's sequence expression is the first example of this. Not only are the component expressions evaluated in order, but the value yielded by the sequence expression as a whole is the value yielded by its last component expression.
In-Line Sequence (don't care associative)
You can also write a sequence on one line by separating its component expressions with semicolons. Terminating an expression with a semicolon is harmless.
Assignment and Definition (right associative)
As in Pascal, the E assignment operator is ":=", rather than "=". "=" was originally the equality symbol from mathematics, which has a much longer and more honorable history than Fortran or C, so it would be offensive for E to use "=" for assignment. Given the C tradition, though, it would be confusing for E to use "=" for equality. Instead, E uses ":=" for assignment, "==" to test whether two things are the same, and "<=>" to test whether two things are as big as each other (same magnitude).
An lval is first parsed and expanded as an expression. After this lval-expansion, the lval must be either a name or one of the three kinds of call expression shown above. When these occur, they are rewritten as shown, except that r is evaluated only once. The lval-expansions coupled with the above assignment-expansions have the following pleasant effects:
As is familiar from C and its descendants, "a <op>= b", where <op> is a binary operator, means the same thing as "a := a <op> b", except that "a" is rewritten if necessary so that it's only executed once. Since, as seen below, right shift expands to a left shift with a negated right-hand side, we similarly expand right-shift-assign to left-shift-assign with a negated right-hand side.
The only defining occurrences of variables occur in patterns, and these variables only come to exist -- and be bound to initial values -- when these patterns are matched against some value (referred to as the specimen). The define expression is the main form for explicitly using pattern matching to define new variables, hence its name. "def" is just shorthand for "define".
Conditional-Or (don't care associative)
As is familiar, this "or" operator evaluates its operand expressions left-to-right, but stops on the first true.
Conditional-And (don't care associative)
As is familiar, this "and" operator evaluates its operand expressions left-to-right, but stops on the first false.
Comparisons and Bitwise/Logical Operators (non-associative)
"==" and "!=" test whether two values are the same, which is E's generalization of the traditional identity check (EQ to Lispers, "==" to C, C++, and Java'ers). There's no general statement one can make about "&", "|", and "^", except that they expand to "and", "or", and "xor", so they mean whatever these messages mean on the objects in question. For the types built into E, the meanings are:
Of course, you are free to make your objects respond to the "and", "or", and "xor" messages, in which case you can use "&", "|", and "^" to operate on them as well. This principle applies everywhere operators expand to messages.
We've borrowed from Perl "=~" and "!~", the pattern match operators. These attempt to match the value yielded by the expression on the left with the pattern on the right. "=~" says whether the match succeeded, whereas "!~" says whether it failed. Only if a "=~" match succeeds are the resulting variable bindings available in the succeeding scope. Otherwise, the names are still defined (since this is a static property), but they are bound to broken references (as slots), which will throw an exception under normal variable use.
Partial Ordering (non associative)
The four ordering operators all expand into "compareTo", followed by testing the result. "compareTo" is supposed to implement a partial ordering by responding with
Since a NaN is neither belowZero, atMostZero, atLeastZero, aboveZero, nor isZero, if l and r are incomparable, then all five comparisons will yield false. Otherwise, they will do as expected. The built in E types for which an ordering is defined are:
There is an important difference between, for example, the seemingly identical tests "a <= b" and "b >= a". Although these both ask the same question, the first test asks "a"'s opinion while the second asks "b"'s opinion. Depending on your trust relationship to "a" and "b", you may care who you ask for an answer.
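The expansion into "compareTo" and the question of who gets asked can be modeled outside of E. The following is a minimal Python sketch, not E code and not E's implementation. The class `Region` and the helpers `less_eq`, `greater_eq`, and `as_big_as` are hypothetical names, and the sketch assumes that "compareTo" replies with a float64 that is negative, zero, positive, or NaN when the operands are incomparable. Ordering by set inclusion is used only because it is a convenient partial order.

```python
import math


class Region:
    """Hypothetical object ordered by set inclusion, which is a partial order."""

    def __init__(self, name, items):
        self.name = name
        self.items = frozenset(items)
        self.times_asked = 0  # counts how often this object's opinion was requested

    def compare_to(self, other):
        # Stand-in for the "compareTo" message; the numeric reply is an assumption.
        self.times_asked += 1
        if self.items == other.items:
            return 0.0
        if self.items < other.items:   # proper subset
            return -1.0
        if self.items > other.items:   # proper superset
            return 1.0
        return math.nan                # incomparable


def less_eq(l, r):
    # Models "l <= r": the left operand is asked, then its reply is tested.
    return l.compare_to(r) <= 0


def greater_eq(l, r):
    # Models "l >= r": again the left operand is the one asked.
    return l.compare_to(r) >= 0


def as_big_as(l, r):
    # Models "l <=> r".
    return l.compare_to(r) == 0


a = Region("a", {1, 2})
b = Region("b", {1, 2, 3})
c = Region("c", {3, 4})

print(less_eq(a, b))      # True, and it was a's opinion
print(greater_eq(b, a))   # True, and it was b's opinion
# a and c are incomparable, so every test on the NaN reply is False.
print(less_eq(a, c), greater_eq(a, c), as_big_as(a, c))  # False False False
```

In this model, `less_eq(a, b)` increments only `a.times_asked` and `greater_eq(b, a)` increments only `b.times_asked`, which mirrors the point that the two tests consult different parties.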
Interval (non associative)
An interval expression is used to form a region that represents all the positions between two boundaries. 1..3 means all the integers from 1 inclusive to 3 inclusive. 1..!3 means all the integers from 1 inclusive to 3 exclusive, i.e., "from 1 to 3, but not 3", and so is equivalent to 1..2. Even though they are equivalent, the "..!" form is preferred, as E encourages thinking in terms of half-open, or inclusive-exclusive, intervals. Currently such intervals are only supported for integers, float64s, and chars.
Shift (left associative)
Among E's built-in data types, this is only defined on integers, and has the traditional meaning but with no precision limit.
Additive (left associative)
On integers, float64s, and char/integer pairs, this has the traditional meaning. On ELists, and therefore on Strings, "+" appends two lists to get another. This is a generalization of Java's use of "+" on Strings. On EMaps, and therefore on sets, "-" takes the set difference of the domain, with the left-hand operand providing the range. In other words, "a - b" is that subset of "a" whose keys are not keys of "b".
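The list and map behavior described above can be illustrated with a small Python analogy. This is a sketch rather than E code: Python lists and dicts stand in for ELists and EMaps, and `map_minus` is a hypothetical helper written only for this illustration.

```python
def map_minus(a, b):
    """Hypothetical model of "a - b" on maps.

    Keeps the entries of a whose keys are not keys of b; the values
    (the range) always come from the left-hand operand.
    """
    return {key: value for key, value in a.items() if key not in b}


# "+" on lists, and therefore on strings, appends.
print([1, 2] + [3])   # [1, 2, 3]
print("ab" + "cd")    # abcd

# "-" on maps removes the keys found in the right-hand operand.
print(map_minus({"x": 1, "y": 2}, {"y": 99}))   # {'x': 1}
```

Note that the value 99 associated with "y" in the right-hand map plays no role; only its keys matter.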
Multiplicative (left associative)
Mostly, these apply to integers and float64s. See their documentation for details. "*" also applies to an EList-integer pair, and has the traditional meaning of repeated addition. But remember that addition of lists has the non-traditional meaning of appending the lists together. Therefore, "a * n" will append "a" to itself "n" times. For example, "[1, 2] * 3" yields "[1, 2, 1, 2, 1, 2]".
"base ** exponent %% modulus" yields a result that's mathematically equivalent to "(base ** exponent) %% modulus", but the grammar does not expand the first to the second. Rather, it expands the first directly to a call involving all three operands since, at least for integers, this can be calculated vastly more efficiently than composing the two separate calculations.
***bug: in the current parser, even if you put in the above parentheses, it will still expand directly to the three-operand operation.
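Both points above have close Python analogues, sketched here under stated assumptions. `repeat` is a hypothetical helper that spells out list multiplication as repeated appending. Python's built-in three-argument `pow` stands in for the direct three-operand expansion, and Python's "%" stands in for "%%" on non-negative integers only; neither is claimed to be how E implements these operators.

```python
def repeat(items, n):
    """Hypothetical model of "items * n" as repeated list addition."""
    result = []
    for _ in range(n):
        result = result + items   # list "+" appends
    return result


print(repeat([1, 2], 3))   # [1, 2, 1, 2, 1, 2]

base, exponent, modulus = 7, 1000, 13

# Composing the two operations builds the full power first, then reduces it.
composed = (base ** exponent) % modulus

# The three-operand form reduces as it goes and never builds the huge power.
direct = pow(base, exponent, modulus)

print(composed == direct)   # True
```

The two results agree, but the composed form has to materialize a number with hundreds of digits before reducing it, which is the cost the direct expansion avoids.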
Exponentiation (non associative)
Among E's built-in types, this applies only to integers and float64s in the expected way.
Unary Prefix (extremely non associative)
Among E's built-in types, "!" applies to booleans, "~" to integers, and "-" to integers and float64s, all in the standard way. On integers, unary "~" and "-" yield the precision-unlimited one's and two's complement, respectively.
Unary "&" exposes a deeper theory of scoping than the casual E programmer would normally ever have to be aware of. Where the expression "foo" yields the value of the variable named "foo", the expression "&foo" yields the Slot object holding the value of the variable "foo".
We say that unary prefix expressions are extremely non-associative because their component expressions can only be primitive expressions. Not only do they not associate with themselves, they don't associate with the unary postfix or call expressions.
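As a rough illustration of the unary operators, Python integers are also unlimited in precision, so they show the same relationship between one's and two's complement. The `Slot` class below is a hypothetical model of the idea behind "&foo", with invented method names; it is not E's Slot interface.

```python
# One's complement and two's complement with no precision limit.
big = 2 ** 100
print(~5)                  # -6
print(~big == -big - 1)    # True
print(not True)            # False, the analogue of "!" on booleans


class Slot:
    """Hypothetical model of the object that holds a variable's value."""

    def __init__(self, value):
        self._value = value

    def get_value(self):
        return self._value

    def set_value(self, new_value):
        self._value = new_value


foo_slot = Slot(3)       # plays the role of "&foo"
same_slot = foo_slot     # a second reference to the same Slot
same_slot.set_value(4)   # assigning through one reference ...
print(foo_slot.get_value())   # 4, ... is visible through the other
```

The point of the model is only that a slot is an object distinct from the value it holds, so holding the slot gives access to the variable itself rather than to a snapshot of its value.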
Unary Postfix (left associative)
Unary Postfix and Call are documented together below.
Call (mostly left associative)
(XXX need to add verb-curry syntax to these tables)
The grammar for synchronous (do it now) calls and asynchronous (do it eventually) sends is split between the Unary Postfix box and the Call box in order to resolve a parsing ambiguity. To do so, we violate our convention of presenting the grammar as a strict precedence hierarchy, in that the first grammar construct below uses postfix, meaning the first box, as part of its definition.
The postfix box contains three of the shorthand forms for synchronous calls. The first form shows that the familiar array indexing notation is just syntactic sugar for a one-argument "get" message. The second form makes it straightforward to deal with JavaBeans properties. A JavaBean property "foo" is defined by a pair of methods: getFoo for accessing it, and setFoo for setting it. The first letter of the property name is upper-cased according to JavaBeans property naming rules. Rather than writing "x.getFoo()", the E programmer may say "x::foo". To change a property, rather than say "x.setFoo(newFoo)", by virtue of the expansion of assignment, the E programmer may say "x::foo := newFoo". The third form enables one to likewise deal with JavaBeans indexed properties.
The fourth shorthand form is shown in the Call box. As long as an argument list is explicitly provided (even if it's empty), you may leave out the verb, which will default to "run". Such an expression is referred to as a function call.
Asynchronous calls are identified by the "<-" operator, which is read "eventually". For example, "counter <- incr()" is a request that the counter eventually increment itself. As with calls, the verb "run" may be left out.
The "meta" forms are an escape for allowing an object's program to make reflective queries about itself.
These reflective queries must be carefully defined so they don't violate any security properties, and the easiest way to ensure this is to provide only conveniences for what the object could have achieved more awkwardly otherwise. Therefore, they could be defined by a source-to-source transformation, although the current E definition doesn't do so. The meta forms are all introduced by the keyword "meta" so they will be apparent to static analysis. The only meta forms currently accepted are those that give an object easy access to information about itself. Classically, an object consists of state and behavior, so
These are used most often to create transparent objects -- objects that auditably reveal their state and behavior to their clients. By auditably, we mean that the client does not need to trust the object's claim that it is reporting its internals accurately. Rather, it can trust an independent auditor that's in a position to check.
Primitive (no parse ambiguity to be resolved)
The last step of our precedence hierarchy is the primitive expressions, whose grammar is unambiguous, so they can be the "atomic" expressions all others are built from. They are documented on a separate page.
Unless stated otherwise, all text on this page which is either unattributed or by Mark S. Miller is hereby placed in the public domain.