ERights Home e-impls / e-on-c 
No Previous Sibling On to: Fat Pointers

Ground Rules

The ground rules for this exercise (at least at this point) are:

  • No static type analysis of E source code. The optional SlotMaker expression in an E variable declaration is intended, among other things, to allow the programmer to give a compiler static type information about the values that the variable may hold. A sophisticated compiler could make direct use of this, and indirect use as an aid to infer further types. With static type knowledge, a sophisticated compiler could generate better code than the simple compiler proposed by ENative.

  • Only simple scope analysis. Crucial to the efficiency of E, even for a simple naive implementation such as ENative, is partitioning variables into one of several cases, and implementing variable definition, access, assignment, and slot access differently according to which kind of variable we have.

  • No use of C++'s "virtual". "virtual" is C++'s mechanism for run-time polymorphism, and its notation hides some implementation cost. By avoiding it, we ensure that all run-time polymorphism is provided only by C's pointer-to-function mechanism, leaving all costs clear.
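To make this rule concrete, here is a minimal sketch (names are illustrative, not from ENative) of dispatching through an explicit function-pointer table rather than through a hidden C++ vtable, so that every indirection appears in the source:

```cpp
#include <cassert>

// Hypothetical sketch: dispatch through an explicit table of plain
// function pointers (a "script") instead of C++'s virtual mechanism,
// so the cost of each indirection is visible at the call site.
struct Object;  // forward declaration

typedef long (*MethodFunc)(Object* self, long arg);

struct Script {
    MethodFunc add;  // one slot per verb; names are illustrative
};

struct Object {
    Script* script;  // explicit "vtable" pointer
    long    value;
};

static long intAdd(Object* self, long arg) {
    return self->value + arg;
}

static Script intScript = { &intAdd };

// The call site performs the double indirection explicitly:
static long callAdd(Object* obj, long arg) {
    return obj->script->add(obj, arg);
}
```

A `virtual` call would compile to essentially the same two loads and an indirect call, but here nothing is hidden behind notation.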

  • No use of C++'s template feature, except possibly for the various scalar Lists. The complexity isn't worth it.

  • No unnecessary use of C++'s references. By always using pointers instead, indirection costs are always explicit. Note: if the C++ compilers of interest normally generate better code for references than pointers, which is quite possible, then this decision may be revisited.

  • Absorb, rather than reify, E's stack discipline. Approximately following Brian Smith's usage, by "absorb" I mean that E calls turn into C++ calls, E (implicit) returns turn into C++ (explicit) returns, and E exception throwing & catching turns into C++ exception throwing & catching. In the other conventional language implementation option -- reifying stack discipline -- we would implement E's implicit stack state in explicit C++ data structures. By absorbing, we also allow C++ native (primitive) code to be written in a style natural to the C++ programmer.
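As a hedged illustration of absorption (the E source and C++ names below are invented for this sketch), an E function like `def f(x) { if (x < 0) { throw("negative") }; x + 1 }` would compile to an ordinary C++ function, with each E construct mapped onto its C++ counterpart:

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical sketch of "absorbing" E's stack discipline:
// E calls, returns, and throws become C++ calls, returns, and throws,
// with no explicit data structure representing the stack.
static long f(long x) {
    if (x < 0) {
        throw std::runtime_error("negative");  // E throw -> C++ throw
    }
    return x + 1;  // E implicit return -> C++ explicit return
}

static long tryF(long x) {
    try {                                      // E try/catch -> C++ try/catch
        return f(x);
    } catch (const std::runtime_error&) {
        return -1;                             // recover from the E-level throw
    }
}
```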

  • No portable virtual machine support for stack debugging. This is one of the consequences of absorbing E's stack discipline into C++'s. In the resulting code, there is no data structure accessible to portable C++ code that explicitly represents the C++ stack, and therefore no data structure that explicitly represents the E stack. For a widely ported C++ compiler & debugger combo such as gcc/gdb, it is conceivable one could provide "portable" debugging access to the E stack by wrapping gdb's provision of "portable" access to the C++ stack. I put "portable" in scare-quotes to emphasize that this would be portable across hardware platforms but not across C++ implementations.

    It is fine for this meta-level access to be much slower than normal computation.

  • Portable meta-level access to board-state in support of, for example, debugging and serialization control. E computation within a vat occurs as a sequence of atomic transactions called turns. Each turn consists of the synchronous processing to completion of a pending delivery from that vat's vat-queue. Stack state only occurs within a turn. The state of greater importance, for most meta-level purposes, is the state present in a vat between turns. If we think of a vat as a game board and a turn as a turn in a game, the states during a turn (including stack state) are transitional: "The pawn is in my hand, because I've picked it up but haven't put it back down." We call the well-defined states between turns board-states. A turn always computes from one board-state to another.

    A board-state consists of the states of the objects within the vat, the state of the vat's pending delivery queue, the states of various primitive "devices", and the state of a vat's comm system -- managing the references that cross between this vat and other vats. The E virtual machine will define portable means for accessing all of this board-state (as well as manipulating some of it). ENative must implement this spec using only portable C++ constructs, rather than, as above, relying on features provided by a particular C++ implementation. Fortunately, only object-state access is relevant to the subset of E currently represented by ENative.

    As above, it is fine for this meta-level access to be much slower than normal computation.

  • Assume a garbage collector. There are so many ways to implement garbage collection, with so many performance tradeoffs, that ENative simply defers this issue for now and naively allocates (using C++'s "new") without ever freeing. Some conservative garbage collectors are happy to live underneath such C++ code and successfully garbage collect without further programmer (or compiler) attention. The overhead of a given garbage collection scheme can usually be approximated as a function of the number of objects allocated. Any benchmarks run on ENative would have to be adjusted by these estimates before they should be taken seriously.

  • Except for garbage collection, impose all costs that a complete implementation would impose. ENative, being for now an academic exercise, is, and may remain, far from a complete implementation of Kernel-E. However, of the Kernel-E programs that it does run, it should be at least as expensive as a complete implementation of ENative would be. For example, the SmallInteger add code checks for overflow. If overflow occurs, it throws a "BigInteger Not Yet Implemented" exception rather than allocating one, but this imposes all the costs on the successful case that a complete implementation would impose. The resulting measured timing may be taken as a reliable lower bound on performance (or upper bound on expected running time) except for

    1. the above garbage collection issue,
    2. a larger memory footprint (and therefore potentially a worse instruction-cache hit ratio),
    3. longer startup times, due both to the larger footprint and to more static initialization. (But still expected to be trivial compared to Java's startup time.)
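The overflow-checked SmallInteger add described above might look like the following sketch (the function name and the exact check are my assumptions, not ENative's code): the successful case pays the full cost of the test, and the overflow case throws rather than allocating a BigInteger.

```cpp
#include <cassert>
#include <climits>
#include <stdexcept>

// Hypothetical sketch of the SmallInteger add: the fast path always
// pays for the overflow check, as a complete implementation would;
// on overflow we throw instead of promoting to a BigInteger.
static long smallIntAdd(long a, long b) {
    if ((b > 0 && a > LONG_MAX - b) ||
        (b < 0 && a < LONG_MIN - b)) {
        throw std::runtime_error("BigInteger Not Yet Implemented");
    }
    return a + b;
}
```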

  • Impose costs of upgrade-for-prototyping. This is really a special case of the above, but is listed separately since it is the least obvious cost that would be imposed by a complete implementation. To support rapid prototyping, the E specification, like Smalltalk, supports the loading of replacement behavior for already instantiated objects, and the conversion of these old object states to become post-facto instances of the new behavior. As in at least ParcPlace Smalltalk, this conversion is done lazily, at the time an obsolete object is invoked. Unlike Smalltalk, E has not yet implemented this feature; neither does ENative, but ENative does impose all the costs on normal execution needed to support it.

    Note that an upgrade (whether -for-prototyping or -for-release) only becomes effective between turns, so it need only worry about old object-states and old undelivered messages, but not old stack-frames. This is quite fortunate, as there is nothing obvious one can do with a stale stack-frame following an upgrade. (A note to be put somewhere else: Under normal conditions, an upgrade-for-release also need not deal with old undelivered messages, since neither the vat-queue nor Eventual references are checkpointed.)

  • Compile rather than interpret, but use runtime conventions that allow either. The ENative runtime, corresponding approximately to ELib, accommodates both E code compiled into C++ according to its conventions and the interpretation, by a hypothetical byte-code interpreter, of E compiled to a hypothetical byte-code language. Indeed, I first designed ENative with a byte-code interpreter in mind before I switched to a compilation strategy. Compiling to C++ has the obvious advantage of speed. A disadvantage that often drives implementations towards interpreters is code-size blowup. But by taking an absorption strategy, we manage to avoid this problem. The remaining disadvantages of compilation are

    1. Dynamic code loading becomes non-portable. This is much like the debugging issue above. A given widely-ported C++ system, such as gcc, may provide a platform portable way to dynamically load new code, but C++ code using this abstraction would be specific to this C++ system, while being hardware-platform independent. The E language spec requires dynamic code loading, so a complete implementation of ENative must deal with this issue. Were we to interpret byte-codes instead (or in addition), then this issue would go away.

    2. Dynamic code unloading becomes similarly non-portable. This is called out separately because it would also interact with the assumed garbage collector.

  • Call-site caching. Experience with dynamic object language implementations, especially Smalltalk, shows that if there's one "sophisticated" optimization that's worth doing, it's "call-site caching" (originally known as an "inline method cache", invented by Deutsch and Schiffman and presented at POPL 1984). For a given message call/send expression, if the dynamic implementation-type (script) of the object is the same as it was last time, as it usually is, then this call/send will look up the same method, so we can go there directly. ENative demonstrates that this optimization can be coded very simply. We call the flow of execution on a cache hit the fast-path, and the execution on a cache miss the slow-path. These caches must, of course, be invalidated by an upgrade-for-prototyping.
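A minimal sketch of such a cache follows (all names are illustrative, and the "full lookup" is a stand-in for a real method search): each call site remembers the script it last dispatched on; a hit skips the lookup entirely, a miss falls back to the full lookup and refills the cache.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical sketch of call-site caching: the CallSite struct is
// per-call-expression state remembering the last script seen and the
// method it resolved to.
struct Obj;
typedef long (*Method)(Obj* self, long arg);

struct Script {
    Method run;  // a one-method "script"; real scripts would be tables
};

struct Obj {
    Script* script;
    long    value;
};

struct CallSite {
    Script* cachedScript;  // script seen on the last dispatch (may be NULL)
    Method  cachedMethod;  // method that script resolved to
};

// Stand-in for the full (slow) method lookup:
static Method slowLookup(Script* s) { return s->run; }

static long dispatch(CallSite* site, Obj* obj, long arg) {
    if (obj->script != site->cachedScript) {   // slow-path: cache miss
        site->cachedScript = obj->script;
        site->cachedMethod = slowLookup(obj->script);
    }
    return site->cachedMethod(obj, arg);       // fast-path: direct call
}

static long doubler(Obj* self, long arg) { return self->value * arg; }
static Script doublerScript = { &doubler };
```

On the fast path the dispatch costs one comparison plus an indirect call; an upgrade-for-prototyping would invalidate every `cachedScript`, forcing the next dispatch through the slow path.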


Unless stated otherwise, all text on this page which is either unattributed or by Mark S. Miller is hereby placed in the public domain.