ERights Home enative 
Back to: Fat Pointers x On to: ENative: Compiling Objects

Compiling Variables


Scope Analysis

To compile E variable usage into C++ using ENative, the E compiler must determine which of the several kinds of variable it is dealing with. But before we can make these distinctions, we first need to introduce a few definitions.

  • An allocation-contour is a coarsening of the normal notion of a scope-contour in order to aggregate variable introduction as much as possible without changing the semantics. In Kernel-E, an allocation-contour is the scope contour around a method, matcher, or loop. Given unique variable naming, the declaration of any non-outer-variable (ie, any variable within a scope box) may be moved to the closest enclosing allocation-contour without changing its semantics.

  • A lexical-composite is a group of objects defined in the same allocation-contour. It is a special case of the composite defined here.

  • Similarly, the objects defined within a lexical composite are deemed lexical-facets. The objects defined directly within a lexical composite (ie, not nested within an inner allocation contour) are direct-lexical-facets of the lexical composite.

  • A frame is the state of a lexical-composite. It's the union of the non-outer-variables used freely by any of the lexical-facets of a lexical-composite. From directly within an object (ie, not within a nested object) the frame holding the state of this object is the object-frame.

An example facet/composite is:

def getterSetterPair(var value) :any {
    def getter()    :any { value }
    def setter(newValue) { value := newValue }
    [getter, setter]
}

The getterSetterPair function defines an allocation-contour, as do the two nested functions. Since getter and setter are both defined directly within this allocation-contour (not within a nested contour), they jointly form a lexical-composite, of which they are the two direct-lexical-facets. The variables defined within this allocation-contour are "value", "getter", and "setter". However, only "value" is used freely by any of the lexical-facets, so the lexical-composite's frame holds just this one variable.
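As a rough illustration of the frame described above, the example might be rendered in C++ along the following lines. This is only a sketch under stated assumptions: Ref is reduced to a plain integer rather than a real Fat Pointer, std::shared_ptr stands in for garbage collection, and the names Frame, Getter, Setter, and makeGetterSetterPair are hypothetical rather than anything ENative generates.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Stand-in for a Fat Pointer (assumption: the real Ref is richer than this).
using Ref = long;

// The frame of the lexical-composite. Only "value" is used freely by a
// lexical-facet, so the frame holds exactly one variable.
struct Frame {
    Ref value;
};

// Each direct-lexical-facet points at the shared frame, which serves as its
// object-frame. shared_ptr stands in for garbage collection.
struct Getter {
    std::shared_ptr<Frame> frame;
    Ref run() const { return frame->value; }
};

struct Setter {
    std::shared_ptr<Frame> frame;
    void run(Ref newValue) const { frame->value = newValue; }
};

// Models one call of getterSetterPair(value): allocate a single frame and
// hand it to both facets. "getter" and "setter" remain plain C++ locals,
// since no facet uses them freely.
std::pair<Getter, Setter> makeGetterSetterPair(Ref value) {
    auto frame = std::make_shared<Frame>(Frame{value});
    Getter getter{frame};
    Setter setter{frame};
    return {getter, setter};
}
```

Because both facets hold the same frame, an assignment made through the setter is visible to the getter, which is the behavior the shared frame exists to provide.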

We now distinguish kinds of variable usage along several dimensions. First, according to where the variable may live:

  • A Local is a variable defined in some allocation-contour, whose current use is directly within this object or top-level expression (ie, not within a nested object definition), and one that either hasn't been optimized into a FastField (see below), or is final (so it doesn't matter if it is also a FastField). Local variables are implemented directly as C++ local Fat Pointer variables. "getter", "setter", and "newValue" above are Locals.

  • A Field is a state variable of a lexical-composite, so it lives in the frame of that lexical-composite. It is used freely by at least one lexical-facet of that composite. Fields are implemented by indexing into the frame containing that field. From directly within an object, a field of this object's object-frame is an instance-variable. "value" is a Field of the above lexical composite. From directly within getter or setter, "value" is an instance-variable.

  • An Outer variable is one whose defining occurrence is outside any scope box, and is therefore considered to be part of the outer scope. Since this scope may be outside the control of the compiler, we don't optimize this case at all, but rather fall back on the pure naive computational model: Outer variables are implemented by explicitly calling the Scope-object which is the outer scope of the top-level expression being evaluated. "any" above is an outer-variable.

    Our one (transparent) optimization of outer variables is that, within an object definition, we obtain access to the outer scope from the enclosing object's Script rather than its state-array, even though the outer scope is conceptually part of the object's state rather than its behavior. This optimization places a severe limit on separate compilation, as a Script would then be specific to the outer scope in which the top-level expression was evaluated. For various reasons, we were planning to make compilation this specific and this late anyway, so this is fine.


Deslotifying

For Locals and Fields, an earlier phase of compilation is the deslotifying source-to-source transformation. In the output of this transformation, all non-outer-variables are declared only ":settable" or ":final", to indicate whether or not they're mutable. From a lambda-calculus perspective, a ":final" variable is a pure lambda binding, while a ":settable" variable is a Scheme-like lexically shared mutable location. The transformation cases:

Before              After
------              -----
a :final            same
a
a := v
&a

a :final(vg)        x :final ? vg.coerce(x) =~ a :final
a                   a
a := v              a := v
&a                  &a

a :settable         same
a
a := v
&a

a :settable(vg)     x :final ? vg.coerce(x) =~ a :settable
a                   a
a := v              a := vg.coerce(v)
&a                  SettableSlot(&a, vg)

a :slotMaker        x :final ? slotMaker.makeSlot(x) =~ a_Slot :final
a                   a_Slot.getValue
a := v              a_Slot.setValue(v); v
&a                  a_Slot

Each box consists of four lines, corresponding to the four variable usage constructs below: definition, access, assignment, and slot-access.

XXX Note: the above after code needs a way to deal with coercion failure. This probably requires a change to the coercion protocol.
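To make the ":settable(vg)" row concrete, here is a hedged C++ sketch of what its four "after" lines amount to. Everything named here is an assumption made for illustration: Ref is a plain integer, Guard and coerceNonNegative are hypothetical, and a rejected coercion is reported by throwing, which is only this sketch's choice since the coercion-failure question noted above remains open.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>

// Stand-in for a Fat Pointer (assumption).
using Ref = long;

// Hypothetical guard: returns the coerced value, or throws on rejection.
// Throwing is an assumption; the page leaves coercion failure unresolved.
using Guard = std::function<Ref(Ref)>;

Ref coerceNonNegative(Ref specimen) {
    if (specimen < 0) throw std::invalid_argument("guard rejected specimen");
    return specimen;
}

// Models SettableSlot(&a, vg): a slot over the raw storage of "a" that
// re-applies the guard on every setValue.
struct SettableSlot {
    Ref* storage;
    Guard guard;
    Ref getValue() const { return *storage; }
    void setValue(Ref v) const { *storage = guard(v); }
};

// Walks through the four deslotified constructs for "a :settable(vg)".
Ref deslotifiedExample(Ref x, Ref v, Ref w) {
    Guard vg = coerceNonNegative;
    Ref a = vg(x);              // definition: vg.coerce(x) =~ a :settable
    a = vg(v);                  // assignment: a := vg.coerce(v)
    SettableSlot slot{&a, vg};  // slot access: SettableSlot(&a, vg)
    slot.setValue(w);           // assigning via the slot is also guarded
    return a;                   // access: a
}
```

The point of the sketch is that, once deslotified, the guard appears only at definition, assignment, and slot creation, while plain access touches the storage directly.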

Should we build a compiler capable of aggressive inlining of both code and data, we may no longer need to deslotify as aggressively, or at all. Rather, many of the cases that follow could have been generated by inlining Slots. Put another way, deslotifying, and the other optimizations below, can be seen as special cases for Slots of various general-purpose optimizations.


Allocation Type

Following this transformation, our variable usages may now be further classified according to where their storage is allocated.

  • A Boxed variable needs separately allocated storage for one mutable Fat Pointer. The compiler generates code to access and assign to this variable by directly accessing and assigning to this separately allocated storage. We refer to this separate storage as a Box. An example would be a mutable variable (one declared ":settable" after deslotifying) that is used by both a direct and an indirect lexical-facet of its defining lexical-composite. Another example would be a mutable Local for which there exists a slot-access expression. (Are there any other cases?)

    As we will see below, a Box can also serve as the state for two primitive kinds of Slots: a settable Slot, when a pointer to a Box is paired with a SettableBoxScript, and a final Slot, when a pointer to a Box is paired with a FinalBoxScript.

  • A Fast variable is one for which the compiler was able to determine that a Box was unnecessary. The variable's value may be stored where the reference to the Box would have been stored, and it is accessed and assigned by C++-level access and assignment to this location.

Crossing these, we get the following five kinds of variable usage:

          Fast          Boxed
Locals    FastLocals    BoxedLocals
Fields    FastFields    BoxedFields
Outers    Outers
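The crossing of the two dimensions can be sketched as a small classification function. This is one possible reading, not ENative's actual analysis: the VarUse record and its fields are hypothetical, and the Boxed test encodes only the two examples given above (the text itself asks whether other cases exist), together with the rule that final variables are always Fast.

```cpp
#include <cassert>

// Where the variable lives, from the point of view of the current use.
enum class Home { Local, Field, Outer };

// The five kinds of variable usage.
enum class Usage { FastLocal, BoxedLocal, FastField, BoxedField, Outer };

// Hypothetical summary of what scope analysis learned about one usage.
struct VarUse {
    Home home;
    bool settable;             // declared ":settable" after deslotifying
    bool usedByDirectFacet;    // used freely by a direct-lexical-facet
    bool usedByIndirectFacet;  // used freely by a facet in a nested contour
    bool hasSlotAccess;        // some "&name" expression exists
};

Usage classify(const VarUse& use) {
    // Outers are never optimized, so Fast versus Boxed does not apply.
    if (use.home == Home::Outer) return Usage::Outer;

    // The two Boxed examples from the text; finals are always Fast.
    const bool sharedAcrossContours =
        use.usedByDirectFacet && use.usedByIndirectFacet;
    const bool localWithSlot =
        use.home == Home::Local && use.hasSlotAccess;
    const bool boxed =
        use.settable && (sharedAcrossContours || localWithSlot);

    if (use.home == Home::Local) {
        return boxed ? Usage::BoxedLocal : Usage::FastLocal;
    }
    return boxed ? Usage::BoxedField : Usage::FastField;
}
```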

 


Variable Usage Constructs

Deslotified Kernel-E has four constructs for using variables:

  • Variable Definition. The defining occurrence of a variable occurs only in a FinalPattern or a VarPattern:

    FinalPattern:
    varName :ValueGuardExpr 
    VarPattern:
    var varName :SlotGuardExpr

    In both cases, when this pattern is matched against a specimen, the match always succeeds and the specimen becomes the initial value of the variable. The difference between ":settable" and ":final" only affects the other variable usage constructs.

  • Variable access. This is simply the use occurrence of a variable name as an expression (for language history weenies, an "rValue"). For example, "a" in "a + b" is an access to the variable named "a".

  • Variable assignment. A use occurrence of a variable on the left side of an assignment expression. The assignment expression as a whole has the value the expression on the right evaluates to, but we don't bother to show this in the implementation sketch below. This detail can often be optimized out anyway, as an E compiler should notice that most assignments are evaluated only for effect. It is a static error to assign to a variable declared ":final". Such programs must be rejected at compile time.

  • Slot access. The "&name" expression evaluates to a Slot for accessing or modifying the value of variable "name". If the variable is declared ":settable", the returned Slot object will respond to both getValue and setValue(newValue) by accessing and modifying the value of the variable. If the variable is declared ":final", the Slot object will only respond to getValue.


The Cases

Each case has four rows, corresponding to the four variable usage constructs. The upper left box shows all the variable declarations that this case applies to, one per line. Other row-entries for that case either contain the same number of lines, meaning they apply to variables with the respective declarations, or they are a single line, meaning they apply to all of that case's possible variables.

FastLocals

A C++ local Fat Pointer variable is used to hold the value of the variable.

E                     C++
-                     ---
name :settable        Ref name = specimen
name :final

name                  name

name := newValue      name = newValue;
# not applicable      //not applicable

&name                 //not applicable -- would be Boxed
                      new FinalSlot(name)

BoxedLocals

A C++ local thin-pointer points at storage for the Fat Pointer holding the variable's value. We need not consider BoxedLocals declared ":final", as final local variable usage will always be FastLocal usage.

E                     C++
-                     ---
name :settable        Ref * namePtr = new Ref(specimen);

name                  *namePtr

name := newValue      *namePtr = newValue;

&name                 Ref(SettableBoxScript, namePtr)
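The four rows above can be strung together into one small C++ sketch showing why the Box matters: the Slot produced by "&name" and the local variable share the same separately allocated storage. The names here are assumptions for illustration only. Ref is a plain integer, SettableBoxSlot stands in for Ref(SettableBoxScript, namePtr), and std::shared_ptr stands in for garbage collection.

```cpp
#include <cassert>
#include <memory>

// Stand-in for a Fat Pointer (assumption).
using Ref = long;

// Hypothetical stand-in for Ref(SettableBoxScript, namePtr): a settable
// Slot whose only state is a pointer to the Box.
struct SettableBoxSlot {
    std::shared_ptr<Ref> box;
    Ref getValue() const { return *box; }
    void setValue(Ref newValue) const { *box = newValue; }
};

struct BoxedLocalResult {
    Ref value;             // what an access to "name" evaluated to
    SettableBoxSlot slot;  // what "&name" evaluated to
};

// Mirrors the four rows: definition, slot access, assignment, access.
BoxedLocalResult boxedLocalExample(Ref specimen, Ref newValue) {
    auto namePtr = std::make_shared<Ref>(specimen);  // name :settable
    SettableBoxSlot slot{namePtr};                   // &name
    *namePtr = newValue;                             // name := newValue
    return {*namePtr, slot};                         // name
}
```

The slot stays usable after boxedLocalExample returns, which a plain C++ local could not support; that is the reason a mutable Local with a slot-access expression is Boxed.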

FastFields

An element of a state-array holds the current value of the variable. The state array itself is accessed by a C++ local thin pointer variable, here named "frame". When the frame is our object-frame (the frame holding the instance variables for the current object), it is initialized as

Ref *frame = self.myData.word.myField;

otherwise it is initialized at the time the frame is allocated:

Ref *frame = new Ref[numVars];

E                     C++
-                     ---
name :settable        frame[index] = specimen
name :final

name                  frame[index]

name := newValue      frame[index] = newValue;
# not applicable      //not applicable

&name                 Ref(SettableBoxScript, &frame[index])
                      Ref(FinalBoxScript, &frame[index])

Note that "&name" on a FastField variable returns a Slot object that points directly into the middle of the frame in order to point at the variable's storage. This technique requires a garbage collector able to handle pointers into the middle of allocated blocks. When using a more limited garbage collector, an explicit FramedFieldSlot object should be allocated that points to the frame as a whole as well.
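For the more limited collector case, a FramedFieldSlot might look roughly like the sketch below. The source names FramedFieldSlot but does not define it, so the layout shown (a reference to the whole frame plus an index) is an assumption, as are the use of std::vector for the frame, std::shared_ptr in place of garbage collection, and a plain integer for Ref.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Stand-in for a Fat Pointer (assumption).
using Ref = long;

// Stand-in for the state-array.
using Frame = std::vector<Ref>;

// Hypothetical FramedFieldSlot: rather than pointing into the middle of
// the frame, it keeps the frame as a whole reachable and records which
// element holds the variable.
struct FramedFieldSlot {
    std::shared_ptr<Frame> frame;
    std::size_t index;
    Ref getValue() const { return (*frame)[index]; }
    void setValue(Ref newValue) const { (*frame)[index] = newValue; }
};

// Models "&name" for a FastField stored at frame[index].
FramedFieldSlot slotFor(const std::shared_ptr<Frame>& frame,
                        std::size_t index) {
    return FramedFieldSlot{frame, index};
}
```

The cost relative to the interior-pointer version is one extra word per Slot and an indexing step on each access; the benefit is that the collector only ever sees pointers to the start of an allocated block.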

BoxedFields

The state-array contains a reference to an arbitrary object expected to exhibit Slot behavior.  Access or assignment is by explicit message send to this Slot object. We need not consider BoxedFields declared ":final", as final field variable usage will always be FastField usage.

E                     C++
-                     ---
name :settable        frame[index] = Ref(SettableBoxScript, new Ref(specimen));

name                  frame[index].myData.word->myBox

name := newValue      frame[index].myData.word->myBox = newValue;

&name                 frame[index]

Outers

Variable usage is by explicitly messaging a scope object that represents the top level scope of the lexically enclosing top level expression. This scope object is accessed through the C++ local Fat Pointer variable "Outers". Within an object expression, Outers is initialized by

Ref Outers = self.myScript->myOuters;

E                        C++
-                        ---
name :slotMakerExpr      Ref initSlot = slotMaker.call(&DoMakeSlot, specimen);
                         //check for null
                         Outers.call(&DoDefineSlot, Ref("name"), initSlot);

name                     Outers.call(&DoGet, Ref("name"))

name := newValue         Outers.call(&DoPut, Ref("name"), newValue);

&name                    Outers.call(&DoGetSlot, Ref("name"))
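To show the shape of the naive model these calls fall back on, here is a minimal C++ sketch of a scope object that answers the four requests. It is not the ENative Scope: the Scope class, its method names, the Slot struct, and the choice to throw on an undefined name are all assumptions for illustration, and Ref is again reduced to a plain integer.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Stand-in for a Fat Pointer (assumption).
using Ref = long;

// Hypothetical Slot; real Slots are arbitrary objects made by a slotMaker.
struct Slot {
    Ref value;
};

// Hypothetical scope object: every variable usage is a lookup by name.
class Scope {
public:
    // Counterpart of DoDefineSlot.
    void defineSlot(const std::string& name, std::shared_ptr<Slot> slot) {
        slots_[name] = std::move(slot);
    }
    // Counterpart of DoGetSlot. Throwing on an undefined name is an
    // assumption of this sketch.
    std::shared_ptr<Slot> getSlot(const std::string& name) const {
        auto it = slots_.find(name);
        if (it == slots_.end()) {
            throw std::out_of_range("undefined outer variable: " + name);
        }
        return it->second;
    }
    // Counterpart of DoGet.
    Ref get(const std::string& name) const { return getSlot(name)->value; }
    // Counterpart of DoPut.
    void put(const std::string& name, Ref newValue) {
        getSlot(name)->value = newValue;
    }

private:
    std::map<std::string, std::shared_ptr<Slot>> slots_;
};
```

Every access pays for a name lookup, which is the price of leaving the outer scope outside the compiler's control.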
 
Unless stated otherwise, all text on this page which is either unattributed or by Mark S. Miller is hereby placed in the public domain.