*** incoherent notes here to the end of this file. Do Not Read **** Mozart / OZ
Cardelli, Vijay Sun, NeXT, Kaleida Labs, Waterken Rees' uneval XMLEncoderjavadoc jdj article sun article java.beans.ExpressionJOSS
In JOSS, implementing Serializable declares that one's state should be divulged to all serializers, independent of whether one is transparent in general. We can say that such an object is transparent to all serializers. To be transparent to all serializers without being transparent in general makes sense only if all serializers are only wielded by objects that are universally more trusted than are normal clients. While this may make sense for supporting applets, since we are trying to support mutual suspicion, our idealized model instead only distinguishes transparent and opaque.
Putting together some of these elements into a silly example, if p is a point created by the following code in module foo.bar: var count :int := 0 def makePoint(x :int, y :int) :any { count += 1 def point implements Transparent { # ... required self revelation ... to getX() :int { x } to getY() :int { y } /** How many points have been created? */ to getCount() :int { count } } } def p := makePoint(3, 4) then uneval(p) would yield the string: <import:foo.bar$makePoint__C$point__C>.instatiate([ "x" => 3, "y" => 4, "count__Slot" => settable.makeSlot(1) ]) Given the decision to save data but not code, it seems a bit ironic that the eval technique saves the data by turning it into code. (This is based on the mapping: E module => top level Java class. E behavior (ie, object definition expression) => Java nested class. Since nothing in E maps onto static methods of these nested classes, we are free to specify that each of these nested classes has the static instantiate method assumed above.)
ReplacementJOSS serialization has many customization hooks, but that provided by the replaceObject method seems to be universal. All the other serialization-customizations that concern us can be expressed, perhaps at a loss in efficiency, by overriding only this one method. In the example shown earlier, we specified the null customization by providing the identity function as the replace function. What if we provide a different function? ? pragma.syntax("0.8") ? def makeSerializer := <elib:serial.makeSerializer> # value: <makeSerializer> We define the replace function to be one that replaces an integer with the next higher integer, but acts otherwise as an identity function. ? def sillyReplacer(obj) :any { > if (obj =~ i :int) { > i + 1 > } else { > obj > } > } # value: <sillyReplacer> ? def sillySerializer := makeSerializer(sillyReplacer) # value: <a Serializer> ? def desc := sillySerializer.record([1, 2, 'c', "foo"]); desc.size() # value: 360 As we see, the reconstruction is just like the original except that each integer, no matter where it may appear, has been replaced by its successor. This is specified as a mostly trivial variation of our earlier model of serialization. As a first approximation, whereas previously the serialization cases may have looked like: def uneval(ref) :any { switch (ref) { # ... the above cases using ref ... } } we would instead have def uneval(originalRef) :any { def ref := replace(originalRef) switch (ref) { # ... the above cases using ref ... } } We now need to adjust this to form broader equivalence classes for our notion of "having already serialized" a reference. In a call to uneval(A1), uneval will call its replace function which, let's say, returns ARep, which then gets serialized as expression AData. The serializer then remembers both A1 and ARep as having been seen, and as having produced expression AData. If either are seen again, then a variable is generated as before, but this variable is associated with both references. If A1 is novel but ARep has already been seen, then A1 is added to the equivalence pool for the ARep, and a variable reference is generated. (*** In the case of a previously seen variable, is replace called? Should it be? Replace is called when the original is already transparent. Should it be?) A serializer can thereby be cutomized to serialize graphs containing opaque objects, so long as the replacement for these objects is transparent. We will make much use of this ability. Traversal proceeds using the fields of the transparent replacement. By the above steps, the serializer traverses a virtual graph of transparent replacements. This graph is never represented as an actual graph of objects, and often it cannot be. If replacement ARep's instance variable myB points at original B1, and if the replacement for B1 is BRep, then in the virtual replacement graph ARep's myB points at BRep, even though myB may be of a type incompatible with BRep. it is fine for a painter to depict a scene that's not literally possible, so long as its meant for viewers who know how to interpret this depiction -- so long as corresponding unserializers are customized to reconstruct a possible graph. Cycles vs. ResolutionUnfortunately, this simple story faces a terribly hard problem. When the depicted XRep's myY points at the depicted YRep, then we must initialize the evolvedX's myY to point at the appointed resolution of Y, YRes. So long as there are no cycles, JOSS does exactly this, as it should, by instantiating and resolving objects in child-to-parent order. But when there are cycles, JOSS fails in unacceptable ways:
When building a system for the coexistence of mutually suspicious objects, any spec that says "under these conditions an incorrect result will be returned" must be read as an opportunity for an adversary. Were they to purposely induce an incorrect result, how might they disrupt the assumptions of others in ways they might exploit? These issues are usually hard enough to think about that it is better to fix the spec to remove the opportunity. Unfortunately, the above problem seems unavoidable in Java, since a Java reference cannot exist prior to the existence of the object it designates. If evolvedX is to be initialized with YRes, and evolvedY is to be initialized with XRes, and if a fully initialized evolvedX is going to participate in the calculation of XRes, and if a fully initialized evolvedY is going to participate in the calculation of YRes, there would seem to be no order that would satisfy all these constraints. Our idealization mostly solves the problem by using E's promises to cheat on the notion of "fully initialized". E's promises -- like logic variables in logic programming languages, or futures [ref Actors], or Channels [ref Joule] -- are references which exist, can be passed around, and can be held before it is determined which object they designate. Until their target object is determined, we say that the promise is unresolved. Once the target is determined, the promise is fulfilled and is then equivalent to any normal reference to that target. When the unserializer, walking from child-to-parent, encounters a depicted reference to an ancestor of the current object, it creates a promise to represent that reference, and uses that promise to initialize that field of the evolved object. Later, when the appointed resolution of the ancestor is produced, the promise is resolved to this appointed resolution. Unfortunately, if the field being initialized is declared to require a non-promise type, which can happen in either Java or E, then unserialization will still fail inappropriately with a thrown exception. But this algorithm will never silently return an incorrect graph, which is the important property for security. The remaining problem simply gives an adversary an opportunity to mount a denial of service attack, which E generally does not claim to defend against. |
||||||||||||
Unless stated otherwise, all text on this page which is either unattributed or by Mark S. Miller is hereby placed in the public domain.
|