There have been a variety of attempts at making network programming fully transparent, some of which are ancestors of E (Actors, Concurrent Logic Programming, Concurrent Constraint Programming, Vulcan, Joule). Fully transparent means that any correct program written for a bunch of objects co-existing in a single address space will remain correct when those objects are distributed over a network, and that any correct program written for a bunch of objects distributed over the network will remain correct when the objects in question are thrown together in the same address space. The computational model of a fully transparent system must therefore include only those features that can be adequately supported in both contexts.

Carl Hewitt's seminal paper The Challenge of Open Systems explains the constraints of large scale, mutually suspicious, radically distributed systems. Kahn and Miller's Language Design and Open Systems explains what it means to design fully transparent programming languages that satisfy Hewitt's Open Systems constraints. All the E ancestors listed above satisfy the design principles listed in that paper. At ERights.org, when we refer to distributed computation, unless we state otherwise, assume we are talking about the radical distribution of Hewitt's Open Systems.

A Note on Distributed Computing argues, in effect, that the costs of these restrictions are too high for general purpose distributed computing. We disagree with many of the particular arguments in the paper, we find its conclusions phrased much too strongly, and we find the fully transparent systems listed above more plausible than the paper would suggest. Nevertheless, E gives up on full transparency for the same reasons as what we take to be the paper's main argument: there are compellingly useful features of local (single machine or single address space) computation that are not naturally available for distributed computation. It is too expensive to give up these features in the local case, where they are cheap; and for many of them, it is impossible or prohibitively expensive to support them in the distributed case. In order to support them in both cases, we must introduce a semantic non-uniformity between the two cases. (For a related but different case against full transparency, see Doug Lea's Design for Open Systems in Java.)

However, we can give up on full transparency without giving up all the benefits of transparency. We define semi-transparent network programming as the second half of the above definition: any correct program written for a bunch of objects distributed over the network will remain correct when the objects in question are thrown together in the same address space. This definition implies that the semantics available in the distributed case are a subset of the semantics available locally. E is semi-transparent, and "vat" serves the role of "address space" in the above definition.
The most compelling cost difference between intra-vat programming and distributed programming has to do with synchrony, latency, concurrency, atomicity, and reliability. Among objects in the same vat, the familiar synchronous, sequential call-return programming, in which the caller waits for the callee to return, has the following attributes:
By contrast, distributed inter-object invocation should be based on asynchronous, one-way, non-fully-reliable, pipelined messages:
E handles this set of differences by adding surprisingly few new abstractions to the conventional set of single-address-space sequential pure object programming abstractions. In addition to the conventional NEAR reference, which has all the synchrony and reliability properties available in a single address space, E introduces a reference taxonomy of other reference types, including the EVENTUAL reference, which is the only kind that can span address spaces. In addition to the conventional message passing taxonomy of synchronous call-return control-flow constructs, which E allows only over NEAR references, E's message passing taxonomy introduces the eventual send, which works over both NEAR and EVENTUAL references. The properties of EVENTUAL references reflect the inescapable issues of distributed systems. However, the programmer doesn't need to know whether a reference is NEAR or EVENTUAL. In keeping with the principle of semi-transparency, when the programmer doesn't know, it is always correct to treat a reference as if it were EVENTUAL. You only need to know a reference is NEAR if you want to use some functionality available only on NEAR references -- like synchronous calling.
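As a minimal sketch of the distinction in E itself (the names calc and remoteCalc are illustrative assumptions, not objects from the text):

    # Assume `calc` is NEAR (same vat) and `remoteCalc` is EVENTUAL
    # (a reference obtained from another vat).

    # Immediate call: legal only on a NEAR reference; the caller
    # waits until add() returns.
    def localSum := calc.add(3, 4)

    # Eventual send: legal on both NEAR and EVENTUAL references;
    # it queues the message and immediately returns a promise.
    def remoteSum := remoteCalc <- add(3, 4)

Code that uses only the eventual send, as in the last line, works unchanged whether the reference turns out to be NEAR or EVENTUAL, which is exactly the semi-transparency guarantee described above.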
At first, these last two pictures may look identical except for the color change. But notice the reversal of the outcome arrow. As explained in Message Passing, the lightning bolt is the stack-frame in Alice in which she emitted a message to Bob. On the left, the arrow goes from the continuation part of the message to the stack-frame, implying that the stack-frame waits for a response to be sent to it along this arrow. On the right, nothing points at the sending stack-frame, so it continues after emitting the message. Moreover, this stack-frame continues holding a promise for the outcome of the message -- the tail of the arrow -- even though the outcome itself hasn't been determined yet. The arrowhead with the gray halo is the ability to determine the resolution of the promise. It serves in the continuation-role of the message delivered to Bob. Messages sent on the arrowtail of a reference always move towards the arrowhead, even while the arrowhead is en route. This is the basis for E's Promise Pipelining. When Bob finishes reacting to a message, he reports an outcome to the message's continuation. On the left, Alice's stack-frame resumes by reacting to Bob's report. On the right, Bob's report resolves the promise that Alice holds -- the outcome of processing the message becomes the object designated by the reference.
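To make the right-hand picture concrete, here is a small E sketch (the object and message names are illustrative assumptions): Alice eventually-sends to Bob, immediately receives a promise, pipelines a further message to that promise before it resolves, and registers a when-catch block that runs once Bob's report resolves the promise.

    # Eventual send to Bob; carolPromise is a promise for the outcome.
    def carolPromise := bob <- getCarol()

    # Promise pipelining: send to the promise before it resolves.
    # The message moves toward the arrowhead, wherever it ends up.
    def answerPromise := carolPromise <- query()

    # React to the resolution; Alice's vat never blocks waiting for it.
    when (answerPromise) -> done(answer) {
        println(`Carol answered: $answer`)
    } catch problem {
        println(`The request broke: $problem`)
    }

Nothing in this sketch waits for the network: each send returns immediately with a promise, and the when-catch block is simply another message delivered to Alice's vat once the promise is resolved or broken.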
Unless stated otherwise, all text on this page which is either unattributed or by Mark S. Miller is hereby placed in the public domain.