Last updated:
[98/09/15 Bill]
[98/09/16 Bill] Added a flow control protocol for each TCP stream and a description of the server logic.
[98/09/17 Bill] Added client implementation logic. Added information on configuration for firewall HTTP proxies. Added description of weak and strong authentication for HTTP_Logon to eliminate some kinds of denial of service attack.
[98/09/18 Bill] Eliminated the unauthenticated logon messages, and defined the wire format of the messages.

Author: Bill Frantz (frantz-at-pwpconsult.com).

Introduction

This document describes some ideas for extending the DataComm system to operate through various types of firewall. There are four basic levels of problem:
Related Documents

See New E Data Comm System for information about the E Data Comm System. See DataComm Startup Protocol for information on the start up protocol.

Requirements

The basic requirement is that the E Data Comm system be able to operate through firewalls without special configuration of the firewall. Furthermore, this operation should be possible without the cooperation or permission of the firewall operator.

Architecture

HTTP Tunneling

HTTP Tunneling works by sending POST requests to an "HTTP server" and receiving replies. If the firewall allows us to use HTTP on any port, then we just need the DataComm HTTP Server code. Otherwise, if the machine must also support a real HTTP server, we will use a CGI to redirect the request to the non-port 80 server.

Note that the Java virtual machine is configured to use a firewall proxy with the Java system properties http.proxyHost and http.proxyPort. After this configuration has been set, the URL will use the firewall proxy to contact hosts outside the firewall.

If we can use HTTP/1.1 instead of HTTP/1.0, we may be able to take advantage of the reusable TCP connections which are supported in HTTP/1.1. In 1.1, everyone along the path, including proxies and firewalls, has the option of tearing down the connection after the first round trip, but even if it only helps in some cases, it would be worthwhile.

The POST request can be sent through a URLConnection generated from a URL which specifies a protocol of "http", a host-name, port-number, and a path or cgi reference.

    conn = url.openConnection();
    conn.setDoOutput(true);
    conn.setUseCaches(false);
    conn.setRequestProperty("Content-type", "application/octet-stream");
    inNotifier.deactivate();
    in = null;
    return out = conn.getOutputStream();

When the input stream is first read, the buffered output is sent:

    outNotifier.deactivate();
    out.close();
    out = null;
    // An HTTP error will either show up as an IOException, or it
    // will show up as the error response. If the content type is
    // not "application/octet-stream", then we are dealing with an
    // error response.
    try {
        in = conn.getInputStream();
    } catch (IOException e) {
        throw new IOException("HTTP request failed");
    }
    // Read the content type once so the null check and the
    // comparison look at the same value.
    String contentType = conn.getContentType();
    if (contentType == null
            || !contentType.equals("application/octet-stream")) {
        throw new IOException("HTTP request failed");
    }
    return in;

The request starts out with a fixed header of POST and must include a Content-length: header. To be compliant with HTTP (see RFC 2068), it must also include Content-type:.

    POST <URI-requested> HTTP/1.0\r\n
    Content-type: application/octet-stream\r\n
    "Content-length: " + sizeOfData + "\r\n"

The reply starts out with a fixed 200 OK header, and also includes Content-length: and Content-type:.

    HTTP/1.0 200 OK\r\n

If there is a client error, the reply is a fixed 400 Bad Request header.

    HTTP/1.0 400 Bad Request - <message>\r\n

The headers are followed by the data as a transparent byte stream. The system always sends at least one byte of data (a nul) to support clients that object to zero bytes of data. The receiver must discard any extra data which follows the sizeOfData bytes in the byte stream. The receiver must also skip all the other headers until it reads the blank line (or a line consisting only of line terminators).

DataComm HTTP Server
The DataComm HTTP server process acts as a remote proxy for the firewalled vat client. The proxy supports a listen address where the vat can be contacted, and several TCP links to other vats. The protocol between the client and the proxy server identifies the TCP link with which a set of data is associated.

Note that since normal vat-to-vat authentication and privacy measures are used, the client to proxy link does not need either encryption or authentication. However, some level of authentication would help discourage denial of service attacks.

Since all communication is driven by the client, the proxy needs to be able to time out the client listen address and the TCP links. See also Server Design.

Note on Timeouts: It is possible that a slow link will result in it taking longer for the client to send an HTTP_Session message than the server timeout. If the server can detect that the client has started sending a message, it can then use continued progress in receiving the message as the timeout criterion rather than just receipt of the message. This kind of timeout is straightforward when the client is connected directly to the server. I don't know if it is possible when the client messages are being redirected by a CGI.

Client - Proxy Message Formats

This protocol uses messages formatted with java.io.DataOutputStream. The protocol uses writeUTF(), writeByte(), writeShort() (read with readUnsignedShort()), and write(byte[]) in sending the data. In the descriptions below, the first three are referred to as UTF, byte, and short. The notation "byte[]" is also used. All byte[] parameters are assumed to be preceded by a short giving the length of the byte array.

The HTTP_Logon message includes a list of acceptable protocol version numbers. The versionID described in this document is "T1".

All messages between the client and the proxy are carried in HTTP envelopes as described under HTTP Tunneling.
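As an illustration of the HTTP envelope described under HTTP Tunneling, the following is a minimal sketch, not the DataComm implementation. The class name TunnelEnvelope and its method names are hypothetical. It builds a POST request with the required headers (padding an empty payload to a single nul byte), and reads a reply by checking for the 200 status, skipping headers up to the blank line, and returning exactly Content-length bytes so that any extra trailing data is discarded.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper for illustration; not part of the DataComm sources. */
final class TunnelEnvelope {

    /** Builds the POST header block followed by the payload bytes. */
    static byte[] buildRequest(String uri, byte[] data) throws IOException {
        if (data.length == 0) {
            data = new byte[] {0}; // always send at least one byte (a nul)
        }
        String header = "POST " + uri + " HTTP/1.0\r\n"
                + "Content-type: application/octet-stream\r\n"
                + "Content-length: " + data.length + "\r\n"
                + "\r\n";
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(header.getBytes(StandardCharsets.US_ASCII));
        out.write(data);
        return out.toByteArray();
    }

    /** Reads a reply and returns exactly Content-length bytes of data. */
    static byte[] readReply(InputStream raw) throws IOException {
        DataInputStream in = new DataInputStream(raw);
        String status = readLine(in);
        String[] parts = (status == null) ? new String[0] : status.split(" ", 3);
        if (parts.length < 2 || !parts[1].equals("200")) {
            throw new IOException("HTTP request failed: " + status);
        }
        int length = -1;
        String line;
        // Skip every header until the blank line, remembering Content-length.
        while ((line = readLine(in)) != null && !line.isEmpty()) {
            int colon = line.indexOf(':');
            if (colon > 0
                    && line.substring(0, colon).trim().equalsIgnoreCase("Content-length")) {
                length = Integer.parseInt(line.substring(colon + 1).trim());
            }
        }
        if (length < 0) {
            throw new IOException("Missing Content-length header");
        }
        byte[] data = new byte[length];
        in.readFully(data); // anything after these bytes is ignored
        return data;
    }

    /** Reads one line, dropping the CR and LF; returns null at end of stream. */
    private static String readLine(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = in.read()) != -1 && c != '\n') {
            if (c != '\r') {
                sb.append((char) c);
            }
        }
        return (c == -1 && sb.length() == 0) ? null : sb.toString();
    }
}
```

A real client would normally let URLConnection produce and parse these headers; the sketch only makes the byte layout of the envelope explicit.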
Each of the major message types (HTTP_Logon, HTTP_Session, HTTP_Shutdown, and HTTP_Error) is carried in a separate HTTP interchange.

Message Type codes

All message and response types are single bytes. The assigned values are:

    HTTP_Logon = 0x01;
    HTTP_Session = 0x02;
    HTTP_Shutdown = 0x03;
    HTTP_Error = 0x04;
    HTTP_Logged_On = 0x05;
    HTTP_Set_Server_Nonce = 0x06;

The subtypes of HTTP_Session are assigned the values:

    HTTP_NewConnection = 0x10;
    HTTP_Data = 0x11;
    HTTP_OK_To_Send = 0x12;
    HTTP_Close = 0x13;
    HTTP_InvalidID = 0x14;
    HTTP_ConnectionFailed = 0x15;
    HTTP_ConnectionComplete = 0x16;

Client Authentication

There is a trade-off between server performance and the ability of a hostile user to mount denial of service attacks on the server and its clients. Most of these attacks can be eliminated by authenticating the logon message and using the VatID to control access to the server (for billing or to eliminate bad actors).

The server can check three levels of authentication.

If the server never checks signatures, anyone who knows the vatID and the server URL can deny service to that vatID by sending an HTTP_Logon with that vatID.

If the server checks the clientNonce, it protects against this attack by requiring an attacker to have a signed HTTP_Logon message. However, the message could come from having snooped the vat's communications.

If the server requires that the vat sign a random number provided by the server, and the server saves the last number it issued the client and makes sure the client is returning that number, then the server knows it is communicating with the client. The server should also ensure that the vatID is the hash of the public key.

The server can dynamically decide how much authentication to require. A policy of only checking authentication if the vatID is already logged on seems reasonable.
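The suggested policy of only checking authentication when the vatID is already logged on could be sketched roughly as follows. The names AuthLevel and LogonPolicy are hypothetical and do not come from the DataComm sources, and the level applied to an already logged-on vatID is left as a constructor parameter because the text does not fix it.

```java
import java.util.HashSet;
import java.util.Set;

/** The three levels of checking described in the text (names are hypothetical). */
enum AuthLevel {
    NONE,          // never check signatures
    SIGNED_LOGON,  // check the signature covering the clientNonce
    SERVER_NONCE   // also require the server-issued nonce to be returned
}

/** Hypothetical policy object: authenticate only if the vatID is already logged on. */
final class LogonPolicy {
    private final Set<String> loggedOn = new HashSet<>();
    private final AuthLevel levelWhenLoggedOn;

    LogonPolicy(AuthLevel levelWhenLoggedOn) {
        this.levelWhenLoggedOn = levelWhenLoggedOn;
    }

    /** Decides how much checking an incoming HTTP_Logon for vatID needs. */
    AuthLevel requiredLevel(String vatID) {
        return loggedOn.contains(vatID) ? levelWhenLoggedOn : AuthLevel.NONE;
    }

    void markLoggedOn(String vatID) {
        loggedOn.add(vatID);
    }

    void markLoggedOff(String vatID) {
        loggedOn.remove(vatID);
    }
}
```

With this policy a first logon costs the server no signature check, while an attempt to displace an existing session has to pass whatever level the operator chose.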
Message Descriptions

<byte HTTP_Logon> <UTF VatID> <byte[]serverNonce> <byte[]clientNonce> <byte[]publicKey> <byte[]signature> - Indicates that <VatID> wants to use the server as a proxy. The serverNonce is a random number generated by the server, the clientNonce is a random number generated by the client, the public key is the client's public key, and the signature is the DSA signature over the sequence (as transmitted) <HTTP_Logon> <VatID> <serverNonce> <clientNonce>. The first time the client sends this message, it specifies a zero length serverNonce. The responses are:

    <HTTP_Logged_On> <byte[]sessionID> <UTF listenAddress> - Indicates that the logon is successful and provides a sessionID for the session. The sessionID is sufficiently large (64 bits?) that a hacker who is not tapping the communications between the client and the server cannot easily guess it and interfere with the service. The listenAddress is the host:port the server is using to listen for connections to this vat.

    <HTTP_Set_Server_Nonce> <byte[]serverNonce> - Indicates that the serverNonce in the logon message was missing or invalid. The client should resend the HTTP_Logon message using the serverNonce in this message.

The HTTP_Session message is used in both directions to pass data to the proxied TCP connections, open new TCP connections, respond to new TCP connections and close TCP connections. The HTTP_Session message consists of a header and zero or more data segments. (An HTTP_Session message with zero data segments acts as a Ping/Pong message.) The client must send an HTTP_Session message every n (60?) seconds or the server will shut down the session.

Messages which describe a specific TCP connection use a <ConnectionID> parameter. This parameter is a byte, limiting the maximum number of active proxied TCP sessions to 255. Positive values are assigned by the client for outgoing connections. Negative values are assigned by the server for incoming connections. The value zero is not legal.
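To make the wire format of HTTP_Logon concrete, here is a minimal sketch of encoding it with java.io.DataOutputStream, following the UTF, byte, and length-prefixed byte[] conventions given under Client - Proxy Message Formats. The class LogonCodec and its method names are hypothetical. Producing the DSA signature itself is left out; signedPortion returns the byte sequence the text says the signature covers.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

/** Hypothetical encoder for illustration; not the DataComm implementation. */
final class LogonCodec {
    static final int HTTP_LOGON = 0x01;

    /** Writes a byte[] parameter preceded by a short giving its length. */
    static void writeBytes(DataOutputStream out, byte[] b) throws IOException {
        if (b.length > 0xFFFF) {
            throw new IllegalArgumentException("byte[] too long for a short length");
        }
        out.writeShort(b.length);
        out.write(b);
    }

    /** Reads a byte[] parameter written by writeBytes. */
    static byte[] readBytes(DataInputStream in) throws IOException {
        byte[] b = new byte[in.readUnsignedShort()];
        in.readFully(b);
        return b;
    }

    /** The transmitted sequence <HTTP_Logon> <VatID> <serverNonce> <clientNonce>. */
    static byte[] signedPortion(String vatID, byte[] serverNonce, byte[] clientNonce)
            throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        out.writeByte(HTTP_LOGON);
        out.writeUTF(vatID);
        writeBytes(out, serverNonce); // zero length on the first attempt
        writeBytes(out, clientNonce);
        out.flush();
        return buf.toByteArray();
    }

    /** The complete HTTP_Logon message: signed portion, public key, signature. */
    static byte[] encodeLogon(String vatID, byte[] serverNonce, byte[] clientNonce,
                              byte[] publicKey, byte[] signature) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        out.write(signedPortion(vatID, serverNonce, clientNonce));
        writeBytes(out, publicKey);
        writeBytes(out, signature);
        out.flush();
        return buf.toByteArray();
    }
}
```

Keeping the signed portion in its own method means the signer and the verifier both work from exactly the bytes that go on the wire.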
Each TCP connection has its own flow control. Both the client and server should limit the amount of data they send to a connection to the value in the last HTTP_OK_To_Send message for that connection.

The header is:

    <HTTP_Session> <byte[]sessionID>

Any number of data segments may be included in the message. The legal data segments are:

<HTTP_NewConnection> <byte connectionID> <UTF HostID:port> - Client to server only. Build a TCP connection to the specified host and port, and use connectionID to refer to it in subsequent messages. Responses are not necessarily returned in the same exchange. They are:

    <HTTP_InvalidID> <byte connectionID> - The connectionID passed is invalid because either there is already a connection using that ID, or because the ID has the wrong sign. All connections with that ID are closed.

    <HTTP_ConnectionFailed> <UTF reason> - The connection could not be made. <reason> is a textual message describing the reason for the failure.

    <HTTP_ConnectionComplete> <byte connectionID> - The connection is ready to accept data.

<HTTP_NewConnection> <byte connectionID> <UTF host:port> - Server to client only. A new TCP connection has been established to the server's listen port for this VatID. The host and port are those of the remote end of the TCP connection.

<HTTP_Data> <byte connectionID> <byte[]data> - Indicates data to be passed to/received from the TCP connection.

<HTTP_OK_To_Send> <byte connectionID> <short bytesOfData> - Indicates that the other end may send up to bytesOfData to the connection connectionID.

<HTTP_Close> <byte connectionID> - Closes/indicates the connection has been closed.

<HTTP_Shutdown> <byte[]sessionID> - Ends the session between the client and the server. The server closes all the TCP connections it has open on behalf of the client and stops listening for new connections to the client.
If the shutdown was initiated by the client, the response (server to client) is:

    <HTTP_Shutdown> <byte[]sessionID> - Shutdown complete.

<HTTP_Error> <byte[]sessionID> <UTF reason> - An error occurred on the session and it must be shut down. reason is a textual message describing the error. Possible errors are:

If an HTTP_Error message is received by the server, the response will be an HTTP_Shutdown message.

Off the shelf alternatives

The transport layer of RMI uses similar techniques, but it is not an exposed interface.

Other Design Objectives, Constraints and Assumptions

Current implementation

Server

This server design is a reference implementation. It is designed for clarity, not efficiency. Being written in Java, it uses threads out the yingyang.

HTTPServeMain is the class which contains the main routine. It also listens to the HTTP port.

HTTPServeClientPeer is the class which handles HTTP input and output for a particular client.

HTTPServeClientState is the class which holds the client state between HTTP messages.

None.

Design Proposal

Server

The server waits for E connections on one port and HTTP requests on another. When the server gets an HTTP_Logon message, it builds the necessary data structures to service that vatID, generates a sessionID, and sends the sessionID in the HTTP response. The data structures include:
The basic dataflow logic for various messages is:
New incoming TCP E connection
The server reads the new socket and saves the PROTOCOL_VERSION message (see Comm Connection Startup Protocol). It reads and saves the IWANT message and checks if it is proxying for the requested VatID. If the VatID is not known, it generates a NOTME response and closes the socket. Otherwise it associates the socket with the appropriate HTTP connection and generates three HTTP_Session submessages (HTTP_NewConnection, HTTP_OK_To_Send, and HTTP_Data), which are queued for the HTTP connection.

Incoming data on the TCP E connection
The server reads the data and queues it on the appropriate HTTP connection as an HTTP_Data message.

Incoming close on the TCP E connection
The server queues an HTTP_Close message on the appropriate HTTP connection.

Incoming HTTP message from the client
The data portion of the HTTP POST operation is read and the embedded messages are processed. When they have been processed, the output queue for the HTTP connection is encoded and sent back in the response. Note that the output queue is a FIFO queue to preserve the ordering of events. The specific POST messages are handled as follows:

HTTP_Logon from the HTTP client
If there is already a session in progress for this VatID, the server performs the following checks:

Generate a new sessionID, build the necessary data structures, and queue an HTTP_Logged_On message as the response.

Each subtype is processed as follows:

HTTP_NewConnection from the HTTP client
The server checks the parameters to ensure they are valid. If they are not valid, an error response is queued for the HTTP connection. Otherwise an asynchronous operation is started to build the TCP connection. It will report its success or failure to the HTTP queue when it has finished.

HTTP_Data from the HTTP client
The data is queued for the appropriate TCP connection. When the data has been sent, a new HTTP_OK_To_Send message is queued for the HTTP client.
HTTP_OK_To_Send from the HTTP client
The server updates its send limit for the connection.

HTTP_Close from the HTTP client
The designated socket is closed synchronously.

HTTP_Shutdown from the HTTP client
All TCP connections are closed. An HTTP_Shutdown message is queued and all the queued messages are included in the response. All the data structures associated with the session are discarded.

HTTP_Error from the HTTP client
This message is handled in the same way as an HTTP_Shutdown message.

Client

The client code involves changes to the current DataComm software. There are two obvious versions of the client that can be imagined:
The client that supports both direct and tunnelled connections has a number of problems to solve:
Extend the VatIdentity class to have a getVatTPMgr(URL url) method. The url specifies the HTTP Tunnel server.

Change DataComm to use a SocketFactory to get its Sockets. For direct connections, this factory returns standard system Sockets. For Tunnel connections, a different factory returns Sockets which use the HTTP Tunnel classes for communication.

For incoming connections, the HTTP Tunnel classes can call VatTPMgr.newInboundSocket directly or through a Thunk. The Tunnel classes can directly return the address the server is listening at to the VatTPMgr using the listeningAt(String) method.

The TunnelSocket will respond as follows to the standard Socket methods:
The following methods will be implemented as NOPs sufficient for DataComm's use, or will throw exceptions.
Tunnel and Direct With the above Tunnel Only architecture and some additional changes, there are simple answers to the Tunnel and Direct questions. The changes are to allow more than one socketFactory to be active in the objects under a particular VatTPMgr. The use of multiple factories also allows the vat to listen on more than one interface:port.
Try them all. First try all the search addresses through each direct connection interface. Then try all the URLs registered for Tunnel connections. All the addresses it is listening at. Even if they are not relevant to a particular network, trying to connect to them will fail unless there is a vat with the desired private key listening there. By running under one VatTPMgr, duplicate VatTPConnections will be prevented.
Which directories on our tree does this subsystem cover?

org/erights/e/net/data

Is it JavaDoc'ed?

In many cases, this section can link to JavaDoc output from actual Java classes and interfaces. This saves writing documentation twice (the designers will have to JavaDoc their interfaces anyway). The JavaDoc should be linked into the design document. Chip's JavaDoc style guidelines (XXX file missing) explain how to use JavaDoc effectively.

Examples

Testing and Debugging

See DataComm Testing.

Design Issues
The user uses the Java system properties http.proxyHost and http.proxyPort to set the host and port of the firewall proxy.
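As a small example of that configuration, the properties can be set programmatically before the first URL connection is opened, or passed on the command line as -Dhttp.proxyHost=... and -Dhttp.proxyPort=... The class name ProxyConfig is hypothetical, and the host and port a caller passes in are site-specific values, not anything defined by DataComm.

```java
/** Hypothetical helper for illustration; not part of the DataComm sources. */
final class ProxyConfig {

    /** Sets the JVM-wide HTTP proxy properties named in the text. */
    static void useHttpProxy(String host, int port) {
        System.setProperty("http.proxyHost", host);
        System.setProperty("http.proxyPort", Integer.toString(port));
    }
}
```

Because these are system properties, they apply to every HTTP URLConnection in the virtual machine, not only to the tunnel.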
Unless stated otherwise, all text on this page which is either unattributed or by Mark S. Miller is hereby placed in the public domain.