Freenet Protocol 1.0 Specification Lee Daniel Crocker lee@piclab.com Ian Clarke i.clarke@dynamicblue.com Brandon Wiley blanu@uts.cc.utexas.edu Oskar Sandberg wd98-osa@nada.kth.se Hal Finney hal@finney.org Public Domain NOTE: This document is inaccurate and incomplete. It is a work in progress, and many details here have yet to reach general consensus among the developers. Do not rely on this document to create software implementing this protocol without contacting the project developers for the latest information. An older (and also incomplete) specification is available here. The Java code should be considered the best documentation for the behavior of nodes, even though it doesn't quite comply with the message format specified here. This document describes Freenet, a networking protocol created by Ian Clarke based on his 1999 paper A Distributed Decentralised Information Storage and Retrieval System. Freenet is designed to allow storage and retrieval of documents in a manner that is resistant to censorship, to facilitate the free exchange of ideas in environments that might otherwise be hostile to those ideas. Further information about Freenet and the project to implement such a network can be found at freenet.sourceforge.net. Freenet protocol includes a message packet format in which all communication between hosts is encoded, a set of standard behaviors expected of hosts that receive these messages, and methods for transporting these messages. Introduction Organization of This Document Section 1 of this specification describes the goals of Freenet protocol, the rationale for its creation, its features, and some general ideas behind its implementation. This is not intended to be normative, but rather to explain the motivation behind these features to aid in understanding the protocol itself. Section 2 describes the layout of message packets sent between hosts (called nodes) of a Freenet network. 
All communication in this protocol is encapsulated in these messages. This section is solely a description of syntax, and does not specify the semantic content of these messages or the behavior of nodes that send and receive them. Section 3 describes in more detail the behavior of Freenet nodes necessary for a functioning network. Most of this behavior is implemented by sending messages to other nodes in the syntax described in section 2. This section describes the content and meaning of those messages, as well as other behaviors required of functioning nodes. Section 4 details the behavior of each specific message type sent and received by Freenet nodes. Section 5 describes the methods by which nodes transmit messages to each other. Section 6 describes the keys by which Freenet documents are stored and retrieved, and the closeness algorithm nodes use to route requests for these documents. Finally, section 7 suggests some standard ways to perform functions many Freenet client programs may need that aren't part of Freenet protocol proper. Rationale The creators of Freenet are concerned that present methods of information sharing on the Internet suffer problems stemming from overly centralized control: problems such as bottlenecks created by narrow supply channels serving broad demand, contention over name space, and government censorship. Much of this centralization is built into the popular protocols such as DNS (Domain Name Service), HTTP (Hypertext Transport Protocol), and others. For this reason, we feel it is necessary to propose new protocols that can help overcome some of these problems. The protocol described here was designed with these goals in mind: Anonymity for both publishers and consumers of information. Adaptive optimization for efficient caching and delivery of information. Resistance to flooding and other denial of service attacks. No central control of any critical function of the network. 
The protocol was explicitly not designed to accommodate reliable or permanent storage of information. Overview The end goal of a Freenet network is to store Documents and allow them to be retrieved later by Key, much the same as is now possible with protocols such as HTTP. The network is implemented as a number of Nodes that pass Messages among themselves peer-to-peer. Typically, a host computer on the network will run the software that acts as a node, and it will connect to other hosts running that same software to form a large distributed network of peer nodes. Certain nodes will be end user nodes, from which documents will be requested and presented to the human user. But these nodes communicate with each other and with intermediate routing nodes identically — there are no dedicated clients or servers on the network. Freenet protocol is intended to be implemented on a network with a complex network topology, much like IP (Internet Protocol). Each node knows only about some number of other nodes that it can reach directly (its conceptual "neighbors"), but any node can be a neighbor to any other; there is no hierarchy or other structure. Each document (or other message such as a document request) in Freenet is routed through the network by passing from neighbor to neighbor until reaching its destination. As each node passes a document to its neighbor, it does not know or care whether its neighbor is just another routing node forwarding information on behalf of another, whether it is the source of the document being passed, or whether it is a user node that will present the document to an end user. This is intentional, so that anonymity of both users and publishers can be protected. Each node maintains a data store containing documents associated with keys. With each document it also stores the address of another node where that document came from (and possibly some limited metadata about the document). 
In addition it may have some keys for documents that have been deleted (due to lack of use, memory limits, etc.), but in that case it also retains a pointer to another node that may still have the data. To find a document in the network given a key, a user sends a message to a node (probably one running on the same machine as the client program) requesting the document, providing it with the key. If no matching key is present in the local data store, the node then finds the "closest" key it does have (where "close" is defined appropriately as specified later) and forwards the request to the node associated with that key, remembering that it has done so. The node to which the request was forwarded repeats the process until either the key is found or the hop limit is reached, which indicates failure. Along the route, if a node is visited more than once (and it will know this because it remembered forwarding the request the first time) then that node cuts off the loop by sending a message to the node that sent it the second request telling it to start looking at the node associated with the next-closest data item, the next-next-closest, and so on. Eventually either the document is found or the hop limit is exceeded, at which point the node sends back a reply that works its way back to the originator along the route specified by the intermediate nodes' records of pending requests. The intermediate nodes may choose to cache the document being delivered to optimize later requests for it. Essentially the same path-finding process is used to insert a document into the network, the document being stored at each node along the path. When a new node starts up, a "dummy" entry with a random key is inserted into its data store for each other node it knows about. Then when an initial request enters the node, one or more of these dummy entries will be the closest. This will determine which other node requests are forwarded to first. 
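The lookup procedure described above can be sketched in a few lines. This is an illustrative in-process simulation only, not the reference implementation: the Node class, the REROUTE and TIMEOUT sentinels, and the distance function are hypothetical names, keys are modeled as plain integers, and closeness is assumed to be absolute numeric difference (the protocol's actual closeness algorithm is specified later in this document). Treating a node with no remaining references the same way as a loop is also an assumption.

```python
REROUTE = object()  # loop or dead end: the caller should try its next-closest key
TIMEOUT = object()  # hop limit exhausted: the search has failed


def distance(a, b):
    # Assumption: placeholder closeness measure, not the protocol's own.
    return abs(a - b)


class Node:
    def __init__(self):
        self.documents = {}   # key -> document bytes held locally
        self.references = {}  # key -> Node believed to hold that key
        self.pending = set()  # UniqueIDs of requests already seen

    def request(self, unique_id, key, hops_to_live):
        """Return the document, REROUTE, or TIMEOUT."""
        if unique_id in self.pending:
            return REROUTE  # visited before: cut off the loop
        self.pending.add(unique_id)
        if key in self.documents:
            return self.documents[key]
        if hops_to_live <= 1:
            return TIMEOUT
        # Try the closest known key first, then the next-closest, and so on.
        for ref_key in sorted(self.references, key=lambda k: distance(k, key)):
            neighbor = self.references[ref_key]
            result = neighbor.request(unique_id, key, hops_to_live - 1)
            if result is REROUTE:
                continue
            if result is not TIMEOUT:
                self.documents[key] = result  # optional caching on the way back
            return result
        return REROUTE  # nothing left to try here
```

In this sketch a request that runs into a loop simply falls through to the next-closest reference, and a successful reply is cached by every node on the return path.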
Initially, then, each node has a purely random set of keys for every other node that it knows about. This means that the nodes to which it sends a given data item will depend entirely on what these random keys are. But since different nodes use different random keys, each node will initially disagree about where to look for or send data, given a key. The data in a newly-started Freenet will therefore be distributed somewhat randomly. As more documents are inserted by the same node, they will begin to cluster with data items whose keys are similar, because the same routing rules are used for all of them. More importantly, as data items and requests from different nodes "cross paths", they will begin to share clustering information as well. The result is that the network should self-organize into a distributed, clustered structure where nodes tend to hold data items that are close together in key space. There will probably be multiple such clusters throughout the network, any given document being replicated numerous times, depending on how much it is used. This is a kind of "spontaneous symmetry breaking", in which an initially symmetric state (all nodes being the same, with random initial keys for each other) leads to a highly asymmetric situation, with nodes coming to specialize in data that has certain closely related keys. There are forces which tend to cause clustering (shared closeness data spreads throughout the network), and forces that tend to break up clusters (local caching of commonly used data). These forces will be different depending on how often data is used, so that seldom-used data will tend to be on just a few nodes which specialize in providing that data, and frequently used items will be spread widely throughout the network. One thing to keep in mind is that keys are presently hashes, hence there is no notion of semantic closeness when speaking of key closeness. 
Therefore there will be no correlation between key closeness and similar popularity of data as there might be if keys did exhibit some semantic meaning, thus avoiding bottlenecks caused by popular subjects. Message Packet Format All information transported among nodes in a Freenet network is packaged into messages structured as detailed here. IMPORTANT: Examples in this section use names and values to illustrate the packet format that may or may not be legal names and values for a real packet in any version of the protocol as specified in following sections of this document. This section is normative only in describing the structure of message packets, not their content or meaning. Organization Freenet is a binary protocol, in that every octet sent over the wire is used to decode the message packet, and documents of arbitrary binary content can be packaged without encoding. An 8-bit clean communications channel is required, and the channel must not perform any operations such as removing extra spaces, converting tabs, or converting character sets. However, the layout of Freenet messages is designed to make them human-readable and compatible with text-based tools on common systems, especially for messages that do not include documents or that include documents which are plain text. Here's a sample message with a plain text document:

    Reply.Data
    UniqueID=C24300FB7BEA06E3
    Depth=10
    HopsToLive=54
    Source=tcp/127.0.0.1:2386
    DataSource=tcp/192.235.53.175:5822
    Storable.InfoLength=0
    DataLength=131
    Data
    'Twas brillig, and the slithy toves
    Did gyre and gimble in the wabe:
    All mimsy were the borogoves
    And the mome raths outgrabe.

A Freenet message is a sequence of octets broken into three variable-length sections: A Message Type Identifier that identifies the general class of the message. A collection of Header Fields that provide routing information, metadata, and other details. 
A Trailing Field that optionally contains the payload of the message (for those messages that carry documents or other payload). The message type identifier and header field sections of the message are composed of text lines, each a sequence of octets in the range 0x20 to 0xFF, terminated by a line end sequence composed of an optional octet of value 0x0D (ASCII CR) and a required final octet of value 0x0A (ASCII LF). The three sections are separated as follows: The first text line of the message (i.e., the octets beginning with the first up to but not including the first line end sequence) is the message type identifier. In the example above, this is the string Reply.Data. After the first text line of the message, all text lines up to but not including the first text line that does not contain the ASCII = character (octet value 0x3D) are header fields. In the example above, these are the lines from UniqueID=C24300FB7BEA06E3 to DataLength=131, inclusive. The text line after the header fields, and all octets following it, comprise the trailing field. In the example above, this is the text line Data and all the following bytes (which happen to be text lines from the first stanza of Jabberwocky). Message Type Identifiers The message type identifier is structured as a sequence of one or more Class Identifiers separated by the . character (octet value 0x2E). Each class identifier must begin with an alphabetic character ([ 'A'..'Z', 'a'..'z' ]) and contain only alphabetic and numeric ([ '0'..'9' ]) characters. The . character may not appear at either the beginning or end, and no more than one may appear in sequence. The line end sequence that terminates this line is not part of the message type identifier. This structure represents a class hierarchy of message types. Header Fields Each header field line begins with a Field Identifier, followed by the = character (octet value 0x3D) and a sequence of octets representing the value of the field. 
In the example above, the field named Source has the value tcp/127.0.0.1:2386. Whitespace may not appear anywhere before the first = character, and any whitespace after that is significant, forming part of the field value. The line end sequence that terminates the line is not part of the field value. If the = character is followed immediately by the line end sequence, the value of the associated field will be the zero-length string. Field identifiers have the same structure as message type identifiers: a dot-separated sequence of identifiers, representing a class hierarchy of field types. Only one header field of any type may be present in the header, but it is acceptable for multiple sibling header fields to share parent supertypes. For example, a message containing the lines Count=5 and Count=7 is malformed, as is a message containing the lines Count=5 and Count.Extended=12. A message containing the fields Storable.PublicKey and Storable.Expires is valid. The octets making up the field value are interpreted as UTF-8, with the resulting field value therefore a Unicode string with code points in the range 0x0020 to 0xFFFF. The header section of a message ends when a text line is encountered that does not contain the = character (this line will be either Data or EndMessage). Data and EndMessage may not be used as header field names, but names that would be a subclass of one of these may be used. For example, the line Data=34 may not appear among the header fields, but the line Data.Encryption=Twofish may. There are other restrictions on the legal names and values of header fields based on their semantics as described in later sections of this document. Trailing Field/Payload If the text line which ends the header section is the string EndMessage, then the message as a whole ends at that point and there is no other data present in the trailing field. 
If the string is Data, then the trailing field contains a payload, whose length must have been specified by the value of the DataLength header field. The value of DataLength must be ASCII decimal digits representing a decimal integer in the range 1 to 9223372036854775807. The number of octets specified by this value will be read immediately after the terminating sequence of the Data line. There are no restrictions on the value of octets to be found in the data, and they must not be interpreted or modified in any way by the transport mechanism. It is an error to encounter an unexpected end-of-file or similar condition while reading any part of the message packet. Some transport methods may send multiple concatenated messages to a node over the same connection. In this case, the second message is assumed to begin immediately after the last octet of the trailing field of the first message has been read (if there was no data in the trailing field, this will be immediately after the line termination sequence of the EndMessage line). Nodes should not look for these extra messages unless the transport method specifies that they should (for example, by setting a keepalive option earlier in the message header). Formal Grammar Here is a more precise grammar for the message packet format described above:

    OCTET         = < 0x00..0xFF >
    CHAR          = < 0x20..0xFF >
    ALPHA         = < 0x41..0x5A, 0x61..0x7A >
    DIGIT         = < 0x30..0x39 >
    CR            = < 0x0D >
    LF            = < 0x0A >
    NewLine       = [CR] LF
    SubFieldName  = ALPHA *(ALPHA | DIGIT)
    FieldName     = SubFieldName *("." SubFieldName)
    FieldValue    = *CHAR
    MessageType   = FieldName NewLine
    HeaderField   = FieldName "=" FieldValue NewLine
    EndMessage    = "EndMessage" NewLine
    TrailingName  = "Data" NewLine
    Payload       = *OCTET
    MessagePacket = MessageType 1*HeaderField (EndMessage | (TrailingName Payload))

General Node Behavior In a functioning Freenet network, each node communicates with others by means of messages formatted as detailed in the previous section; but this alone is not sufficient for the node to be a functioning part of a Freenet network. The content of the messages, their meaning, and the specific behavior of nodes that send and receive them must also conform to the standard behaviors specified in this section. We'll begin by describing the general content and meaning of Freenet messages, and the general behavior of Freenet nodes when sending and receiving them. It is important to stress, however, that the behavior of a node receiving a message is dependent upon the specific message type and the circumstances of its transmission, as described in the following sections. Message Contents As described above, messages contain a type identifier, a collection of header fields, and a trailing field with optional payload. The behavior of nodes sending or receiving messages depends only upon the first two: nodes do not interpret, parse, or in any way act upon the payload of a message other than to store it and/or pass it on to other nodes. Transport Options Before a received message is further acted upon by a node receiving it, the code implementing the message transport mechanism may look at the header fields of the message for header fields named TransportOption or any subclass thereof. These header fields are interpreted by the message transport code to modify its behavior, after which they must be deleted from the message before dispatching it to further processing by the node. 
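As an illustration of the packet format from the previous section together with the transport-option rule just stated, here is a minimal parsing sketch. The function names (read_line, parse_message, split_transport_options) are hypothetical and not taken from any Freenet implementation. The sketch detects only exact duplicate field names, not the parent/child conflicts the specification also forbids, and it does not enforce the numeric range of DataLength.

```python
import io
import re

# FieldName per the grammar: dot-separated identifiers, each starting with a letter.
FIELD_NAME = re.compile(rb"[A-Za-z][A-Za-z0-9]*(\.[A-Za-z][A-Za-z0-9]*)*")


def read_line(stream):
    """Read one text line, stripping the [CR] LF terminator."""
    line = stream.readline()
    if not line.endswith(b"\n"):
        raise ValueError("unexpected end of input")
    line = line[:-2] if line.endswith(b"\r\n") else line[:-1]
    if any(octet < 0x20 for octet in line):
        raise ValueError("control octet in text line")
    return line


def parse_message(stream):
    """Parse one message from a binary stream: (type, headers, payload)."""
    msg_type = read_line(stream)
    if not FIELD_NAME.fullmatch(msg_type):
        raise ValueError("malformed message type identifier")
    headers = {}
    while True:
        line = read_line(stream)
        if b"=" not in line:
            break  # first line without "=" ends the header section
        name, value = line.split(b"=", 1)
        if not FIELD_NAME.fullmatch(name):
            raise ValueError("malformed field identifier")
        key = name.decode("ascii")
        if key in headers:
            raise ValueError("duplicate header field: " + key)
        headers[key] = value.decode("utf-8")
    payload = b""
    if line == b"Data":
        length_text = headers.get("DataLength", "")
        if not re.fullmatch(r"[0-9]+", length_text):
            raise ValueError("missing or malformed DataLength")
        payload = stream.read(int(length_text))
        if len(payload) != int(length_text):
            raise ValueError("unexpected end of input in payload")
    elif line != b"EndMessage":
        raise ValueError("header section must end with Data or EndMessage")
    return msg_type.decode("ascii"), headers, payload


def split_transport_options(headers):
    """Separate TransportOption fields (and subclasses) from the rest."""
    options = {k: v for k, v in headers.items()
               if k == "TransportOption" or k.startswith("TransportOption.")}
    remaining = {k: v for k, v in headers.items() if k not in options}
    return options, remaining
```

The transport layer would act on the first dictionary returned by split_transport_options and hand only the second one to the type-specific message handler.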
It is permissible for transport options set in one message to apply to the transmission of later messages over the same connection. This behavior, if desired, must be clearly specified by the particular transport option. Message Types The message type identifier at the start of a message specifies the intended purpose of the message. Upon receiving any message (after transport options, if any, are interpreted and removed), a node must use this to dispatch the message to appropriate handling routines. All further processing of the message takes place in the context of the message type. In particular, the behavior of header fields (even those common to all messages) may be modified by a specific type to suit its purposes, so a node may not act on any of the header fields in a message before dispatching it to a type-specific handling routine (though these routines will likely call shared code to handle them). The sequence of class identifiers that make up a message type identifier represents a class hierarchy of message types. A node that receives a message of a type it does not fully understand should attempt nevertheless to treat it as a message of the most specific superclass it does understand. For example, if a node receives a message of (theoretical) type Request.Insert.ContentHash.Smith1 and it does not have specific code to handle that message type, it should behave as if the message were of type Request.Insert.ContentHash if it understands that, or even as type Request.Insert if that is all it knows about. Some conceivable message types like Request are not valid by themselves, even though they have valid subclasses. For historical reasons, some message types such as Reply.Data have aliases like DataReply. It is important that nodes treat these identically, even if subclassed. For example, a node receiving a message of type DataReply.Encrypted must treat it exactly as it would a message of type Reply.Data.Encrypted. 
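The superclass fallback and alias rules above might be implemented along the following lines. This is a sketch under stated assumptions: the ALIASES table, canonical_type, and find_handler are hypothetical names, the handler registry is an ordinary dictionary, and aliases are assumed to replace only the leading component of a type identifier.

```python
# Historical aliases mapped to their canonical class paths.
ALIASES = {
    "DataReply": "Reply.Data",
    "HandshakeRequest": "Request.Handshake",
}


def canonical_type(msg_type):
    """Rewrite a leading alias so DataReply.X is treated as Reply.Data.X."""
    head, _, tail = msg_type.partition(".")
    base = ALIASES.get(head)
    if base is None:
        return msg_type
    return base + "." + tail if tail else base


def find_handler(msg_type, handlers):
    """Return the handler for the most specific known supertype, or None."""
    parts = canonical_type(msg_type).split(".")
    while parts:
        handler = handlers.get(".".join(parts))
        if handler is not None:
            return handler
        parts.pop()  # fall back to the parent class
    return None  # no supertype is understood at all
```

Because a bare type such as Request is simply left out of the registry, a message of that type finds no handler even though its subclasses do.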
It is important when extending the protocol with a new message subtype to do so in such a way that the behavior of older nodes still makes sense (even though it may not take advantage of features for which the new subclass was made). Completely unknown message types (those for which the node does not recognize any supertype) should generate error replies (messages of type Error.Unsupported) to advise the sending node that it has not been understood (how to send such a reply will be detailed shortly). Header Fields Header fields generally contain data relevant to the processing of a message by nodes on the network (such as routing information), or information about the document in the payload of a message (such as its content type). While header fields are interpreted in the context of the message type in which they are found, it is expected that the message handling routines will call shared routines to handle them in most cases (possibly doing some message-specific processing before or after calling the shared routine). Header field names represent a class hierarchy similar to that for message types. Nodes should attempt to properly handle header fields that have been subclassed from those they expect to find in messages. For example, if a node (or more specifically, the message-handling code for a specific message type) encounters the header field line SearchKey.Signed.DSA1=F857A709BB5310A1, and it does not have code to specifically address that field type, then it should fall back to treating the field as if it were of type SearchKey.Signed, or as a last resort type SearchKey. It is an error for a type-matching ambiguity to exist. If some function of a node expects to find a specific header field like Storable.PublicKey, there must be only one header field with that type or any subtype present. 
The constraint on duplicate header fields helps ensure this, but is not by itself sufficient; care must be taken in defining expected header fields so that such ambiguities don't occur. It will not cause ambiguities, for example, for that field to be present only as the subtype Storable.PublicKey.RSA2, and for an expected Storable.Expires to also be present. (Storable by itself is not an expected header field and cannot be for that reason). But it would be an error for there to be both a Storable.PublicKey.RSA2 and a Storable.PublicKey.RSA1 when any function of a node expected to find Storable.PublicKey. Each message type will specify how it handles header fields it does not recognize at all, even at the highest superclass level. Sometimes they are quietly ignored or passed on, and other times they may generate errors. But in no case may a node refuse to recognize a header field that is unique and a subclass of one it does recognize and expect. If a certain node function would normally store the name and value of a header field for later retrieval, it must similarly store the full name and value of a subclassed field and retrieve its full name. For example, if the value of header field Storable.Expires is usually stored along with a document in a node's data store, then the value of the field Storable.Expires.WithSelfDestruct must be stored the same way, and the full field type must be remembered so that when it is later retrieved it is reconstructed as the original Storable.Expires.WithSelfDestruct. The octets making up a header field value must be valid UTF-8, representing Unicode code points in the range 0x0020 to 0xFFFF. Any function of a node that requires information not easily representable by that encoding must specify the method by which the information is encoded into acceptable UTF-8. 
The code that parses the message header must be able to put the header field value into a Unicode string, after which the field handling code may interpret that Unicode as it needs. Subclasses of fields that use such non-standard encodings should not change the encoding they use in any way that would generate errors in code that only understands the superclass. UniqueID This field serves to uniquely identify groups of related messages; for example, requests for data and replies fulfilling them. It is an ASCII hexadecimal representation of a 64-bit integer chosen randomly by the first message's creator. Nodes generally copy this value from a request message into any replies or forwards they compose. HopsToLive A 64-bit non-negative integer that is decremented every time a message is forwarded from one node to another. It is used to prevent messages being forwarded from node to node indefinitely, and also to determine when a search for information should be given up as unsuccessful. Nodes forwarding a message will decrement this value before passing the message on to the next node. The initial value will be chosen based upon the type of message and the size and topology of the network. The specification will contain recommendations for initial values in the description of each message type. Newly composed messages in reply to a received message will either generate their own HopsToLive value, inherit the decremented value of the message to which they are replying, or inherit the non-decremented value. Extreme care should be taken in this last case that this is only done when explicitly required by the specification of the reply's message type. The specification must take care that it never becomes possible to do this in such a way as to create an infinite loop of forwarded messages. An alternate method for decrementing the value is as follows: If a node receives a message with a HopsToLive count of 2 or more, it must subtract 1 from that value. 
If it receives a message with the value of 1, it must decrement the value to 0 with a probability of 60%, and leave it at 1 with a probability of 40%. If it decrements the value to 0, that action triggers the node to compose a response indicating the exhaustion of the hop count (as specified by the type of message). If it chooses to leave the value at 1, it should behave exactly as if the message were received with a count of 2 and then decremented. It is an error to receive or transmit a message with a HopsToLive of 0. The random-decrement above is intended to make it difficult for a message-sender to pinpoint the exact moment at which a message will exhaust its hop count, which in turn makes it more difficult to pinpoint where a document is stored. It may add a few extra hops to any forwarded request, but few enough that it should not overly affect performance. Depth Depth is used to track how far away from its origin a message has travelled. It is in some ways the reverse of HopsToLive, in that it is a 64-bit non-negative integer that begins at 1 and is incremented each time a message is forwarded. The primary difference is in how loopbacks are treated. When a request being forwarded around the net reaches a node that it has already visited, that node sends a reply advising the immediately previous node of this condition, and that node chooses somewhere else to forward the original request. At each of these two hops, HopsToLive is modified, but Depth is not. For reasons similar to those mentioned in the description of HopsToLive, a similar randomizing method can be used when incrementing the value. Specifically, if a node receives a Depth value of 1, it should increment it to 2 with a probability of 60%, and leave it at 1 with a probability of 40%. If it receives a value of 2 or more, it must always increment. 
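The two randomized rules (the alternate HopsToLive decrement and the corresponding Depth increment) can be written compactly. The sketch below is illustrative only: the function names are hypothetical, and the rng parameter is an assumption added so that the random choice can be replaced in tests.

```python
import random


def decrement_hops_to_live(value, rng=random):
    """Apply the alternate (randomized) HopsToLive decrement.

    A result of 0 means the hop count is exhausted and the node must
    compose the appropriate response instead of forwarding.
    """
    if value < 1:
        raise ValueError("HopsToLive of 0 must never be received")
    if value >= 2:
        return value - 1
    # value == 1: drop to 0 with probability 60%, stay at 1 with probability 40%
    return 0 if rng.random() < 0.6 else 1


def increment_depth(value, rng=random):
    """Apply the randomized Depth increment."""
    if value < 1:
        raise ValueError("Depth begins at 1")
    if value >= 2:
        return value + 1
    # value == 1: move to 2 with probability 60%, stay at 1 with probability 40%
    return 2 if rng.random() < 0.6 else 1
```

Only values of exactly 1 are randomized; every other value changes deterministically.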
When this method is used the final depth may be a few hops too short, so if this value is used to estimate the length of a return trip, this must be accommodated. Source A string indicating the delivery method and immediate source of the message (not necessarily the original source). When forwarding or composing a reply to a request message, nodes will generally put their own address into this field. A client program creating a request message will fill in the address and method by which it wishes to receive replies. The delivery method is identified by the part of the value before the first forward slash character (see the Message Delivery Methods section of this document below). After the slash character, the remainder of the string specifies information needed to identify a node using that delivery method. Data Store Nodes that store data (and it is strongly recommended that all nodes do so if they can) must maintain a store of documents and associated data necessary to provide that data in response to requests. With each document, the node stores the address of the node from which that document was received, its key, and any metadata it feels would be appropriate to store. A node cannot, and is not expected to, retain every document ever stored in it, so it must have some means of retiring documents. A suggested method is for the data store to retain with each document information about its pattern of access, so it can give preference to documents that are frequently requested. User-specified metadata like expiration dates can also be used. When documents are retired from the data store, a node may still retain a record of its key and source address to facilitate later requests for it or for other documents. Pending Message Store Because much of the message traffic on a Freenet network is in the form of requests and replies, a node must maintain a store of its pending requests so that it can associate them with arriving replies. 
This store is indexed by the UniqueID header field value of the request message, and contains whatever data the node feels is necessary to properly deal with incoming replies (as determined by the nature of the particular request). Composing Messages Messages are composed either by nodes (in reply to other messages, or in response to external conditions) or by client programs wishing to communicate with nodes. Composing a message requires knowing the purpose of the message being composed to determine the message type identifier, and the details of what that message type requires so that header fields (and other parts of the message) can be filled in. Header fields in a message are interpreted in the context of the message type; that is, a received message will be dispatched to a handler based only on its type, then that handler will interpret any header fields it needs, possibly calling shared code that encapsulates behaviors common to more than one message type. For this reason, it is necessary to be familiar with the specific message type in order to correctly fill in all the fields when composing a new message. Code that composes messages should therefore also be organized by message type, calling shared routines where appropriate for commonly-shared features of many message types. As with message-receiving code, message-composing code may call shared routines that know about more subclasses of a header field than the message-composing code knows about. For example, if code to create messages of type Request.Data calls a routine to compose an appropriate header field of type Source, that routine may choose to create a header field of type Source.URI instead. Replying to Messages Many times, a node receiving a message will compose a new message in reply. The general rules that apply to this situation (but which may be overridden by specific message types) are the following: The new message inherits the UniqueID value of the message to which it is replying. 
The HopsToLive field is either inherited from the message being replied to, or else a new value is generated, depending on the message type. Except as noted above, the reply is a newly composed message in its own right, and should not copy any of the header field values from the message being replied to unless the message type specifically requires it. In particular, the Source field is given the value of the current node's address and transport method, and header fields understood by the lower-level transport method (those of class TransportOption and its subclasses) are given values appropriate for the new connection. The reply is sent to the node specified in the Source header field, using the transport mechanism specified there, unless some lower-level message transport option specifically overrides this behavior. Forwarding Messages When a message is forwarded from one node to another, the Source field is replaced with the present node's address and transport method as usual, and all other header fields in the message are passed on exactly as they were received, with a few exceptions: The HopsToLive value is decremented as specified above. If this decrements the value to 0, the message must not be forwarded, but rather a reply must be generated to indicate this condition as specified by the message type. If, in the judgment of a receiving node, the value of HopsToLive it receives is unreasonably large for the nature of the network, the node is free to reduce it to a more appropriate value. The Depth value is incremented as specified above. Header fields understood by the lower-level transport method are changed to reflect the new connection (for example, TransportOption.Keepalive should reflect the connection between the forwarding node and the new destination, regardless of the status of the connection which sent the original message being forwarded). Any header fields of type Transient or its subclasses should be removed and not forwarded.
All other header fields must be forwarded as they are found. Behavior of Specific Message Types This section describes in detail the purpose of each message type, the expected behavior of a node sending the message, and the expected behavior of a node receiving the message. <literal>Request.Handshake</literal> (formerly <literal>HandshakeRequest</literal>) This message is initiated by a node immediately after connecting to one of its neighbors, and is used to verify that the node is active, to query its protocol version, and to set up transport options for later communications over the same channel. Sending The node initiating this message needs to fill in the following header fields as specified: UniqueID Initialize with a newly-generated random 64-bit integer, encoded in hexadecimal. Since this message is never forwarded, and thus always identifies its originator, it is not important for this value to be a cryptographically secure random value. Implementations may wish to optimize by using a cheaply-computed pseudo-random number here and save the expense of secure random number generation for more security-critical message types such as document requests and inserts. Source Initialize with a string encoding the transport method and address of the sending node, as specified in the Message Delivery Methods section of this specification. HopsToLive Initialize with the value 1. This message will never be forwarded beyond the first receiving node. Depth Initialize with the value 1. It is expected that the node to which this message is sent will reply immediately, so the sending node should take advantage of this by using TransportOption.Keepalive=true or similar features of the transport method (see the Message Delivery Methods section below) to optimize turnaround time. Receiving The node receiving this message should reply by composing a Reply.Handshake message inheriting the UniqueID of the request.
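The header fields listed under Sending above can be sketched in Python as follows. This is a minimal illustration, not the project's implementation: the function names compose_handshake_request and serialize are hypothetical, and the wire layout (type name first, then one Name=value pair per line, then a closing EndMessage line) is an assumption based on the sample dialogs in Appendix A rather than on a normative definition.

```python
import random


def compose_handshake_request(source, rng=random):
    """Build the header fields of a Request.Handshake message.

    `source` is this node's address string, e.g. "tcp/192.161.95.2:1000".
    A cheap pseudo-random generator is used for UniqueID because the
    message is never forwarded, so the ID need not be cryptographically
    secure.
    """
    return {
        "UniqueID": format(rng.getrandbits(64), "x"),  # 64-bit value in hex
        "Source": source,
        "HopsToLive": "1",  # never forwarded beyond the first receiver
        "Depth": "1",
        "TransportOption.Keepalive": "true",  # a reply is expected at once
    }


def serialize(message_type, fields):
    """Render a message as text (layout assumed from Appendix A)."""
    lines = [message_type]
    lines.extend(f"{name}={value}" for name, value in fields.items())
    lines.append("EndMessage")
    return "\n".join(lines) + "\n"
```

A caller would pass the result of compose_handshake_request to serialize together with the type name Request.Handshake, and keep the UniqueID in its pending message store to match the reply.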
<literal>Reply.Handshake</literal> (formerly <literal>HandshakeReply</literal>) This message is sent in response to a Request.Handshake message, but it can also be sent unrequested as a way for a node to announce its presence to other nodes. Sending When sent in reply to a request, the message must inherit the UniqueID of the request. If sent unbidden, a random ID is created by the initiating node (and is still required so that receiving nodes will not confuse the message with a reply to one of their pending requests). Source is specified as usual. As with Request.Handshake, HopsToLive and Depth are initialized to 1 (this message is never forwarded). Finally, the header field Version must be present and filled in with a value to identify the protocol version understood by the sending node. Currently, the only accepted value for this field is Freenet 1.0. Receiving A node receiving this message unbidden (that is, not in response to an earlier Request.Handshake) is not required to perform any action. It may, at its discretion, choose to record the value of the Source header field as the address of a new node to which it may later send requests. A node receiving this message in response to its earlier request may use the version information to tailor the content of messages it sends to the node that sent the Reply.Handshake, but it should never refuse to communicate with a node because of a version difference. Nodes should trust the subclassing mechanisms to ensure compatibility between different versions of the protocol. <literal>Request.Data</literal> (formerly <literal>DataRequest</literal>) This message is initiated by a user-level client program node to request the document associated with a particular key. Sending The client program originating this message must ensure that it contains the following header fields: UniqueID A random 64-bit value. 
This serves to uniquely identify all messages associated with this original request, while not revealing any information about the origin of the request. The efficacy of this anonymity is somewhat dependent on the quality of the random number generation algorithm used, so it is a good idea to use a high-quality random number generator. HopsToLive Initialize with an integer sufficiently large to ensure a thorough search of the network, but not so large that search failures take more time than the user can tolerate. For a small test network, this may be in the range of 20-30 hops. For the internet as a whole, something in the 100-200 range might be more appropriate. To improve anonymity by making it more difficult to track the source of a request, this value should also be chosen randomly within the appropriate range. Depth Initialize to 1. Source Initialize with the address and transport method by which you wish to receive replies. SearchKey Initialize with the key value associated with the desired document. The client program then sends this message to any node on the network (typically a local node running on the same machine as the client) and waits for replies. Receiving A node receiving this message must take the following actions, in the order listed: If the pending message store shows that a Request.Data message with the current UniqueID has already been seen at this node (indicating a loop in the message's journey), the node must compose a Request.Continue message in reply inheriting the UniqueID, HopsToLive, and Depth of the present request. The composed Request.Continue message does not inherit or pass on any other header fields from the request, and places its own address into the Source field as usual. The node retains its record of the pending request.
If the local data store contains a document associated with the key specified in the request, the node must compose a Reply.Data message inheriting the UniqueID of the request as described below under that message type. The request is now complete for this node, so it can remove its record of the pending request. If the document was not found locally, the value of HopsToLive is decremented (as described in the previous section). If the result is now 0, a TimedOut message is returned to the source of the request inheriting the ID of the request, and with a new HopsToLive calculated from Depth in the same way as for the Reply.Data. The request is now complete for this node, so it can remove its record of the pending request. Finally, if the decremented HopsToLive is not 0, the node must decide which of its neighboring nodes is most likely to have a document associated with the specified key, and forward the request to that node. It does this by finding the closest key in its local data store, using the closeness measure described in the later Key Closeness section, and forwarding the request to the node from which that key came. The node must retain in its pending message store the fact that it has forwarded this request, the address from which it received the request (from the Source field of the request message it received), and the current value of the Depth field of the request. The usual procedure for forwarding a message is followed: HopsToLive is decremented, Depth is incremented, Source is filled in with the present node's address, and UniqueID and all other header fields are passed through except those used by the low-level transport method and those of type Transient or a subclass thereof. <literal>Send.Data</literal> (formerly <literal>DataReply</literal>) This message is originated by a node in reply to a Request.Data when the document associated with the key requested by that message is present in the local data store.
Sending The message composed inherits the UniqueID of the request, and generates a new HopsToLive value appropriate for traversing a path back to the originating client. This will typically be the Depth of the request plus a small random value for security. Source is filled in with the address of the present node as usual, and Depth is set to 1. Other header fields may be filled in with metadata that was stored with the document for that purpose, with the same names and values they had in the message that inserted the document into the network. The trailing field of the newly composed message must contain the document itself. Receiving A node may receive a message of this type for one of two reasons. If it originated the Request.Data message of the same ID, then that original request is now satisfied and can be presented to the user or disposed of in such other manner as it had intended. Otherwise, it must have earlier forwarded a Request.Data message on some other node's behalf. In this case it has a record in its pending request store of where that earlier request came from. It must forward the reply (following the usual forwarding rules) to that node, and optionally store the document locally if it desires. Because the document itself comes at the end of the reply message, it is possible for a node to know that it must forward the message and its included document before it has finished receiving the latter. In this case, it is permissible for the node to compose the header of its forwarded message and begin sending that, followed by the document itself in the trailing field, even as it continues to receive the document. This reduces the latency of sending large documents through many nodes. Each node must, though, read the entire header section of the message before it begins this "tunnelling" forwarding function, because the header may contain field values that affect how the message will be handled. 
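The Sending and Receiving rules for this message type can be sketched in Python as follows. This is only an illustration under stated assumptions: all function names are hypothetical, the pending store is modelled as a plain dict from UniqueID to the requester's address (with None marking a request this node originated), HopsToLive and Depth are assumed to be decimal strings stepped by 1, subclass field names are assumed to extend their parent's name with a dot (as in TransportOption.Keepalive), and the size of the random padding (max_pad) is an arbitrary choice. Handling of a HopsToLive that reaches 0, retiring the pending record, and "tunnelling" of the trailing field are omitted.

```python
import random


def reply_hops_to_live(request_depth, rng=random, max_pad=3):
    """HopsToLive for a reply: request Depth plus a small random value."""
    return request_depth + rng.randint(1, max_pad)


def is_a(name, cls):
    """True if header field `name` is `cls` or a dotted subclass of it."""
    return name == cls or name.startswith(cls + ".")


def forward_fields(fields, own_address):
    """Apply the general forwarding rules to a dict of header fields."""
    out = {
        name: value
        for name, value in fields.items()
        if not is_a(name, "TransportOption") and not is_a(name, "Transient")
    }
    out["Source"] = own_address
    out["HopsToLive"] = str(int(fields["HopsToLive"]) - 1)
    out["Depth"] = str(int(fields["Depth"]) + 1)
    return out


def handle_send_data(fields, pending, own_address):
    """Decide what to do with the header of an incoming Send.Data.

    Returns (action, destination, fields) where action is "deliver"
    (this node originated the request), "forward" (pass the reply back
    toward the requester) or "ignore" (no matching pending request).
    """
    unique_id = fields["UniqueID"]
    if unique_id not in pending:
        return ("ignore", None, None)
    requester = pending[unique_id]
    if requester is None:
        return ("deliver", None, fields)
    return ("forward", requester, forward_fields(fields, own_address))
```

Because the decision depends only on header fields, a node following this pattern can make it as soon as the header section has been read, before the trailing document has fully arrived.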
<literal>Request.Insert</literal> (formerly <literal>InsertRequest</literal>) (TODO) Sending Receiving <literal>Reply.Insert</literal> (formerly <literal>InsertReply</literal>) (TODO) Sending Receiving <literal>Send.Insert</literal> (formerly <literal>DataInsert</literal>) (TODO) Sending Receiving <literal>Request.Continue</literal> (formerly <literal>RequestFailed</literal>) (TODO) Sending Receiving <literal>Reply.NotFound</literal> (formerly <literal>TimedOut</literal>) (TODO) Sending Receiving <literal>Reply.Restart</literal> (formerly <literal>QueryRestarted</literal>) (TODO) Sending Receiving <literal>Error</literal> Messages of this type (and its subtypes) indicate abnormal error conditions. These fall outside the expected automated behavior of nodes, and it is expected that nodes will simply log these for later inspection by human operators. Sending If creating an Error message in response to an earlier received message, the Error message must inherit the UniqueID of that earlier message, as well as whatever other header fields may be useful for clarifying the error. For example, a node receiving a Request.Data message with a badly formed key may wish to reply with an Error inheriting the ID of that request, and including the bad key as well. Error messages not sent in response to an earlier message can be in any format the sender thinks will provide sufficient information for diagnosis. Receiving The only behavior expected upon receiving an Error message is to log it for later inspection. Client programs receiving these may also alert the user. <literal>Control</literal> Messages of this type and its subtypes can be used by node operators to control the behavior of running Freenet nodes and to query their status. These messages must be authenticated. 
(TODO) Sending Receiving Message Delivery Methods Node Addresses In all places that serve to identify a node to which messages can be sent (such as the Source header field of messages and a node's internal list of other nodes on the network), the node is identified by a string that specifies a transport method and an address. This string must begin with an identifier (an ASCII alphabetical character, followed by zero or more ASCII alphanumerics) that selects a transport method. This identifier is then followed immediately by a / character, and the address of the node. This address must be interpreted as determined by the specific transport method. For example, the string tcp/195.34.172.12:3972 specifies transport method tcp, and that transport method specifies that addresses are composed of an IP address or domain name, followed by : and a port number. Transport Options Messages sent between nodes may contain information in their header fields that is used by the transport method, and not otherwise needed by the message to serve its purposes. These must use the field name TransportOption or subclasses thereof. Such header fields may be inserted by a node delivering a message to be interpreted by the receiving node, which should remove them from the message before further processing. Specific Transport Methods TCP Node addresses beginning with the identifier tcp/ specify the use of TCP/IP to deliver messages. After the slash, the node is identified by a host IP address (or domain name) and port number. The port number is the single integer after the last colon character, while the host is identified by the characters in between the first slash and the last colon (this is intended to accommodate IPv6). A Source header field value for this delivery method would therefore look like:
tcp/192.161.95.2:1000
or
tcp/piclab.com:11000
Note that the host may be specified either numerically or by DNS name, and that the port number is not optional as it is with some other protocols. There is no standard port number for Freenet. Header fields subclassed from TransportOption can be used to specify further details of the connection. In particular, TransportOption.Keepalive=true can be used to tell the receiving node to leave open the connection from which it receives a message and to send replies over the same connection. This is a convenient optimization when more than one message needs to be sent to the same node, or when replies are expected quickly.
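The address rules above (a method identifier before the first slash, then for tcp a host up to the last colon and a mandatory port after it) can be sketched in Python as follows. The function names split_node_address and parse_tcp_address are hypothetical, and treating the method identifier as case-sensitive is an assumption; the specification does not say.

```python
import re

# An ASCII letter followed by zero or more ASCII alphanumerics.
_METHOD = re.compile(r"[A-Za-z][A-Za-z0-9]*")


def split_node_address(value):
    """Split "method/address" at the first forward slash."""
    method, slash, address = value.partition("/")
    if not slash or not _METHOD.fullmatch(method):
        raise ValueError(f"bad transport method in {value!r}")
    return method, address


def parse_tcp_address(value):
    """Return (host, port) for a tcp node address.

    The port is the integer after the LAST colon, so a host that itself
    contains colons (an IPv6 literal) is kept intact.
    """
    method, address = split_node_address(value)
    if method != "tcp":
        raise ValueError(f"not a tcp address: {value!r}")
    host, colon, port = address.rpartition(":")
    if not colon or not host or not re.fullmatch(r"[0-9]+", port):
        raise ValueError(f"missing host or port in {value!r}")
    return host, int(port)
```

For example, parse_tcp_address("tcp/piclab.com:11000") yields the host piclab.com and the port 11000, while an address with no port raises ValueError because the port is not optional.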
Freenet Keys Freenet documents are associated with keys that identify them to users. These keys are arbitrary strings of Unicode characters in the range 0x0020 to 0xFFFF. Key Hashing (TODO) Key Closeness (TODO) Common Client Behaviors Sections of this specification prior to this one are sufficient to specify Freenet protocol and build a working network of nodes. There are, however, additional functions that clients of the network will commonly perform, and that therefore may benefit from standardization. Some of these will be described here. Nothing in this section should be considered normative for Freenet itself; these are merely recommended ways to perform common useful functions of Freenet clients. Metadata Many clients will want to keep, together with a document, some information about that document that isn't well suited to the limited Storable header fields, either because it is too large, too complexly structured, or too sensitive to pass in plain text, or simply because it is human-readable information that a node cannot make any use of. A recommended way of keeping such data is this: Make the payload by concatenating the metadata and the document it describes (metadata first), and include in the headers a Storable.InfoLength whose value is the size of the prepended metadata. Client programs must then separate the two upon receipt. URI Syntax Clients that access data in Freenet from the Web (gateways, mixed hypertext) will need to have some method of specifying a Freenet key with a URI. A recommended way to do this is the following: Appendix A: Sample Client/Server Dialogs This is Oskar's overview. I haven't quite figured out how to present this yet, but I've included it in the document because I know I want it or something like it included somewhere.
First you send a HandshakeRequest (or you don't have to, but you can), syntax:

HandshakeRequest
UniqueID=64 bit hex
Depth=max 64 bit decimal
HopsToLive=max 64 bit decimal
Source="tcp/"address:port
EndMessage

and get back a:

HandshakeReply
UniqueID=64 bit hex
Depth=max 64 bit decimal
HopsToLive=max 64 bit decimal
Source="tcp/"address:port
Version=Freenet version number
EndMessage

To request data you send a:

DataRequest
UniqueID=64 bit hex
Depth=max 64 bit decimal
HopsToLive=max 64 bit decimal
Source="tcp/"address:port
SearchKey=160 bit SHA1 hash, 5 sets of 8 uppercase characters separated by space

and get back a:

DataReply
UniqueID=64 bit hex (same as request)
Depth=max 64 bit decimal
HopsToLive=max 64 bit decimal
Source="tcp/"address:port
DataSource="tcp/"address:port
Data
binary data

or maybe a:

TimedOut
UniqueID=64 bit hex (same as request)
Depth=max 64 bit decimal
HopsToLive=max 64 bit decimal
Source="tcp/"address:port

or even a (though this will probably only happen on test or broken networks):

RequestFailed
UniqueID=64 bit hex (same as request)
Depth=max 64 bit decimal
HopsToLive=max 64 bit decimal
Source="tcp/"address:port

To make an insert you send an:

InsertRequest
UniqueID=64 bit hex
Depth=max 64 bit decimal
HopsToLive=max 64 bit decimal
Source="tcp/"address:port
SearchKey=160 bit SHA1 hash, 5 sets of 8 uppercase characters separated by space

and unless you get a (which should only happen on test networks):

RequestFailed
UniqueID=64 bit hex (same as request)
Depth=max 64 bit decimal
HopsToLive=max 64 bit decimal
Source="tcp/"address:port

you might get a:

DataReply
UniqueID=64 bit hex (same as request)
Depth=max 64 bit decimal
HopsToLive=max 64 bit decimal
Source="tcp/"address:port
DataSource="tcp/"address:port
DataLength=max 64 bit decimal
Data
binary data

which means there was already data for this key, so it is replying with that data. Or possibly a:

TimedOut
UniqueID=64 bit hex (same as request)
Depth=max 64 bit decimal
HopsToLive=max 64 bit decimal
Source="tcp/"address:port

which means references to old data with the same key were found, but the data was not (still can't insert). But most likely you will get an:

InsertReply
UniqueID=64 bit hex
Depth=max 64 bit decimal
HopsToLive=max 64 bit decimal
Source="tcp/"address:port

in which case you can go ahead and send a:

DataInsert
UniqueID=64 bit hex (same as request)
Depth=max 64 bit decimal
HopsToLive=max 64 bit decimal
Source="tcp/"address:port
DataSource="tcp/"address:port
DataLength=max 64 bit decimal
Data
binary data

That pretty much sums it up. The order of the fields doesn't matter, but the trailing field ("Data") has to be last, followed by the binary data. Clients currently lie about the DataSource on DataInserts and set it to the address of the node they insert to. Brandon added the "EndMessage" field to the handshakes, and I believe he wants them to go on all messages sooner or later.

References

A Distributed Decentralised Information Storage and Retrieval System: Ian Clarke's original paper describing the adaptive network.

Freenet Home: Home page of the Freenet Project, an open source software project whose primary goal is to build a network based on this protocol.