Lehrstuhl fuer Rechnerorientierte Statistik und Datenanalyse Uni Augsburg  
Rserve

Development of Rserve
Rserve is released under the GPL, so interested developers are welcome to improve it. The sources are available from the download section. As the directory structure shows, the sources are kept in a CVS repository; anyone who wants to contribute to the development should send me an e-mail (Simon.Urbanek@r-project.org) to obtain access to the CVS.
 
Technical Documentation for Rserve
This document describes the protocols and structures used by Rserve (version 0.1-9). This information is helpful for implementing Rserve clients.

Rserve communication is performed over any reliable connection-oriented protocol (usually TCP/IP; Rserve 0.1-9 supports TCP/IP and local unix sockets). After the connection is established, the server sends 32 bytes representing the ID string, which defines the capabilities of the server. Each attribute of the ID string is 4 bytes long and is meant to be user-readable (i.e. it uses no special characters). It is a good idea to make "\r\n\r\n" the last attribute.

The ID string must be of the form:

   [0] "Rsrv" - R-server ID signature
   [4] "0100" - version of the R server
   [8] "QAP1" - protocol used for communication (here Quad Attributes Packets v1)
   [12] any additional attributes follow. \r\n and '-' are ignored.
Optional attributes (in any order; it is legitimate to put dummy attributes, like "----" or "    ", between attributes):
   "R151" - version of R (here 1.5.1)
   "ARpt" - authorization required (here "pt"=plain text, "uc"=unix crypt);
            the connection will be closed if the first packet is not
            CMD_login. If more than one AR.. method is specified, the
            client is free to use any one it supports (usually the most
            secure).
   "K***" - key if encoded authentication is challenged (*** is the key);
            for unix crypt the first two letters of the key are the salt
            required by the server
The protocol specified in the third attribute (here QAP1) is used immediately after the ID string has been transmitted.
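
As an illustration of the layout above, the following Python sketch splits a 32-byte ID string into its 4-byte attributes. The function name parse_id_string and the dictionary keys are hypothetical, and treating an attribute made up entirely of '-', spaces, CR or LF as a dummy is one reading of the rule that such characters are ignored.

```python
def parse_id_string(raw: bytes) -> dict:
    """Split the 32-byte Rserve ID string into its 4-byte attributes."""
    if len(raw) != 32:
        raise ValueError("ID string must be exactly 32 bytes")
    attrs = [raw[i:i + 4].decode("ascii") for i in range(0, 32, 4)]
    if attrs[0] != "Rsrv":
        raise ValueError("not an Rserve ID string")
    info = {"version": attrs[1], "protocol": attrs[2],
            "r_version": None, "auth": [], "key": None}
    for attr in attrs[3:]:
        # Assumption: attributes consisting only of filler characters are dummies.
        if attr.strip("-\r\n ") == "":
            continue
        if attr.startswith("AR"):
            info["auth"].append(attr[2:])   # e.g. "pt" or "uc"
        elif attr.startswith("K"):
            info["key"] = attr[1:]          # for unix crypt: first two chars are the salt
        elif attr.startswith("R"):
            info["r_version"] = attr[1:]    # e.g. "151" for R 1.5.1
    return info


# Made-up ID string for illustration (8 attributes of 4 bytes each).
example = b"Rsrv0100QAP1R151ARpt--------\r\n\r\n"
info = parse_id_string(example)
```

A client would typically check info["protocol"] before sending anything, and send CMD_login first whenever info["auth"] is not empty.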

QAP1 message oriented protocol

QAP1 (quad attributes protocol v1) is a message oriented protocol, i.e. the initiating side (here the client) sends a message and awaits a response. The message contains both the action to be taken and any necessary data. The response contains a response code and any associated data. Every message consists of a header and data part (which can be empty). The header is structured as follows:

  [0]  (int) command
  [4]  (int) length of the message, not counting the 16-byte header
  [8]  (int) offset of the data part
  [12] (int) res (reserved, must be 0)
command specifies the request or response type.
length specifies the number of bytes belonging to this message after the header.
offset specifies the offset of the data part, where 0 means directly after the header (which is normally the case).
res is reserved for future use.

The header must always be transmitted en bloc. The data part can be split into packets of arbitrary size. Each message consists of the 16-byte header plus the data part, i.e. length+16 bytes in total.
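
A minimal Python sketch of the header handling described above. All function names are hypothetical. It uses the little-endian byte order that this document specifies for all int entries, packs the fields as unsigned values to keep later bit masking simple, and assumes that a non-zero offset means the data part starts that many bytes into the payload.

```python
import io
import struct

# Four little-endian 32-bit integers: command, length, offset, reserved.
HEADER = struct.Struct("<IIII")


def pack_message(command: int, data: bytes = b"") -> bytes:
    """Build a complete message: 16-byte header followed by the data part."""
    return HEADER.pack(command, len(data), 0, 0) + data


def read_exact(stream, n: int) -> bytes:
    """Read exactly n bytes; the data may arrive in several smaller pieces."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:
            raise EOFError("connection closed in the middle of a message")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def read_message(stream):
    """Read one message and return (command, data part)."""
    command, length, offset, _reserved = HEADER.unpack(read_exact(stream, 16))
    payload = read_exact(stream, length)
    # Assumption: offset counts bytes from the end of the header.
    return command, payload[offset:]


# Round trip through an in-memory stream; the command value 3 is arbitrary here.
message = pack_message(3, b"abcd")
command, data = read_message(io.BytesIO(message))
```

With a real connection, a buffered file object such as sock.makefile("rb") can stand in for the in-memory stream.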

The data part contains any additional parameters that are sent along with the command. Each parameter (attribute) starts with a 4-byte header, followed by its contents:

  [0]  (byte) type
  [1]  (24-bit int) length
Types used by the current Rserve implementation (for list of all supported types see Rsrv.h):
  • DT_INT (4 bytes) integer
  • DT_STRING (n bytes) null terminated string
  • DT_BYTESTREAM (n bytes) any binary data
  • DT_SEXP R's encoded SEXP, see below
All int and double entries throughout the transfer are encoded in Intel (little-endian) byte order:
int=0x12345678 -> char[4]=(0x78,0x56,0x34,0x12)
Functions/macros for converting from native to protocol format are available in Rsrv.h.
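
The parameter header and the byte order can be sketched in Python as follows. The helper names are hypothetical, and the numeric value used for DT_INT is an assumption that should be checked against Rsrv.h, which holds the authoritative definitions.

```python
import struct

# Assumed value; the authoritative definition is in Rsrv.h.
DT_INT = 1


def encode_param(dt_type: int, payload: bytes) -> bytes:
    """Prefix payload with the 4-byte header: 1 byte type, 24-bit length."""
    if len(payload) >= 1 << 24:
        raise ValueError("payload too long for a 24-bit length")
    # One little-endian 32-bit word: the type lands in byte 0, the length in bytes 1-3.
    return struct.pack("<I", dt_type | (len(payload) << 8)) + payload


def decode_param(buf: bytes, pos: int = 0):
    """Return (type, payload, position of the next parameter)."""
    (word,) = struct.unpack_from("<I", buf, pos)
    dt_type, length = word & 0xFF, word >> 8
    start = pos + 4
    return dt_type, buf[start:start + length], start + length


# The int from the text above, 0x12345678, as a DT_INT parameter.
param = encode_param(DT_INT, struct.pack("<i", 0x12345678))
dt_type, payload, next_pos = decode_param(param)
```

Here param is the byte sequence 01 04 00 00 78 56 34 12: the type, the 24-bit length 4, then the integer in little-endian order.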

Commands supported by Rserve

Supported commands:

    command           parameters    | response data

    CMD_login         DT_STRING     | -
    CMD_voidEval      DT_STRING     | -
    CMD_eval          DT_STRING     | DT_SEXP
    CMD_shutdown      [DT_STRING]   | -
    CMD_openFile      DT_STRING     | -
    CMD_createFile    DT_STRING     | -
    CMD_closeFile     -             | -
    CMD_readFile      [DT_INT]      | DT_BYTESTREAM
    CMD_writeFile     DT_BYTESTREAM | -
    CMD_removeFile    DT_STRING     | -
    CMD_setSEXP       DT_STRING,    | -
                      DT_SEXP
    CMD_assignSEXP    DT_STRING,    | -
                      DT_SEXP
    CMD_setBufferSize DT_INT        | -
  
(Parameters in brackets [] are optional)

Responses:
The CMD_RESP mask is set for all responses. Each response consists of the response command (RESP_OK or RESP_ERR, in the least significant 24 bits) and the status code (in the most significant 8 bits). For a list of all currently supported status codes see ERR_... in Rsrv.h.
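
A short Python sketch of how a client might take the command field of a response apart. The numeric values of CMD_RESP, RESP_OK and RESP_ERR are assumptions and should be verified against Rsrv.h; the function name split_response is hypothetical.

```python
# Assumed values; the authoritative definitions are in Rsrv.h.
CMD_RESP = 0x10000
RESP_OK = CMD_RESP | 0x0001
RESP_ERR = CMD_RESP | 0x0002


def split_response(command: int):
    """Split the command field into (response command, status code)."""
    response = command & 0x00FFFFFF   # least significant 24 bits
    status = (command >> 24) & 0xFF   # most significant 8 bits
    return response, status


# A made-up error response carrying status code 2.
response, status = split_response((2 << 24) | RESP_ERR)
is_error = response == RESP_ERR
```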

Encoding of SEXP R expression

R SEXP values (DT_SEXP) are encoded recursively, in a similar way to the parameter attributes. Each SEXP consists of a 4-byte header and the actual contents. The header is of the form:

  [0]  (byte) eXpression Type
  [1]  (24-bit int) length
The expression type consists of the actual type (least significant 6 bits) and attribute flags. The following expression types are supported:
XT_NULL          data: - 
XT_INT           data: (4) int 
XT_DOUBLE        data: (8) double 
XT_STR           data: (n) char null-term. strg. 
XT_LANG          data: same as XT_LIST 
XT_SYM           data: (n) char symbol name 
XT_BOOL          data: (1) byte boolean
                       (1=TRUE, 0=FALSE, 2=NA) 
XT_VECTOR        data: (n*?) SEXP 
XT_LIST          data: SEXP head, SEXP vals, [SEXP tag]
XT_CLOS          data: SEXP formals, SEXP body

XT_ARRAY_INT     data: (n*4) int,int,.. 
XT_ARRAY_DOUBLE  data: (n*8) double,double,.. 
XT_ARRAY_STR     data: (?) string,string,.. 
XT_ARRAY_BOOL    data: (n) byte,byte,.. 

XT_UNKNOWN       data: (4) int - SEXP type as defined in R
Attributes:
XT_HAS_ATTR - if this flag is set then the SEXP has an attribute which is stored before the actual expression. In this case the layout looks as follows:
  [0]   (4) header SEXP: len=4+m+n, XT_HAS_ATTR is set
  [4]   (4) header attribute SEXP: len=n
  [8]   (n) data attribute SEXP
  [8+n] (m) data SEXP
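
The attribute layout above can be walked with a few lines of Python. This is a sketch only: the value used for the XT_HAS_ATTR flag is an assumption (check Rsrv.h), the type codes in the example are arbitrary placeholders, and the function names are hypothetical. The 6-bit type mask is taken from the text.

```python
import struct

XT_TYPE_MASK = 0x3F   # actual type: least significant 6 bits
XT_HAS_ATTR = 0x80    # assumed flag value; verify against Rsrv.h


def read_header(buf: bytes, pos: int):
    """Return (type byte, 24-bit length, position of the contents)."""
    (word,) = struct.unpack_from("<I", buf, pos)
    return word & 0xFF, word >> 8, pos + 4


def split_sexp(buf: bytes, pos: int = 0):
    """Return (type, attribute SEXP bytes or None, data bytes, next position)."""
    raw_type, length, body = read_header(buf, pos)
    end = body + length
    attr = None
    if raw_type & XT_HAS_ATTR:
        # The attribute SEXP (own header plus n bytes) precedes the data.
        _, attr_len, attr_body = read_header(buf, body)
        attr = buf[body:attr_body + attr_len]
        body = attr_body + attr_len
    return raw_type & XT_TYPE_MASK, attr, buf[body:end], end


# Hand-built example with placeholder type codes: n=4 attribute bytes, m=8 data bytes.
attr_sexp = struct.pack("<I", 3 | (4 << 8)) + b"ATTR"
data = b"12345678"
outer = struct.pack("<I", (5 | XT_HAS_ATTR) | ((4 + 8 + 4) << 8)) + attr_sexp + data
xt, attr, contents, next_pos = split_sexp(outer)
```

The returned attribute bytes are themselves a complete SEXP, so a full decoder would call itself recursively on them.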

Additions in version 0.2
Since version 0.2-0 the ID string reports version 0101 because of a change that makes it partially incompatible with previous versions. The main change is that Rserve reporting version 0100 incorrectly omitted the DT_SEXP header from the response to CMD_eval commands. Clients should therefore check the version reported by Rserve and apply a workaround (for 0100 you can assume that CMD_eval always returns the contents of a SEXP even though no DT_SEXP header is sent). Rserve reporting 0101 responds consistently, i.e. the proper DT_SEXP header is sent.
The second change is the requirement to pad strings with zeros so that the length of the parameter/content is divisible by 4. Depending on the platform used, the server may respond with ERR_inv_par if the parameters are not correctly aligned. Rserve reporting 0101 will itself pad strings in this manner when sending responses to the client.
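
The padding rule can be sketched in Python as follows. The helper names are hypothetical, and the choice of UTF-8 as the character encoding is an assumption, since this document does not specify one.

```python
def encode_string(text: str) -> bytes:
    """Null-terminate text and pad with zeros to a multiple of 4 bytes."""
    raw = text.encode("utf-8") + b"\x00"   # assumed encoding
    padding = -len(raw) % 4                # 0..3 extra zero bytes
    return raw + b"\x00" * padding


def decode_string(raw: bytes) -> str:
    """Cut at the first null byte; everything after it is padding."""
    return raw.split(b"\x00", 1)[0].decode("utf-8")


padded = encode_string("1+1")     # 3 characters + terminator = 4 bytes, no padding needed
longer = encode_string("rnorm")   # 5 characters + terminator = 6 bytes, padded to 8
```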

Update 2003-09-18: The previous documentation incorrectly stated that the second entry of the 4-byte headers (response and attribute) was a 12-bit int, whereas it is in fact a 24-bit int. This has been corrected.
 


Additions in version 0.3
Rserve version 0.3 reports ID string version 0102 because support for large data was added. Previous versions were limited by the 24-bit length of parameters and SEXPs. Version 0.3 enhances the protocol by adding the special flag DT_LARGE to parameter types and XT_LARGE to expression types. If this flag is set, the header is 8 bytes long (instead of the previous 4 bytes). The additional 4 bytes are used for the parameter/expression length, giving a maximum length of 56 bits for an expression or parameter (that is 65536TB, which should be sufficient). Any data smaller than 0x800000 (8MB) must still be coded in the original 4-byte header format. The current Rserve sends only data larger than 0xfffff0 (16MB-16) in the large data format. Clients are encouraged to use the same threshold, but this is not required by the protocol.
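
A Python sketch of choosing between the 4-byte and the 8-byte header. The value of the DT_LARGE flag is an assumption (check Rsrv.h), as is the exact layout: the sketch assumes that the low 24 bits of the length stay where they are and that the additional 4 bytes carry the upper 32 bits. The function names are hypothetical; the threshold is the one quoted above.

```python
import struct

DT_LARGE = 0x40              # assumed flag value; verify against Rsrv.h
LARGE_THRESHOLD = 0xFFFFF0   # threshold used by the current Rserve


def encode_header(type_code: int, length: int) -> bytes:
    """Return a 4-byte header, or an 8-byte one for data above the threshold."""
    if length > LARGE_THRESHOLD:
        if length >= 1 << 56:
            raise ValueError("length does not fit into 56 bits")
        # Byte 0: type with the large flag; bytes 1-7: 56-bit little-endian length.
        return struct.pack("<Q", (type_code | DT_LARGE) | (length << 8))
    return struct.pack("<I", type_code | (length << 8))


def decode_header(buf: bytes, pos: int = 0):
    """Return (type without the large flag, length, header size in bytes)."""
    (word,) = struct.unpack_from("<I", buf, pos)
    type_code, length, size = word & 0xFF, word >> 8, 4
    if type_code & DT_LARGE:
        (high,) = struct.unpack_from("<I", buf, pos + 4)
        length |= high << 24
        type_code &= ~DT_LARGE
        size = 8
    return type_code, length, size


# The type code 10 is an arbitrary placeholder.
small = encode_header(10, 100)
large = encode_header(10, 0x1000000)   # 16MB, above the threshold
```

Using one threshold for both parameter and expression headers keeps the encoder simple, and it satisfies the rule that data smaller than 0x800000 always uses the original 4-byte format.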

