Request for Comments: 1832                                   August 1995
(summarized by Juan A. Ternero)

             XDR: External Data Representation Standard

1. INTRODUCTION

   XDR is a standard for the description and encoding of data. XDR uses
   a language similar to the C language to describe data formats.

2. BASIC BLOCK SIZE

   The representation of every item requires a multiple of four bytes
   of data. If needed, 0 to 3 residual zero bytes are added so that the
   total byte count is a multiple of four.

      +--------+--------+...+--------+--------+...+--------+
      | byte 0 | byte 1 |...|byte n-1|    0   |...|    0   |   BLOCK
      +--------+--------+...+--------+--------+...+--------+
      |<-----------n bytes---------->|<------r bytes------>|
      |<-----------n+r (where (n+r) mod 4 = 0)>----------->|

3. XDR DATA TYPES

   General paradigm declaration:

   - angle brackets (< and >) denote variable-length sequences of data
   - square brackets ([ and ]) denote fixed-length sequences of data

3.1 Integer

   32-bit datum in the range [-2147483648,2147483647].

   Declaration:

         int identifier;

   Representation: two's complement notation

        (MSB)                   (LSB)
      +-------+-------+-------+-------+
      |byte 0 |byte 1 |byte 2 |byte 3 |                      INTEGER
      +-------+-------+-------+-------+
      <------------32 bits------------>

3.2 Unsigned Integer

   32-bit datum in the range [0,4294967295].

   Declaration:

         unsigned int identifier;

   Representation:

        (MSB)                   (LSB)
      +-------+-------+-------+-------+
      |byte 0 |byte 1 |byte 2 |byte 3 |             UNSIGNED INTEGER
      +-------+-------+-------+-------+
      <------------32 bits------------>

3.3 Enumeration

   Handy for describing subsets of the integers.

   Declaration:

         enum { name-identifier = constant, ... } identifier;

   Representation: Same representation as signed integers.

   Example:

         enum { RED = 2, YELLOW = 3, BLUE = 5 } colors;

3.4 Boolean

   Declaration:

         bool identifier;

   This is equivalent to:

         enum { FALSE = 0, TRUE = 1 } identifier;

3.5 Hyper Integer and Unsigned Hyper Integer

   64-bit (8-byte) numbers.

   Declarations:

         hyper identifier;
         unsigned hyper identifier;

   Representations: Obvious extensions of integer and unsigned integer
   defined above.

     (MSB)                                                   (LSB)
   +-------+-------+-------+-------+-------+-------+-------+-------+
   |byte 0 |byte 1 |byte 2 |byte 3 |byte 4 |byte 5 |byte 6 |byte 7 |
   +-------+-------+-------+-------+-------+-------+-------+-------+
   <----------------------------64 bits---------------------------->
                                              HYPER INTEGER
                                              UNSIGNED HYPER INTEGER

3.6 Floating-point

   Floating-point data type "float" (32 bits).

   Declaration:

         float identifier;

   Representation: IEEE standard for normalized single-precision
   floating-point numbers. Three fields:

      S: sign. One bit.
      E: exponent. 8 bits.
      F: mantissa. 23 bits.

   The floating-point number is described by:

         (-1)**S * 2**(E-127) * 1.F

      +-------+-------+-------+-------+
      |byte 0 |byte 1 |byte 2 |byte 3 |              SINGLE-PRECISION
      S|   E   |           F          |         FLOATING-POINT NUMBER
      +-------+-------+-------+-------+
      1|<- 8 ->|<-------23 bits------>|
      <------------32 bits------------>

3.7 Double-precision Floating-point

   Double-precision floating-point data type "double" (64 bits).

   Declaration:

         double identifier;

   Representation: One form of IEEE standard for normalized
   double-precision floating-point numbers. Three fields:

      S: sign. One bit.
      E: exponent. 11 bits.
      F: mantissa. 52 bits.

   The floating-point number is described by:

         (-1)**S * 2**(E-1023) * 1.F

      +------+------+------+------+------+------+------+------+
      |byte 0|byte 1|byte 2|byte 3|byte 4|byte 5|byte 6|byte 7|
      S|    E   |                    F                        |
      +------+------+------+------+------+------+------+------+
      1|<--11-->|<-----------------52 bits------------------->|
      <-----------------------64 bits------------------------->
                                     DOUBLE-PRECISION FLOATING-POINT

3.8 Quadruple-precision Floating-point

   Quadruple-precision floating-point data type "quadruple" (128 bits).

   Declaration:

         quadruple identifier;

   Representation: IEEE standard for normalized double extended
   precision floating-point numbers. Three fields:

      S: sign. One bit.
      E: exponent. 15 bits.
      F: mantissa. 112 bits.

   The floating-point number is described by:

         (-1)**S * 2**(E-16383) * 1.F

      +------+------+------+------+------+------+-...--+------+
      |byte 0|byte 1|byte 2|byte 3|byte 4|byte 5| ...  |byte15|
      S|     E      |                  F                      |
      +------+------+------+------+------+------+-...--+------+
      1|<----15---->|<-------------112 bits------------------>|
      <-----------------------128 bits------------------------>
                                  QUADRUPLE-PRECISION FLOATING-POINT

3.9 Fixed-length Opaque Data

   Fixed-length of n (static) bytes of uninterpreted data.

   Declaration:

         opaque identifier[n];

   Representation:

          0        1     ...
      +--------+--------+...+--------+--------+...+--------+
      | byte 0 | byte 1 |...|byte n-1|    0   |...|    0   |
      +--------+--------+...+--------+--------+...+--------+
      |<-----------n bytes---------->|<------r bytes------>|
      |<-----------n+r (where (n+r) mod 4 = 0)------------>|
                                                   FIXED-LENGTH OPAQUE

3.10 Variable-length Opaque Data

   Variable-length (counted) opaque data.

   Declaration:

         opaque identifier<m>;
      or
         opaque identifier<>;

   The constant m denotes an upper bound of the number of bytes that
   the sequence may contain. If m is not specified, as in the second
   declaration, it is assumed to be (2**32) - 1, the maximum length.

   Example:

         opaque filedata<8192>;

   Representation:

         0     1     2     3     4     5   ...
      +-----+-----+-----+-----+-----+-----+...+-----+-----+...+-----+
      |        length n       |byte0|byte1|...| n-1 |  0  |...|  0  |
      +-----+-----+-----+-----+-----+-----+...+-----+-----+...+-----+
      |<-------4 bytes------->|<------n bytes------>|<---r bytes--->|
                              |<----n+r (where (n+r) mod 4 = 0)---->|
                                               VARIABLE-LENGTH OPAQUE

3.11 String

   String of n (numbered 0 through n-1) ASCII bytes.

   Declaration:

         string object<m>;
      or
         string object<>;

   The constant m denotes an upper bound of the number of bytes that a
   string may contain. If m is not specified, as in the second
   declaration, it is assumed to be (2**32) - 1, the maximum length.

   Example:

         string filename<255>;

   Representation:

         0     1     2     3     4     5   ...
      +-----+-----+-----+-----+-----+-----+...+-----+-----+...+-----+
      |        length n       |byte0|byte1|...| n-1 |  0  |...|  0  |
      +-----+-----+-----+-----+-----+-----+...+-----+-----+...+-----+
      |<-------4 bytes------->|<------n bytes------>|<---r bytes--->|
                              |<----n+r (where (n+r) mod 4 = 0)---->|
                                                               STRING

3.12 Fixed-length Array

   Fixed-length arrays of homogeneous elements. Though all elements
   are of the same type, the elements may have different sizes.

   Declaration:

         type-name identifier[n];

   Representation:

      +---+---+---+---+---+---+---+---+...+---+---+---+---+
      |   element 0   |   element 1   |...|  element n-1  |
      +---+---+---+---+---+---+---+---+...+---+---+---+---+
      |<--------------------n elements------------------->|
                                               FIXED-LENGTH ARRAY

3.13 Variable-length Array

   Variable-length arrays of counted homogeneous elements.

   Declaration:

         type-name identifier<m>;
      or
         type-name identifier<>;

   The constant m specifies the maximum acceptable element count of an
   array; if m is not specified, as in the second declaration, it is
   assumed to be (2**32) - 1.

   Representation:

        0  1  2  3
      +--+--+--+--+--+--+--+--+--+--+--+--+...+--+--+--+--+
      |     n     | element 0 | element 1 |...|element n-1|
      +--+--+--+--+--+--+--+--+--+--+--+--+...+--+--+--+--+
      |<-4 bytes->|<--------------n elements------------->|
                                                  COUNTED ARRAY

3.14 Structure

   Structures of different types of data.

   Declaration:

         struct {
            component-declaration-A;
            component-declaration-B;
            ...
         } identifier;

   The components of the structure are encoded in the order of their
   declaration in the structure. Each component's size is a multiple
   of four bytes, though the components may be different sizes.

   Representation: Same order of their declaration.

      +-------------+-------------+...
      | component A | component B |...                     STRUCTURE
      +-------------+-------------+...
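
   The padding and length-prefix rules above can be illustrated with a
   short sketch in Python. This is not part of the standard: the helper
   names (pad, encode_int, encode_string, and so on) and the sample
   "record" structure are assumptions made up for illustration, and
   only Python's standard "struct" module is used.

```python
import struct

MAX_LEN = 2**32 - 1  # default upper bound when <m> is omitted


def pad(data: bytes) -> bytes:
    """Append 0 to 3 zero bytes so the length is a multiple of four."""
    return data + b"\x00" * (-len(data) % 4)


def encode_int(value: int) -> bytes:
    """Signed 32-bit integer: big-endian two's complement (3.1)."""
    return struct.pack(">i", value)


def encode_uint(value: int) -> bytes:
    """Unsigned 32-bit integer, most significant byte first (3.2)."""
    return struct.pack(">I", value)


def encode_opaque(data: bytes, m: int = MAX_LEN) -> bytes:
    """Variable-length opaque: 4-byte length n, n bytes, r pad bytes (3.10)."""
    if len(data) > m:
        raise ValueError("opaque data longer than declared maximum")
    return encode_uint(len(data)) + pad(data)


def encode_string(text: str, m: int = MAX_LEN) -> bytes:
    """String: same layout as variable-length opaque, ASCII bytes (3.11)."""
    return encode_opaque(text.encode("ascii"), m)


def encode_array(items, encode_item, m: int = MAX_LEN) -> bytes:
    """Counted array: 4-byte element count n, then n encoded elements (3.13)."""
    if len(items) > m:
        raise ValueError("array longer than declared maximum")
    return encode_uint(len(items)) + b"".join(encode_item(i) for i in items)


def encode_record(ident: int, name: str, scores) -> bytes:
    """Hypothetical structure, components encoded in declaration order (3.14):

        struct { int ident; string name<255>; int scores<>; } record;
    """
    return (encode_int(ident)
            + encode_string(name, 255)
            + encode_array(scores, encode_int))
```

   For instance, encode_string("hello") yields the 4-byte length 5,
   the five ASCII bytes, and three zero pad bytes, for 12 bytes in
   total. Every encoder returns a multiple of four bytes, which is why
   a structure can simply concatenate its components.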

3.15 Discriminated Union

   Type composed of:

   - a discriminant
   - ONE type selected from a set of prearranged types

   The type of the discriminant must be one of:

   - integer type ("int" or "unsigned int")
   - enumerated type (including "bool")

   The component types are called "arms" of the union, and are
   preceded by the value of the discriminant which implies their
   encoding.

   Declaration:

         union switch (discriminant-declaration) {
         case discriminant-value-A:
            arm-declaration-A;
         case discriminant-value-B:
            arm-declaration-B;
         ...
         default:
            default-declaration;
         } identifier;

   Each "case" keyword is followed by a legal value of the
   discriminant. The default arm is optional.

   Representation:

        0   1   2   3
      +---+---+---+---+---+---+---+---+
      |  discriminant |  implied arm  |          DISCRIMINATED UNION
      +---+---+---+---+---+---+---+---+
      |<---4 bytes--->|

3.16 Void

   0-byte quantity. Voids are useful in unions, where some arms may
   contain data and others do not.

   Declaration:

         void;

   Representation:

        ++
        ||                                                     VOID
        ++
      --><-- 0 bytes

3.17 Constant

   "const" defines a symbolic name for a constant; it does not declare
   any data. A symbolic constant may be used anywhere a regular
   constant may be used.

   Declaration:

         const name-identifier = n;

   Representation: There is no representation because it does not
   declare any data.

   Example:

         const DOZEN = 12;

3.18 Typedef

   "typedef" defines new identifiers for declaring data. It works much
   like typedef in the C language.

   Declaration:

         typedef declaration;

   Representation: There is no representation because it does not
   declare any data.

   Example 1:

         typedef float real;

         real v1;
         float v2;               /* same type as v1 */

   Example 2:

         typedef egg eggbox[DOZEN];

         eggbox fresheggs1;
         egg fresheggs2[DOZEN];  /* same type as fresheggs1 */

   When a typedef involves an enum definition, it is equivalent (and
   preferred) to remove "typedef" and place the identifier after the
   "enum" keyword.
   For example, here are the two ways to define the type "bool":

         typedef enum {    /* using typedef */
            FALSE = 0,
            TRUE = 1
         } bool;

         enum bool {       /* preferred alternative */
            FALSE = 0,
            TRUE = 1
         };

   The same applies to "struct" and "union".

3.19 Optional-data

   It is one kind of union. It is very useful for describing recursive
   data-structures such as linked-lists and trees.

   Declaration:

         type-name *identifier;

   This is equivalent to the following union:

         union switch (bool opted) {
         case TRUE:
            type-name element;
         case FALSE:
            void;
         } identifier;

   It is also equivalent to the following variable-length array:

         type-name identifier<1>;

   For example, the following defines a type "stringlist" that encodes
   lists of arbitrary length strings:

         struct *stringlist {
            string item<>;
            stringlist next;
         };

   It could have been equivalently declared as the following union:

         union stringlist switch (bool opted) {
         case TRUE:
            struct {
               string item<>;
               stringlist next;
            } element;
         case FALSE:
            void;
         };

   or as a variable-length array:

         struct stringlist<1> {
            string item<>;
            stringlist next;
         };

   Both of these declarations obscure the intention of the stringlist
   type, so the optional-data declaration is preferred over both of
   them. The optional-data type also has a close correlation to how
   recursive data structures are represented in high-level languages
   such as Pascal or C by use of pointers. In fact, the syntax is the
   same as that of the C language for pointers.
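
   To make the wire layout of the "stringlist" type concrete, here is
   a minimal Python sketch based on the union form above: each node is
   a boolean discriminant (TRUE) followed by the item and the rest of
   the list, and the list ends with a FALSE discriminant. The function
   names are hypothetical and chosen only for this illustration.

```python
import struct


def encode_bool(flag: bool) -> bytes:
    """bool is an enum { FALSE = 0, TRUE = 1 }, encoded as a 32-bit integer."""
    return struct.pack(">I", 1 if flag else 0)


def encode_string(text: str) -> bytes:
    """4-byte length, the ASCII bytes, then 0 to 3 zero pad bytes."""
    data = text.encode("ascii")
    return struct.pack(">I", len(data)) + data + b"\x00" * (-len(data) % 4)


def encode_stringlist(items) -> bytes:
    """Each node: TRUE, item, then the encoding of 'next'; FALSE ends the list."""
    out = b""
    for item in items:
        out += encode_bool(True) + encode_string(item)
    return out + encode_bool(False)


def decode_stringlist(buf: bytes):
    """Inverse of encode_stringlist; returns a Python list of strings."""
    items = []
    offset = 0
    while True:
        (opted,) = struct.unpack_from(">I", buf, offset)
        offset += 4
        if not opted:          # FALSE arm: void, end of list
            break
        (n,) = struct.unpack_from(">I", buf, offset)
        offset += 4
        items.append(buf[offset:offset + n].decode("ascii"))
        offset += n + (-n % 4)  # skip the item bytes and their padding
    return items
```

   Under these assumptions an empty list is just the four zero bytes
   of a FALSE discriminant, which matches the variable-length array
   view: a count of 0 or 1 followed by at most one element.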