rsync.5 (13279B)
- .\" $OpenBSD: rsync.5,v 1.14 2023/04/12 08:32:27 claudio Exp $
- .\"
- .\" Copyright (c) 2019 Kristaps Dzonsons <kristaps@bsd.lv>
- .\"
- .\" Permission to use, copy, modify, and distribute this software for any
- .\" purpose with or without fee is hereby granted, provided that the above
- .\" copyright notice and this permission notice appear in all copies.
- .\"
- .\" THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
- .\" WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
- .\" MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
- .\" ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
- .\" WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
- .\" ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
- .\" OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
- .\"
- .Dd $Mdocdate: April 12 2023 $
- .Dt RSYNC 5
- .Os
- .Sh NAME
- .Nm rsync
- .Nd rsync wire protocol
- .Sh DESCRIPTION
- The
- .Nm
- protocol described in this document relates to the BSD-licensed
- .Xr openrsync 1 ,
- a re-implementation of the GPL-licensed reference utility
- .Xr rsync 1 .
- It is compatible with version 27 of the reference.
- .Pp
- In this document, the
- .Qq client process
- refers to the utility as run on the operator's local computer.
- The
- .Qq server process
- is run either on the local or remote computer, depending upon the
- file locations given on the command line.
- .Pp
- There are a number of options in the protocol that are dictated by command-line
- flags.
- These will be noted as
- .Fl D
- for devices,
- .Fl g
- for group ids,
- .Fl l
- for links,
- .Fl n
- for dry-run,
- .Fl o
- for user ids,
- .Fl r
- for recursion,
- .Fl v
- for verbose, and
- .Fl -delete
- for deletion (before).
- .Ss Data types
- The binary protocol encodes all data in little-endian format.
- Integers are signed 32-bit, shorts are signed 16-bit, bytes are unsigned
- 8-bit.
- A long is variable-length.
- For values less than the maximum integer, the value is transmitted and
- read as a 32-bit integer.
- For values greater, the value is transmitted first as a maximum integer,
- then a 64-bit signed integer.
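- .Pp
- The long encoding above can be sketched as follows.
- This is a minimal illustration, not the code of either implementation:
- the helper names are hypothetical, and sending the boundary value itself
- in the long form is an assumption, since the text only covers values
- less than and greater than the maximum integer.

```python
import io
import struct

INT32_MAX = 0x7FFFFFFF  # the "maximum integer" that doubles as an escape marker


def write_long(out, value):
    """Write value using the variable-length long encoding."""
    if 0 <= value < INT32_MAX:
        # Short form: a single little-endian signed 32-bit integer.
        out.write(struct.pack("<i", value))
    else:
        # Long form: the marker, then a little-endian signed 64-bit integer.
        # Assumption: the boundary value INT32_MAX also takes this path.
        out.write(struct.pack("<i", INT32_MAX))
        out.write(struct.pack("<q", value))


def read_long(inp):
    """Read a value written by write_long()."""
    (first,) = struct.unpack("<i", inp.read(4))
    if first != INT32_MAX:
        return first
    (value,) = struct.unpack("<q", inp.read(8))
    return value


def roundtrip(value):
    """Encode then decode value; return (decoded value, encoded size)."""
    buf = io.BytesIO()
    write_long(buf, value)
    size = len(buf.getvalue())
    buf.seek(0)
    return read_long(buf), size
```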
- .Pp
- There are three types of checksums: long (slow), short (fast), and
- whole-file.
- The fast checksum is a derivative of Adler-32.
- The slow checksum is MD4,
- made over the checksum seed first (serialised in little-endian format),
- then the data.
- The whole-file applies MD4 to the file first, then the checksum seed at
- the end (also serialised in little-endian format).
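- .Pp
- The only difference between the slow and whole-file checksums described
- above is where the seed is placed.
- The sketch below builds the byte strings that would be fed to MD4; the
- function names are hypothetical, and the MD4 digest step itself is left
- out because Python's hashlib does not provide MD4 on every platform.

```python
import struct


def slow_checksum_input(seed, block):
    """Bytes hashed for a per-block (slow) checksum: seed first, then data."""
    return struct.pack("<i", seed) + block


def whole_file_checksum_input(seed, contents):
    """Bytes hashed for the whole-file checksum: data first, then seed."""
    return contents + struct.pack("<i", seed)
```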
- .Ss Multiplexing
- Most
- .Nm
- transmissions are wrapped in a multiplexing envelope protocol.
- It is composed as follows:
- .Pp
- .Bl -enum -compact
- .It
- envelope header (4 bytes)
- .It
- envelope payload (arbitrary length)
- .El
- .Pp
- The first byte of the envelope header consists of a tag.
- If the tag is 7, the payload is normal data.
- Otherwise, the payload is out-of-band server messages.
- If the tag is 1, it is an error on the sender's part and must trigger an
- exit.
- The remaining three bytes of the header hold the payload length, which
- limits message payloads to 24-bit integer size,
- .Li 0x00ffffff .
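- .Pp
- A minimal sketch of packing and unpacking the envelope header follows.
- It assumes the header is handled as one little-endian 32-bit word with
- the tag in the top 8 bits and the payload length in the low 24 bits;
- that layout and the function names are assumptions made for
- illustration, not details given above.

```python
import struct

TAG_DATA = 7              # tag value for normal data, as stated above
MAX_PAYLOAD = 0x00FFFFFF  # largest payload length that fits in 24 bits


def pack_envelope(tag, payload):
    """Prefix payload with a 4-byte envelope header."""
    if len(payload) > MAX_PAYLOAD:
        raise ValueError("payload too large for one envelope")
    # Assumed layout: tag in the top 8 bits, length in the low 24 bits.
    return struct.pack("<I", (tag << 24) | len(payload)) + payload


def unpack_header(header):
    """Return (tag, payload length) from a 4-byte envelope header."""
    (word,) = struct.unpack("<I", header)
    return word >> 24, word & MAX_PAYLOAD
```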
- .Pp
- The only data not using this envelope are the initial handshake between
- client and server.
- .Ss File list
- A central part of the protocol is the file list, which is generated by
- the sender.
- It consists of all files that must be sent to the receiver, either
- explicitly as given or recursively generated.
- .Pp
- The file list itself consists of filenames and attributes (mode, time,
- size, etc.).
- Filenames must be relative to the destination root and not be absolute
- or contain backtracking.
- So if a file is given to the sender as
- .Pa ../../foo/bar ,
- it must be sent as
- .Pa foo/bar .
- .Pp
- The file list should be cleaned of inappropriate files prior to sending.
- For example, if
- .Fl l
- is not specified, symbolic links may be omitted.
- Directory entries without
- .Fl r
- may also be omitted.
- Duplicates may be omitted.
- .Pp
- The receiver
- .Em must not
- assume that the file list is clean.
- It should not omit inappropriate files from the file list (which would
- affect the indexing), but may omit them during processing.
- .Pp
- Prior to being sent from sender to receiver, and upon being received, the
- file list must be lexicographically sorted such as with
- .Xr strcmp 3 .
- Subsequent references to the file are by index in the sorted list.
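- .Pp
- The cleaning and sorting rules above can be sketched as follows.
- The helper names are hypothetical, and rejecting backtracking that
- appears in the middle of a path is an assumption of this sketch.
- Python compares bytes objects by unsigned byte value, which gives the
- same ordering as
- .Xr strcmp 3 .

```python
def clean_name(path):
    """Make a file name relative: no leading '/', no leading '..'."""
    parts = [p for p in path.split(b"/") if p not in (b"", b".")]
    while parts and parts[0] == b"..":
        parts.pop(0)
    if b".." in parts:
        # Assumption: backtracking inside the path is refused outright.
        raise ValueError("backtracking inside path")
    return b"/".join(parts)


def build_file_list(paths):
    """Return cleaned names in byte-wise sorted order.

    The position of a name in the returned list is the index by which
    the file would be referenced afterwards.
    """
    return sorted(clean_name(p) for p in paths)
```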
- .Ss Client process
- The client can operate in sender or receiver mode depending upon the
- command-line source and destination.
- .Pp
- If the destination directory (sink) is remote, the client is in sender
- mode: the client will push its data to the server.
- If the source file is remote, it is in receiver mode: the server pushes
- to the client.
- If neither is remote, the client operates in sender mode.
- These are all mutually exclusive.
- .Pp
- When the client starts, regardless of its mode, it first handshakes
- with the server.
- This exchange is
- .Em not
- multiplexed.
- .Pp
- .Bl -enum -compact
- .It
- send local version (integer)
- .It
- receive remote version (integer)
- .It
- receive random seed (integer)
- .El
- .Pp
- Following this, the client multiplexes when reading from the server.
- Transmissions sent from client to server are not multiplexed.
- It then enters the
- .Sx Update exchange
- protocol.
- .Ss Server process
- The server can operate in sender or receiver mode depending upon how the
- client starts the server.
- This may be directly from the parent process (when invoked for local
- files) or indirectly via a remote shell.
- .Pp
- When in sender mode, the server pushes data to the client.
- (This is equivalent to receiver mode for the client.)
- In receiver mode, the opposite is true.
- .Pp
- When the server starts, regardless of its mode, it first handshakes
- with the client.
- This exchange is
- .Em not
- multiplexed.
- .Pp
- .Bl -enum -compact
- .It
- send local version (integer)
- .It
- receive remote version (integer)
- .It
- send random seed (integer)
- .El
- .Pp
- Following this, the server multiplexes when writing to the client.
- (Transmissions received from the client are not multiplexed.)
- It then enters the
- .Sx Update exchange
- protocol.
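- .Pp
- The client and server handshakes mirror each other, which the following
- sketch shows by running both ends over a local socket pair.
- All names, the use of a thread, and the seed value are assumptions made
- for illustration; only the order and types of the three integers come
- from the lists above.

```python
import socket
import struct
import threading

PROTOCOL_VERSION = 27  # version this document is compatible with


def recv_exact(sock, n):
    """Read exactly n bytes or raise EOFError."""
    data = b""
    while len(data) < n:
        chunk = sock.recv(n - len(data))
        if not chunk:
            raise EOFError("peer closed during handshake")
        data += chunk
    return data


def send_int(sock, value):
    sock.sendall(struct.pack("<i", value))


def recv_int(sock):
    return struct.unpack("<i", recv_exact(sock, 4))[0]


def client_handshake(sock):
    """Send our version, then read the remote version and the seed."""
    send_int(sock, PROTOCOL_VERSION)
    remote_version = recv_int(sock)
    seed = recv_int(sock)
    return remote_version, seed


def server_handshake(sock, seed):
    """Send our version, read the remote version, then send the seed."""
    send_int(sock, PROTOCOL_VERSION)
    remote_version = recv_int(sock)
    send_int(sock, seed)
    return remote_version


def demo(seed=12345):
    """Run both handshakes against each other; neither side is multiplexed."""
    client_sock, server_sock = socket.socketpair()
    result = {}
    worker = threading.Thread(
        target=lambda: result.update(server=server_handshake(server_sock, seed))
    )
    worker.start()
    result["client"] = client_handshake(client_sock)
    worker.join()
    client_sock.close()
    server_sock.close()
    return result
```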
- .Ss Update exchange
- When the client or server is in sender mode, it begins by conditionally
- sending the exclusion list.
- At this time, this is always empty.
- .Pp
- .Bl -enum -compact
- .It
- if
- .Fl -delete
- and the client, exclusion list zero (integer)
- .El
- .Pp
- It then sends the
- .Sx File list .
- Prior to being sent, the file list must be lexicographically sorted.
- .Pp
- .Bl -enum -compact
- .It
- status byte (byte)
- .It
- inherited filename length (optional, byte)
- .It
- filename length (integer or byte)
- .It
- file (byte array)
- .It
- file length (long)
- .It
- file modification time (optional, time_t, integer)
- .It
- file mode (optional, mode_t, integer)
- .It
- if
- .Fl o ,
- the user id (integer)
- .It
- if
- .Fl g ,
- the group id (integer)
- .It
- if a special file and
- .Fl D ,
- the device
- .Dq rdev
- type (integer)
- .It
- if a symbolic link and
- .Fl l ,
- the link target's length (integer)
- .It
- if a symbolic link and
- .Fl l ,
- the link target (byte array)
- .El
- .Pp
- The status byte may consist of the following bits and determines which
- of the optional fields are transmitted.
- .Pp
- .Bl -tag -compact -width Ds
- .It 0x01
- A top-level directory.
- (Only applies to directory files.)
- If specified, the matching local directory is for deletions.
- .It 0x02
- Do not send the file mode: it is a repeat of the last file's mode.
- .It 0x08
- Like
- .Li 0x02 ,
- but for the user id.
- .It 0x10
- Like
- .Li 0x02 ,
- but for the group id.
- .It 0x20
- Inherit some of the prior file name.
- Enables the inherited filename length transmission.
- .It 0x40
- Use full integer length for file name.
- Otherwise, use only the byte length.
- .It 0x80
- Do not send the file modification time: it is a repeat of the last
- file's.
- .El
- .Pp
- If the status byte is zero, the file-list has terminated.
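- .Pp
- As an illustration of how the status bits drive parsing, the sketch
- below reads one file name and reports which optional fields follow.
- The constant and function names are hypothetical; the bit values and
- field order are taken from the lists above.

```python
import io
import struct

FLAG_TOP_DIR = 0x01    # top-level directory
FLAG_SAME_MODE = 0x02  # mode repeats the last file's mode
FLAG_SAME_UID = 0x08   # user id repeats
FLAG_SAME_GID = 0x10   # group id repeats
FLAG_SAME_NAME = 0x20  # inherit a prefix of the prior file name
FLAG_LONG_NAME = 0x40  # file name length is an integer, not a byte
FLAG_SAME_TIME = 0x80  # modification time repeats


def read_name(inp, flags, previous):
    """Read one file name, reusing a prefix of the previous name if asked."""
    inherit = 0
    if flags & FLAG_SAME_NAME:
        inherit = inp.read(1)[0]
    if flags & FLAG_LONG_NAME:
        (length,) = struct.unpack("<i", inp.read(4))
    else:
        length = inp.read(1)[0]
    return previous[:inherit] + inp.read(length)


def optional_fields(flags):
    """Report whether the modification time and mode are transmitted."""
    return {
        "mtime": not flags & FLAG_SAME_TIME,
        "mode": not flags & FLAG_SAME_MODE,
    }
```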
- .Pp
- If
- .Fl o
- has been specified, the sender sends the list of all users encountered
- in the file list.
- Identifier zero
- .Pq Qq root
- is never transmitted, as it would prematurely end the list.
- This list may be incomplete or empty: the server is not obligated to
- properly fill it in with all relevant users.
- .Pp
- .Bl -enum -compact
- .It
- user identifier or zero to indicate end of set (integer)
- .It
- non-zero length of user name (byte)
- .It
- user name (prior length)
- .El
- .Pp
- The same sequence is then sent for groups if
- .Fl g
- has been specified.
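- .Pp
- Reading the user or group list can be sketched as below.
- The function name and the choice of a dictionary as the result are
- assumptions of this example.

```python
import io
import struct


def read_id_list(inp):
    """Read (identifier, name) pairs until a zero identifier ends the set."""
    names = {}
    while True:
        (ident,) = struct.unpack("<i", inp.read(4))
        if ident == 0:
            return names
        length = inp.read(1)[0]  # non-zero length of the name
        names[ident] = inp.read(length)
```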
- .Pp
- The sender then sends any IO error values, which for
- .Xr openrsync 1
- is always zero.
- .Pp
- .Bl -enum -compact
- .It
- constant zero (integer)
- .El
- .Pp
- The server sender then reads the exclusion list, which is always zero.
- .Pp
- .Bl -enum -compact
- .It
- if server, constant zero (integer)
- .El
- .Pp
- Following that, the sender receives data regarding the receiver's copy
- of the file list contents.
- This data is not ordered in any way.
- Each of these requests starts as follows:
- .Pp
- .Bl -enum -compact
- .It
- file index or -1 to signal a change of phase (integer)
- .El
- .Pp
- The exchange starts in phase 1, then proceeds to phase 2, and phase 3
- signals an end of transmission (no subsequent blocks).
- If a phase change occurs, the sender must write back the -1 constant
- integer value and increment its phase state.
- .Pp
- Blocks are read as follows:
- .Pp
- .Bl -enum -compact
- .It
- block index (integer)
- .El
- .Pp
- In dry-run
- .Pq Fl n
- mode, the sender may immediately write back the index (integer) to skip
- the following.
- .Pp
- .Bl -enum -compact
- .It
- number of blocks (integer)
- .It
- block length in the file (integer)
- .It
- long checksum length (integer)
- .It
- terminal (remainder) block length (integer)
- .El
- .Pp
- And for each block:
- .Pp
- .Bl -enum -compact
- .It
- short checksum (integer)
- .It
- long checksum (bytes of checksum length)
- .El
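- .Pp
- A minimal sketch of reading the block metadata and per-block checksums
- listed above follows.
- The function name and the shape of the returned value are hypothetical;
- the field order and types follow the lists.

```python
import io
import struct


def read_block_set(inp):
    """Read the block header, then one checksum pair per block."""
    count, block_len, csum_len, remainder = struct.unpack("<4i", inp.read(16))
    blocks = []
    for _ in range(count):
        (short,) = struct.unpack("<i", inp.read(4))  # short (fast) checksum
        blocks.append((short, inp.read(csum_len)))   # long (slow) checksum
    return {"block_len": block_len, "remainder": remainder, "blocks": blocks}
```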
- .Pp
- The sender then compares the two files, block by block, and updates the
- receiver with mismatches as follows.
- .Pp
- .Bl -enum -compact
- .It
- file index (integer)
- .It
- number of blocks (integer)
- .It
- block length (integer)
- .It
- long checksum length (integer)
- .It
- remainder block length (integer)
- .El
- .Pp
- Then for each block:
- .Pp
- .Bl -enum -compact
- .It
- data chunk size (integer)
- .It
- data chunk (bytes)
- .It
- block index subsequent to chunk or zero for finished (integer)
- .El
- .Pp
- Following this sequence, the sender sends the following:
- .Pp
- .Bl -enum -compact
- .It
- whole-file long checksum (16 bytes)
- .El
- .Pp
- The sender then either handles the next queued file or, if the receiver
- has written a phase change, the phase change step.
- .Pp
- If the sender is the server and
- .Fl v
- has been specified, the sender must send statistics.
- .Pp
- .Bl -enum -compact
- .It
- total bytes read (long)
- .It
- total bytes written (long)
- .It
- total size of files (long)
- .El
- .Pp
- Finally, the sender must read a final constant-value integer.
- .Pp
- .Bl -enum -compact
- .It
- end-of-sequence -1 value (integer)
- .El
- .Pp
- If in receiver mode, the inverse of the above (write instead of read, read
- instead of write) is performed.
- .Pp
- The receiver begins by conditionally writing, then reading, the
- exclusion list count, which is always zero.
- .Pp
- .Bl -enum -compact
- .It
- if client, send zero (integer)
- .It
- if receiver and
- .Fl -delete ,
- read zero (integer)
- .El
- .Pp
- The receiver then proceeds with reading the
- .Sx File list
- as already
- defined.
- Following the list, the receiver reads the IO error, which must be zero.
- .Pp
- .Bl -enum -compact
- .It
- constant zero (integer)
- .El
- .Pp
- The receiver must then sort the file names lexicographically.
- .Pp
- If there are no files in the file list at this time, the receiver must
- exit prior to sending per-file data.
- It then proceeds with the file blocks.
- .Pp
- For file blocks, the receiver must look at each file that is not up to
- date and send it to the server.
- A file is up to date if it has the same file size and timestamp.
- Symbolic links and directory entries are never sent to the server.
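- .Pp
- The up-to-date test can be sketched as follows.
- The function name is hypothetical, and treating a missing local file as
- out of date and truncating the timestamp to whole seconds are
- assumptions of this example.

```python
import os


def is_up_to_date(path, size, mtime):
    """Return True if the local file matches the listed size and time."""
    try:
        st = os.stat(path)
    except FileNotFoundError:
        # Assumption: a file that does not exist locally needs a transfer.
        return False
    return st.st_size == size and int(st.st_mtime) == mtime
```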
- .Pp
- After the second phase has completed and prior to writing the
- end-of-data signal, the client receiver reads statistics.
- This is only performed with
- .Pq Fl v .
- .Pp
- .Bl -enum -compact
- .It
- total bytes read (long)
- .It
- total bytes written (long)
- .It
- total size of files (long)
- .El
- .Pp
- Finally, the receiver must send the constant end-of-sequence marker.
- .Pp
- .Bl -enum -compact
- .It
- end-of-sequence -1 value (integer)
- .El
- .Ss Sender and receiver asynchrony
- The sender and receiver need not work in lockstep.
- The receiver may send file update requests as quickly as it parses them,
- and respond to the sender's update notices on demand.
- Similarly, the sender may read as many update requests as it can, and
- service them in any order it wishes.
- .Pp
- The sender and receiver synchronise state only at the end of phase.
- .Pp
- The reference
- .Xr rsync 1
- takes advantage of this with a two-process receiver, one for sending
- update requests (the generator) and another for receiving.
- .Xr openrsync 1
- uses an event-loop model instead.
- .\" .Sh CONTEXT
- .\" For section 9 functions only.
- .\" .Sh RETURN VALUES
- .\" For sections 2, 3, and 9 function return values only.
- .\" .Sh ENVIRONMENT
- .\" For sections 1, 6, 7, and 8 only.
- .\" .Sh FILES
- .\" .Sh EXIT STATUS
- .\" For sections 1, 6, and 8 only.
- .\" .Sh EXAMPLES
- .\" .Sh DIAGNOSTICS
- .\" For sections 1, 4, 6, 7, 8, and 9 printf/stderr messages only.
- .\" .Sh ERRORS
- .\" For sections 2, 3, 4, and 9 errno settings only.
- .Sh SEE ALSO
- .Xr openrsync 1 ,
- .Xr rsync 1 ,
- .Xr rsyncd 5
- .\" .Sh STANDARDS
- .\" .Sh HISTORY
- .\" .Sh AUTHORS
- .\" .Sh CAVEATS
- .Sh BUGS
- Time values are sent as 32-bit integers.
- .Pp
- When in server mode
- .Em and
- when communicating with a client with a newer protocol (>27), the phase
- change integer (-1) acknowledgement must be sent twice by the sender.
- This is probably a bug in the reference implementation.