add docs for replica/server protocol

This commit is contained in:
Dustin J. Mitchell 2020-11-26 14:10:46 -05:00
parent 2dae271851
commit 3fb2327a5b
4 changed files with 228 additions and 127 deletions

View file

@ -7,5 +7,7 @@
- [Replica Storage](./storage.md)
- [Task Database](./taskdb.md)
- [Tasks](./tasks.md)
- [Synchronization](./sync.md)
- [Synchronization and the Sync Server](./sync.md)
- [Synchronization Model](./sync-model.md)
* [Server-Replica Protocol](./sync-protocol.md)
- [Planned Functionality](./plans.md)

128
docs/src/sync-model.md Normal file
View file

@ -0,0 +1,128 @@
# Synchronization Model
The [task database](./taskdb.md) also implements synchronization.
Synchronization occurs between disconnected replicas, mediated by a server.
The replicas never communicate directly with one another.
The server does not have access to the task data; it sees only opaque blobs of data with a small amount of metadata.
The synchronization process is a critical part of the task database's functionality, and it cannot function efficiently without occasional synchronization operations
## Operational Transforms
Synchronization is based on [operational transformation](https://en.wikipedia.org/wiki/Operational_transformation).
This section will assume some familiarity with the concept.
## State and Operations
At a given time, the set of tasks in a replica's storage is the essential "state" of that replica.
All modifications to that state occur via operations, as defined in [Replica Storage](./storage.md).
We can draw a network, or graph, with the nodes representing states and the edges representing operations.
For example:
```text
o -- State: {abc-d123: 'get groceries', priority L}
|
| -- Operation: set abc-d123 priority to H
|
o -- State: {abc-d123: 'get groceries', priority H}
```
For those familiar with distributed version control systems, a state is analogous to a revision, while an operation is analogous to a commit.
Fundamentally, synchronization involves all replicas agreeing on a single, linear sequence of operations and the state that those operations create.
Since the replicas are not connected, each may have additional operations that have been applied locally, but which have not yet been agreed on.
The synchronization process uses operational transformation to "linearize" those operations.
This process is analogous (vaguely) to rebasing a sequence of Git commits.
### Versions
Occasionally, database states are given a name (that takes the form of a UUID).
The system as a whole (all replicas) constructs a branch-free sequence of versions and the operations that separate each version from the next.
The version with the nil UUID is implicitly the empty database.
The server stores the operations to change a state from a "parent" version to a "child" version, and provides that information as needed to replicas.
Replicas use this information to update their local task databases, and to generate new versions to send to the server.
Replicas generate a new version to transmit local changes to the server.
The changes are represented as a sequence of operations with the state resulting from the final operation corresponding to the version.
In order to keep the versions in a single sequence, the server will only accept a proposed version from a replica if its parent version matches the latest version on the server.
In the non-conflict case (such as with a single replica), then, a replica's synchronization process involves gathering up the operations it has accumulated since its last synchronization; bundling those operations into a version; and sending that version to the server.
### Replica Invariant
The replica's [storage](./storage.md) contains the current state in `tasks`, the as-yet un-synchronized operations in `operations`, and the last version at which synchronization occurred in `base_version`.
The replica's un-synchronized operations are already reflected in its local `tasks`, so the following invariant holds:
> Applying `operations` to the set of tasks at `base_version` gives a set of tasks identical
> to `tasks`.
### Transformation
When the latest version on the server contains operations that are not present in the replica, then the states have diverged.
For example:
```text
o -- version N
w|\a
o o
x| \b
o o
y| \c
o o -- replica's local state
z|
o -- version N+1
```
(diagram notation: `o` designates a state, lower-case letters designate operations, and versions are presented as if they were numbered sequentially)
In this situation, the replica must "rebase" the local operations onto the latest version from the server and try again.
This process is performed using operational transformation (OT).
The result of this transformation is a sequence of operations based on the latest version, and a sequence of operations the replica can apply to its local task database to reach the same state
Continuing the example above, the resulting operations are shown with `'`:
```text
o -- version N
w|\a
o o
x| \b
o o
y| \c
o o -- replica's intermediate local state
z| |w'
o-N+1 o
a'\ |x'
o o
b'\ |y'
o o
c'\|z'
o -- version N+2
```
The replica applies w' through z' locally, and sends a' through c' to the server as the operations to generate version N+2.
Either path through this graph, a-b-c-w'-x'-y'-z' or a'-b'-c'-w-x-y-z, must generate *precisely* the same final state at version N+2.
Careful selection of the operations and the transformation function ensure this.
See the comments in the source code for the details of how this transformation process is implemented.
## Synchronization Process
To perform a synchronization, the replica first requests the child version of `base_version` from the server (GetChildVersion).
It applies that version to its local `tasks`, rebases its local `operations` as described above, and updates `base_version`.
The replica repeats this process until the server indicates no additional child versions exist.
If there are no un-synchronized local operations, the process is complete.
Otherwise, the replica creates a new version containing its local operations, giving its `base_version` as the parent version, and transmits that to the server (AddVersion).
In most cases, this will succeed, but if another replica has created a new version in the interim, then the new version will conflict with that other replica's new version and the server will respond with the new expected parent version.
In this case, the process repeats.
If the server indicates a conflict twice with the same expected base version, that is an indication that the replica has diverged (something serious has gone wrong).
## Servers
A replica depends on periodic synchronization for performant operation.
Without synchronization, its list of pending operations would grow indefinitely, and tasks could never be expired.
So all replicas, even "singleton" replicas which do not replicate task data with any other replica, must synchronize periodically.
TaskChampion provides a `LocalServer` for this purpose.
It implements the `get_child_version` and `add_version` operations as described, storing data on-disk locally, all within the `task` binary.

92
docs/src/sync-protocol.md Normal file
View file

@ -0,0 +1,92 @@
# Server-Replica Protocol
The server-replica protocol is defined abstractly in terms of request/response transactions from the replica to the server.
This is made concrete in an HTTP representation.
The protocol builds on the model presented in the previous chapter, and in particular on the synchronization process.
## Clients
From the server's perspective, replicas are indistinguishable, so this protocol uses the term "client" to refer generically to all replicas replicating a single task history.
## Server
For each client, the server is responsible for storing the task history, in the form of a branch-free sequence of versions.
For each client, it stores a set of versions as well as the latest version ID, defaulting to the nil UUID.
Each version has a version ID, a parent version ID, and a history segment (opaque data containing the operations for that version).
The server should maintain the following invariants:
1. Given a client c, c.latestVersion is nil or exists in the set of versions.
1. Given versions v1 and v2 for a client, with v1.versionId != v2.versionId and v1.parentVersionId != nil, v1.parentVersionId != v2.parentVersionId.
In other words, versions do not branch.
Note that versions form a linked list beginning with the version stored in he client.
This linked list need not continue back to a version with v.parentVersionId = nil.
It may end at any point when v.parentVersionId is not found in the set of Versions.
This observation allows the server to discard older versions.
## Transactions
### AddVersion
The AddVersion transaction requests that the server add a new version to the client's task history.
The request contains the following;
* parent version ID
* history segment
The server determines whether the new version is acceptable, atomically with respect to other requests for the same client.
If it has no versions for the client, it accepts the version.
If it already has one or more versions for the client, then it accepts the version only if the given parent version ID matches its stored latest parent ID.
If the version is accepted, the server generates a new version ID for it.
The version is added to the set of versions for the client, the client's latest version ID is set to the new version ID.
The new version ID is returned in the response to the client.
If the version is not accepted, the server makes no changes, but responds to the client with a conflict indication containing the latest version ID.
The client may then "rebase" its operations and try again.
Note that if a client receives two conflict responses with the same parent version ID, it is an indication that the client's version history has diverged from that on the server.
### GetChildVersion
The GetChildVersion transaction is a read-only request for a version.
The request consists of a parent version ID.
The server searches its set of versions for a version with the given parent ID.
If found, it returns the version's
* version ID,
* parent version ID (matching that in the request), and
* history segment.
If not found, the server returns a negative response.
## HTTP Representation
The transactions above are realized for an HTTP server at `<origin>` using the HTTP requests and responses described here.
The `origin` *should* be an HTTPS endpoint on general principle, but nothing in the functonality or security of the protocol depends on connection encryption.
The replica identifies itself to the server using a `clientId` in the form of a UUID.
### AddVersion
The request is a `POST` to `<origin>/client/<clientId>/add-version/<parentVersionId>`.
The request body contains the history segment, optionally encoded using any encoding supported by actix-web.
The content-type must be `application/vnd.taskchampion.history-segment`.
The success response is a 200 OK with an empty body.
The new version ID appears in the `X-Version-Id` header.
On conflict, the response is a 409 CONFLICT with an empty body.
The expected parent version ID appears in the `X-Parent-Version-Id` header.
Other error responses (4xx or 5xx) may be returned and should be treated appropriately to their meanings in the HTTP specification.
### GetChildVersion
The request is a `GET` to `<origin>/client/<clientId>/get-child-version/<parentVersionId>`.
The response is 404 NOT FOUND if no such version exists.
Otherwise, the response is a 200 OK.
The version's history segment is returned in the response body, with content-type `application/vnd.taskchampion.history-segment`.
The version ID appears in the `X-Version-Id` header.
The response body may be encoded, in accordance with any `Accept-Encoding` header in the request.

View file

@ -1,128 +1,7 @@
# Synchronization
# Synchronization and the Sync Server
The [task database](./taskdb.md) also implements synchronization.
Synchronization occurs between disconnected replicas, mediated by a server.
The replicas never communicate directly with one another.
The server does not have access to the task data; it sees only opaque blobs of data with a small amount of metadata.
This section covers *synchronization* of *replicas* containing the same set of tasks.
A replica is can perform all operations locally without connecting to a sync server, then share those operations with other replicas when it connects.
Sync is a critical feature of TaskChampion, allowing users to consult and update the same task list on multiple devices, without requiring constant connection.
The synchronization process is a critical part of the task database's functionality, and it cannot function efficiently without occasional synchronization operations
## Operational Transformations
Synchronization is based on [operational transformation](https://en.wikipedia.org/wiki/Operational_transformation).
This section will assume some familiarity with the concept.
## State and Operations
At a given time, the set of tasks in a replica's storage is the essential "state" of that replica.
All modifications to that state occur via operations, as defined in [Replica Storage](./storage.md).
We can draw a network, or graph, with the nodes representing states and the edges representing operations.
For example:
```text
o -- State: {abc-d123: 'get groceries', priority L}
|
| -- Operation: set abc-d123 priority to H
|
o -- State: {abc-d123: 'get groceries', priority H}
```
For those familiar with distributed version control systems, a state is analogous to a revision, while an operation is analogous to a commit.
Fundamentally, synchronization involves all replicas agreeing on a single, linear sequence of operations and the state that those operations create.
Since the replicas are not connected, each may have additional operations that have been applied locally, but which have not yet been agreed on.
The synchronization process uses operational transformation to "linearize" those operations.
This process is analogous (vaguely) to rebasing a sequence of Git commits.
### Versions
Occasionally, database states are given a name (that takes the form of a UUID).
The system as a whole (all replicas) constructs a branch-free sequence of versions and the operations that separate each version from the next.
The version with the nil UUID is implicitly the empty database.
The server stores the operations to change a state from a "parent" version to a "child" version, and provides that information as needed to replicas.
Replicas use this information to update their local task databases, and to generate new versions to send to the server.
Replicas generate a new version to transmit local changes to the server.
The changes are represented as a sequence of operations with the state resulting from the final operation corresponding to the version.
In order to keep the versions in a single sequence, the server will only accept a proposed version from a replica if its parent version matches the latest version on the server.
In the non-conflict case (such as with a single replica), then, a replica's synchronization process involves gathering up the operations it has accumulated since its last synchronization; bundling those operations into a version; and sending that version to the server.
### Replica Invariant
The replica's [storage](./storage.md) contains the current state in `tasks`, the as-yet un-synchronized operations in `operations`, and the last version at which synchronization occurred in `base_version`.
The replica's un-synchronized operations are already reflected in its local `tasks`, so the following invariant holds:
> Applying `operations` to the set of tasks at `base_version` gives a set of tasks identical
> to `tasks`.
### Transformation
When the latest version on the server contains operations that are not present in the replica, then the states have diverged.
For example:
```text
o -- version N
w|\a
o o
x| \b
o o
y| \c
o o -- replica's local state
z|
o -- version N+1
```
(diagram notation: `o` designates a state, lower-case letters designate operations, and versions are presented as if they were numbered sequentially)
In this situation, the replica must "rebase" the local operations onto the latest version from the server and try again.
This process is performed using operational transformation (OT).
The result of this transformation is a sequence of operations based on the latest version, and a sequence of operations the replica can apply to its local task database to reach the same state
Continuing the example above, the resulting operations are shown with `'`:
```text
o -- version N
w|\a
o o
x| \b
o o
y| \c
o o -- replica's intermediate local state
z| |w'
o-N+1 o
a'\ |x'
o o
b'\ |y'
o o
c'\|z'
o -- version N+2
```
The replica applies w' through z' locally, and sends a' through c' to the server as the operations to generate version N+2.
Either path through this graph, a-b-c-w'-x'-y'-z' or a'-b'-c'-w-x-y-z, must generate *precisely* the same final state at version N+2.
Careful selection of the operations and the transformation function ensure this.
See the comments in the source code for the details of how this transformation process is implemented.
## Synchronization Process
To perform a synchronization, the replica first requests the child version of `base_version` from the server (`get_child_version`).
It applies that version to its local `tasks`, rebases its local `operations` as described above, and updates `base_version`.
The replica repeats this process until the server indicates no additional child versions exist.
If there are no un-synchronized local operations, the process is complete.
Otherwise, the replica creates a new version containing its local operations, giving its `base_version` as the parent version, and transmits that to the server (`add_version`).
In most cases, this will succeed, but if another replica has created a new version in the interim, then the new version will conflict with that other replica's new version and the server will respond with the new expected parent version.
In this case, the process repeats.
If the server indicates a conflict twice with the same expected base version, that is an indication that the replica has diverged (something serious has gone wrong).
## Servers
A replica depends on periodic synchronization for performant operation.
Without synchronization, its list of pending operations would grow indefinitely, and tasks could never be expired.
So all replicas, even "singleton" replicas which do not replicate task data with any other replica, must synchronize periodically.
TaskChampion provides a `LocalServer` for this purpose.
It implements the `get_child_version` and `add_version` operations as described, storing data on-disk locally, all within the `task` binary.
This is a complex topic, and the section is broken into several chapters, beginning at the lower levels of the implementation and working up.