mirror of
https://github.com/GothenburgBitFactory/taskwarrior.git
synced 2025-07-07 20:06:36 +02:00
add docs for replica/server protocol
parent
2dae271851
commit
3fb2327a5b
4 changed files with 228 additions and 127 deletions
|
@ -7,5 +7,7 @@
|
|||
- [Replica Storage](./storage.md)
|
||||
- [Task Database](./taskdb.md)
|
||||
- [Tasks](./tasks.md)
|
||||
- [Synchronization](./sync.md)
|
||||
- [Synchronization and the Sync Server](./sync.md)
|
||||
- [Synchronization Model](./sync-model.md)
|
||||
* [Server-Replica Protocol](./sync-protocol.md)
|
||||
- [Planned Functionality](./plans.md)
|
||||
|
|
128
docs/src/sync-model.md
Normal file
|
@ -0,0 +1,128 @@
|
|||
# Synchronization Model
|
||||
|
||||
The [task database](./taskdb.md) also implements synchronization.
|
||||
Synchronization occurs between disconnected replicas, mediated by a server.
|
||||
The replicas never communicate directly with one another.
|
||||
The server does not have access to the task data; it sees only opaque blobs of data with a small amount of metadata.
|
||||
|
||||
Synchronization is a critical part of the task database's functionality: the database cannot function efficiently without occasional synchronization operations.
|
||||
|
||||
## Operational Transforms
|
||||
|
||||
Synchronization is based on [operational transformation](https://en.wikipedia.org/wiki/Operational_transformation).
|
||||
This section will assume some familiarity with the concept.
|
||||
|
||||
## State and Operations
|
||||
|
||||
At a given time, the set of tasks in a replica's storage is the essential "state" of that replica.
|
||||
All modifications to that state occur via operations, as defined in [Replica Storage](./storage.md).
|
||||
We can draw a network, or graph, with the nodes representing states and the edges representing operations.
|
||||
For example:
|
||||
|
||||
```text
  o -- State: {abc-d123: 'get groceries', priority L}
  |
  | -- Operation: set abc-d123 priority to H
  |
  o -- State: {abc-d123: 'get groceries', priority H}
```
|
||||
|
||||
For those familiar with distributed version control systems, a state is analogous to a revision, while an operation is analogous to a commit.
|
||||
|
||||
Fundamentally, synchronization involves all replicas agreeing on a single, linear sequence of operations and the state that those operations create.
|
||||
Since the replicas are not connected, each may have additional operations that have been applied locally, but which have not yet been agreed on.
|
||||
The synchronization process uses operational transformation to "linearize" those operations.
|
||||
This process is analogous (vaguely) to rebasing a sequence of Git commits.
|
||||
|
||||
### Versions
|
||||
|
||||
Occasionally, database states are given a name (that takes the form of a UUID).
|
||||
The system as a whole (all replicas) constructs a branch-free sequence of versions and the operations that separate each version from the next.
|
||||
The version with the nil UUID is implicitly the empty database.
|
||||
|
||||
The server stores the operations to change a state from a "parent" version to a "child" version, and provides that information as needed to replicas.
|
||||
Replicas use this information to update their local task databases, and to generate new versions to send to the server.
|
||||
|
||||
Replicas generate a new version to transmit local changes to the server.
|
||||
The changes are represented as a sequence of operations with the state resulting from the final operation corresponding to the version.
|
||||
In order to keep the versions in a single sequence, the server will only accept a proposed version from a replica if its parent version matches the latest version on the server.
|
||||
|
||||
In the non-conflict case (such as with a single replica), then, a replica's synchronization process involves gathering up the operations it has accumulated since its last synchronization; bundling those operations into a version; and sending that version to the server.
|
||||
|
||||
### Replica Invariant
|
||||
|
||||
The replica's [storage](./storage.md) contains the current state in `tasks`, the as-yet un-synchronized operations in `operations`, and the last version at which synchronization occurred in `base_version`.
|
||||
|
||||
The replica's un-synchronized operations are already reflected in its local `tasks`, so the following invariant holds:
|
||||
|
||||
> Applying `operations` to the set of tasks at `base_version` gives a set of tasks identical
|
||||
> to `tasks`.
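
The invariant can be checked mechanically. The following is a minimal Python sketch, not TaskChampion's implementation: it models a set of tasks as a dict and assumes a single, hypothetical operation shape `(uuid, prop, value)`; the real operation types are defined in [Replica Storage](./storage.md). The state at `base_version` is passed in only for illustration, since the replica's storage holds `tasks`, `operations`, and `base_version` but not that earlier state.

```python
from copy import deepcopy

# Hypothetical, simplified operation: set one property of one task.
def apply_op(tasks, op):
    uuid, prop, value = op
    tasks.setdefault(uuid, {})[prop] = value

def invariant_holds(tasks_at_base_version, operations, tasks):
    """Replay `operations` on the state at `base_version` and compare to `tasks`."""
    replayed = deepcopy(tasks_at_base_version)
    for op in operations:
        apply_op(replayed, op)
    return replayed == tasks

# Example values, loosely following the 'get groceries' diagram above.
base = {"abc-d123": {"description": "get groceries", "priority": "L"}}
operations = [("abc-d123", "priority", "H")]
tasks = {"abc-d123": {"description": "get groceries", "priority": "H"}}
```

Here `invariant_holds(base, operations, tasks)` is true, while dropping the pending operation (`invariant_holds(base, [], tasks)`) is not, because `tasks` already reflects the un-synchronized change.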
|
||||
|
||||
### Transformation
|
||||
|
||||
When the latest version on the server contains operations that are not present in the replica, then the states have diverged.
|
||||
For example:
|
||||
|
||||
```text
  o -- version N
 w|\a
  o o
 x|  \b
  o   o
 y|    \c
  o     o -- replica's local state
 z|
  o -- version N+1
```
|
||||
|
||||
(diagram notation: `o` designates a state, lower-case letters designate operations, and versions are presented as if they were numbered sequentially)
|
||||
|
||||
In this situation, the replica must "rebase" the local operations onto the latest version from the server and try again.
|
||||
This process is performed using operational transformation (OT).
|
||||
The result of this transformation is a sequence of operations based on the latest version, and a sequence of operations the replica can apply to its local task database to reach the same state.
|
||||
Continuing the example above, the resulting operations are shown with `'`:
|
||||
|
||||
```text
  o -- version N
 w|\a
  o o
 x|  \b
  o   o
 y|    \c
  o     o -- replica's intermediate local state
 z|     |w'
  o-N+1 o
 a'\    |x'
    o   o
   b'\  |y'
      o o
     c'\|z'
        o -- version N+2
```
|
||||
|
||||
The replica applies w' through z' locally, and sends a' through c' to the server as the operations to generate version N+2.
|
||||
Either path through this graph, a-b-c-w'-x'-y'-z' or a'-b'-c'-w-x-y-z, must generate *precisely* the same final state at version N+2.
|
||||
Careful selection of the operations and the transformation function ensures this.
|
||||
|
||||
See the comments in the source code for the details of how this transformation process is implemented.
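
As an illustration of the convergence requirement, here is a minimal Python sketch of a transform function and a sequence "rebase". It is not the implementation in the source code: it models only a hypothetical `Update` operation, and the conflict rule (the later timestamp wins) is an assumption made for the example. In the diagram's terms, `server_ops` plays the role of w through z, `local_ops` of a through c, `to_apply` of the primed server operations, and `to_send` of the primed local operations.

```python
from collections import namedtuple

# Hypothetical, simplified operation: only property updates are modeled.
Update = namedtuple("Update", "uuid prop value ts")

def apply_op(state, op):
    state.setdefault(op.uuid, {})[op.prop] = op.value

def transform(server_op, local_op):
    """Return (server_op', local_op'): server_op' is applied by the replica
    after local_op, and local_op' is sent to the server after server_op."""
    if (server_op.uuid, server_op.prop) != (local_op.uuid, local_op.prop):
        return server_op, local_op   # independent: both survive unchanged
    if server_op.value == local_op.value:
        return None, None            # same effect: nothing left to do
    if server_op.ts < local_op.ts:
        return None, local_op        # assumed rule: later timestamp wins
    return server_op, None

def rebase(server_ops, local_ops):
    """Transform whole sequences; returns (ops to apply locally, ops to send)."""
    to_apply = []
    for server_op in server_ops:
        pending, rebased = server_op, []
        for local_op in local_ops:
            if pending is None:
                rebased.append(local_op)
                continue
            pending, new_local = transform(pending, local_op)
            if new_local is not None:
                rebased.append(new_local)
        if pending is not None:
            to_apply.append(pending)
        local_ops = rebased
    return to_apply, local_ops

def replay(state, ops):
    state = {uuid: dict(props) for uuid, props in state.items()}
    for op in ops:
        apply_op(state, op)
    return state

version_n = {"abc-d123": {"priority": "L"}}
server_ops = [Update("abc-d123", "priority", "H", 10),
              Update("abc-d123", "description", "buy groceries", 11)]
local_ops = [Update("abc-d123", "priority", "M", 12),
             Update("abc-d123", "project", "home", 13)]

to_apply, to_send = rebase(server_ops, local_ops)
via_replica = replay(version_n, local_ops + to_apply)  # replica's path
via_server = replay(version_n, server_ops + to_send)   # server's path
```

Both paths end in the same state. The conflicting priority update from the server is dropped from `to_apply` because the local update carries the later timestamp.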
|
||||
|
||||
## Synchronization Process
|
||||
|
||||
To perform a synchronization, the replica first requests the child version of `base_version` from the server (GetChildVersion).
|
||||
It applies that version to its local `tasks`, rebases its local `operations` as described above, and updates `base_version`.
|
||||
The replica repeats this process until the server indicates no additional child versions exist.
|
||||
If there are no un-synchronized local operations, the process is complete.
|
||||
|
||||
Otherwise, the replica creates a new version containing its local operations, giving its `base_version` as the parent version, and transmits that to the server (AddVersion).
|
||||
In most cases, this will succeed, but if another replica has created a new version in the interim, then the new version will conflict with that other replica's new version and the server will respond with the new expected parent version.
|
||||
In this case, the process repeats.
|
||||
If the server indicates a conflict twice with the same expected base version, that is an indication that the replica has diverged (something serious has gone wrong).
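
The loop described above can be sketched in Python as follows. Everything here is illustrative rather than taken from the implementation: `InMemoryServer` and `Replica` are hypothetical stand-ins, and the toy operation ("add a string to a set") is chosen because such operations commute, so the rebase step leaves the local operations unchanged. Real operations need the transformation described earlier.

```python
import uuid

NIL = str(uuid.UUID(int=0))

class InMemoryServer:
    """Hypothetical stand-in for a sync server."""
    def __init__(self):
        self.latest = NIL
        self.children = {}  # parent version ID -> (version ID, operations)

    def get_child_version(self, parent):
        return self.children.get(parent)  # None when no child exists

    def add_version(self, parent, operations):
        if self.children and parent != self.latest:
            return ("conflict", self.latest)
        version = str(uuid.uuid4())
        self.children[parent] = (version, list(operations))
        self.latest = version
        return ("ok", version)

class Replica:
    def __init__(self):
        self.tasks = set()       # toy state: a set of strings
        self.operations = []     # un-synchronized local operations
        self.base_version = NIL

    def add(self, item):
        self.tasks.add(item)
        self.operations.append(item)

def sync(replica, server):
    last_conflict = None
    while True:
        # Pull and apply child versions until the server has no more.
        while (child := server.get_child_version(replica.base_version)) is not None:
            version, ops = child
            replica.tasks.update(ops)
            # Toy operations commute, so replica.operations needs no rebasing.
            replica.base_version = version
        if not replica.operations:
            return
        status, version = server.add_version(replica.base_version, replica.operations)
        if status == "ok":
            replica.base_version = version
            replica.operations = []
            return
        if version == last_conflict:
            raise RuntimeError("replica has diverged from the server")
        last_conflict = version

server = InMemoryServer()
r1, r2 = Replica(), Replica()
r1.add("get groceries")
r2.add("walk dog")
sync(r1, server)  # uploads the first version
sync(r2, server)  # pulls the first version, then uploads a second
sync(r1, server)  # pulls the second version
```

After the three calls, both replicas hold the same tasks and the same `base_version`. A second conflict naming the same expected parent is treated as divergence and raises an error instead of retrying forever.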
|
||||
|
||||
## Servers
|
||||
|
||||
A replica depends on periodic synchronization for performant operation.
|
||||
Without synchronization, its list of pending operations would grow indefinitely, and tasks could never be expired.
|
||||
So all replicas, even "singleton" replicas which do not replicate task data with any other replica, must synchronize periodically.
|
||||
|
||||
TaskChampion provides a `LocalServer` for this purpose.
|
||||
It implements the `get_child_version` and `add_version` operations as described, storing data on-disk locally, all within the `task` binary.
|
92
docs/src/sync-protocol.md
Normal file
|
@ -0,0 +1,92 @@
|
|||
# Server-Replica Protocol
|
||||
|
||||
The server-replica protocol is defined abstractly in terms of request/response transactions from the replica to the server.
|
||||
This is made concrete in an HTTP representation.
|
||||
|
||||
The protocol builds on the model presented in the previous chapter, and in particular on the synchronization process.
|
||||
|
||||
## Clients
|
||||
|
||||
From the server's perspective, replicas are indistinguishable, so this protocol uses the term "client" to refer generically to all replicas replicating a single task history.
|
||||
|
||||
## Server
|
||||
|
||||
For each client, the server is responsible for storing the task history, in the form of a branch-free sequence of versions.
|
||||
|
||||
For each client, it stores a set of versions as well as the latest version ID, defaulting to the nil UUID.
|
||||
Each version has a version ID, a parent version ID, and a history segment (opaque data containing the operations for that version).
|
||||
The server should maintain the following invariants:
|
||||
|
||||
1. Given a client c, c.latestVersion is nil or exists in the set of versions.
|
||||
1. Given versions v1 and v2 for a client, with v1.versionId != v2.versionId and v1.parentVersionId != nil, v1.parentVersionId != v2.parentVersionId.
|
||||
In other words, versions do not branch.
|
||||
|
||||
Note that versions form a linked list beginning with the latest version recorded for the client.
|
||||
This linked list need not continue back to a version with v.parentVersionId = nil.
|
||||
It may end at any point when v.parentVersionId is not found in the set of Versions.
|
||||
This observation allows the server to discard older versions.
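
The linked-list view and the no-branching invariant can be illustrated with a short Python sketch. The data layout (a dict from version ID to parent version ID, with history segments omitted) and the shortened IDs are assumptions made for the example, not the server's actual storage format.

```python
NIL = "00000000-0000-0000-0000-000000000000"

def history_chain(versions, latest_version_id):
    """Walk from the latest version back through parents.

    `versions` maps version ID -> parent version ID.
    The walk stops at the nil UUID or at a parent the server no longer stores.
    """
    chain = []
    current = latest_version_id
    while current != NIL and current in versions:
        chain.append(current)
        current = versions[current]
    return chain

def has_no_branches(versions):
    """Invariant 2: no two versions share a non-nil parent."""
    parents = [p for p in versions.values() if p != NIL]
    return len(parents) == len(set(parents))

# Shortened, hypothetical IDs stand in for UUIDs. Versions v1 and v2 have
# been discarded, so v3's parent is no longer in the set.
versions = {"v3": "v2", "v4": "v3", "v5": "v4"}
chain = history_chain(versions, "v5")
```

The chain ends at `v3` even though its parent is not nil, which is the situation that lets a server drop older versions without violating either invariant.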
|
||||
|
||||
## Transactions
|
||||
|
||||
### AddVersion
|
||||
|
||||
The AddVersion transaction requests that the server add a new version to the client's task history.
|
||||
The request contains the following:
|
||||
|
||||
* parent version ID
|
||||
* history segment
|
||||
|
||||
The server determines whether the new version is acceptable, atomically with respect to other requests for the same client.
|
||||
If it has no versions for the client, it accepts the version.
|
||||
If it already has one or more versions for the client, then it accepts the version only if the given parent version ID matches its stored latest version ID.
|
||||
|
||||
If the version is accepted, the server generates a new version ID for it.
|
||||
The version is added to the set of versions for the client, and the client's latest version ID is set to the new version ID.
|
||||
The new version ID is returned in the response to the client.
|
||||
|
||||
If the version is not accepted, the server makes no changes, but responds to the client with a conflict indication containing the latest version ID.
|
||||
The client may then "rebase" its operations and try again.
|
||||
Note that if a client receives two conflict responses with the same parent version ID, it is an indication that the client's version history has diverged from that on the server.
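
A minimal Python sketch of the server-side decision follows. The class and field names are hypothetical, and a per-client lock is only one way to obtain the required atomicity; a database transaction would serve the same purpose.

```python
import threading
import uuid

NIL = str(uuid.UUID(int=0))

class ClientHistory:
    """Hypothetical per-client server state."""
    def __init__(self):
        self.lock = threading.Lock()
        self.latest_version_id = NIL
        self.versions = {}  # version ID -> (parent version ID, history segment)

    def add_version(self, parent_version_id, history_segment):
        # The check and the update happen under one lock, so two concurrent
        # requests for the same client cannot both be accepted.
        with self.lock:
            if self.versions and parent_version_id != self.latest_version_id:
                return ("conflict", self.latest_version_id)
            version_id = str(uuid.uuid4())
            self.versions[version_id] = (parent_version_id, history_segment)
            self.latest_version_id = version_id
            return ("ok", version_id)

history = ClientHistory()
status1, v1 = history.add_version(NIL, b"segment-1")
status2, expected = history.add_version(NIL, b"segment-2")  # stale parent
status3, v2 = history.add_version(expected, b"segment-2")   # retry after rebase
```

The second request is rejected with the expected parent (`v1`), and the retry against that parent succeeds. The history segment is stored as opaque bytes and never inspected.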
|
||||
|
||||
### GetChildVersion
|
||||
|
||||
The GetChildVersion transaction is a read-only request for a version.
|
||||
The request consists of a parent version ID.
|
||||
The server searches its set of versions for a version with the given parent ID.
|
||||
If found, it returns the version's
|
||||
|
||||
* version ID,
|
||||
* parent version ID (matching that in the request), and
|
||||
* history segment.
|
||||
|
||||
If not found, the server returns a negative response.
|
||||
|
||||
## HTTP Representation
|
||||
|
||||
The transactions above are realized for an HTTP server at `<origin>` using the HTTP requests and responses described here.
|
||||
The `origin` *should* be an HTTPS endpoint on general principle, but nothing in the functionality or security of the protocol depends on connection encryption.
|
||||
|
||||
The replica identifies itself to the server using a `clientId` in the form of a UUID.
|
||||
|
||||
### AddVersion
|
||||
|
||||
The request is a `POST` to `<origin>/client/<clientId>/add-version/<parentVersionId>`.
|
||||
The request body contains the history segment, optionally encoded using any encoding supported by actix-web.
|
||||
The content-type must be `application/vnd.taskchampion.history-segment`.
|
||||
|
||||
The success response is a 200 OK with an empty body.
|
||||
The new version ID appears in the `X-Version-Id` header.
|
||||
|
||||
On conflict, the response is a 409 CONFLICT with an empty body.
|
||||
The expected parent version ID appears in the `X-Parent-Version-Id` header.
|
||||
|
||||
Other error responses (4xx or 5xx) may be returned and should be treated appropriately to their meanings in the HTTP specification.
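
From the client side, the AddVersion exchange might look like the Python sketch below, which uses only the standard library's `urllib.request`. The function names, the origin, and the client ID are hypothetical; the URL layout, content-type, status codes, and header names are the ones given above. The response helper takes a plain mapping, so the header keys must match exactly in this sketch.

```python
import urllib.request

CONTENT_TYPE = "application/vnd.taskchampion.history-segment"

def build_add_version_request(origin, client_id, parent_version_id, history_segment):
    """Build (but do not send) the POST request for AddVersion."""
    url = f"{origin}/client/{client_id}/add-version/{parent_version_id}"
    return urllib.request.Request(
        url,
        data=history_segment,
        headers={"Content-Type": CONTENT_TYPE},
        method="POST",
    )

def interpret_add_version_response(status, headers):
    """Map a status code and a header mapping to a protocol outcome."""
    if status == 200:
        return ("ok", headers["X-Version-Id"])
    if status == 409:
        return ("conflict", headers["X-Parent-Version-Id"])
    raise RuntimeError(f"unexpected HTTP status {status}")

request = build_add_version_request(
    "https://sync.example.com",              # hypothetical origin
    "6a1f3d0e-0000-4000-8000-000000000001",  # hypothetical clientId
    "00000000-0000-0000-0000-000000000000",  # nil UUID: no parent yet
    b"opaque history segment",
)
```

Sending the request with `urllib.request.urlopen(request)` raises `urllib.error.HTTPError` on a 409; that exception's `code` and `headers` attributes carry the values that `interpret_add_version_response` expects.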
|
||||
|
||||
### GetChildVersion
|
||||
|
||||
The request is a `GET` to `<origin>/client/<clientId>/get-child-version/<parentVersionId>`.
|
||||
The response is 404 NOT FOUND if no such version exists.
|
||||
Otherwise, the response is a 200 OK.
|
||||
The version's history segment is returned in the response body, with content-type `application/vnd.taskchampion.history-segment`.
|
||||
The version ID appears in the `X-Version-Id` header.
|
||||
The response body may be encoded, in accordance with any `Accept-Encoding` header in the request.
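
The GetChildVersion response can be interpreted along the same lines. This Python sketch is an illustration under assumptions: the `Version` tuple and the function names are hypothetical, and the parent version ID is taken from the request because the response, as described here, carries only the version ID header and the body.

```python
from typing import NamedTuple, Optional

class Version(NamedTuple):
    version_id: str
    parent_version_id: str
    history_segment: bytes

def get_child_version_url(origin, client_id, parent_version_id):
    return f"{origin}/client/{client_id}/get-child-version/{parent_version_id}"

def interpret_get_child_version_response(
    parent_version_id, status, headers, body
) -> Optional[Version]:
    """Return the child version, or None when the server has no such version."""
    if status == 404:
        return None
    if status == 200:
        return Version(headers["X-Version-Id"], parent_version_id, body)
    raise RuntimeError(f"unexpected HTTP status {status}")

nil = "00000000-0000-0000-0000-000000000000"
url = get_child_version_url("https://sync.example.com", "client-1", nil)  # hypothetical values
found = interpret_get_child_version_response(nil, 200, {"X-Version-Id": "v1"}, b"segment")
missing = interpret_get_child_version_response("v1", 404, {}, b"")
```

A `None` result corresponds to the negative response in the abstract transaction, which is the point at which a replica stops requesting further child versions.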
|
131
docs/src/sync.md
|
@ -1,128 +1,7 @@
|
|||
# Synchronization
|
||||
# Synchronization and the Sync Server
|
||||
|
||||
The [task database](./taskdb.md) also implements synchronization.
|
||||
Synchronization occurs between disconnected replicas, mediated by a server.
|
||||
The replicas never communicate directly with one another.
|
||||
The server does not have access to the task data; it sees only opaque blobs of data with a small amount of metadata.
|
||||
This section covers *synchronization* of *replicas* containing the same set of tasks.
|
||||
A replica can perform all operations locally without connecting to a sync server, then share those operations with other replicas when it connects.
|
||||
Sync is a critical feature of TaskChampion, allowing users to consult and update the same task list on multiple devices, without requiring constant connection.
|
||||
|
||||
Synchronization is a critical part of the task database's functionality: the database cannot function efficiently without occasional synchronization operations.
|
||||
|
||||
## Operational Transformations
|
||||
|
||||
Synchronization is based on [operational transformation](https://en.wikipedia.org/wiki/Operational_transformation).
|
||||
This section will assume some familiarity with the concept.
|
||||
|
||||
## State and Operations
|
||||
|
||||
At a given time, the set of tasks in a replica's storage is the essential "state" of that replica.
|
||||
All modifications to that state occur via operations, as defined in [Replica Storage](./storage.md).
|
||||
We can draw a network, or graph, with the nodes representing states and the edges representing operations.
|
||||
For example:
|
||||
|
||||
```text
  o -- State: {abc-d123: 'get groceries', priority L}
  |
  | -- Operation: set abc-d123 priority to H
  |
  o -- State: {abc-d123: 'get groceries', priority H}
```
|
||||
|
||||
For those familiar with distributed version control systems, a state is analogous to a revision, while an operation is analogous to a commit.
|
||||
|
||||
Fundamentally, synchronization involves all replicas agreeing on a single, linear sequence of operations and the state that those operations create.
|
||||
Since the replicas are not connected, each may have additional operations that have been applied locally, but which have not yet been agreed on.
|
||||
The synchronization process uses operational transformation to "linearize" those operations.
|
||||
This process is analogous (vaguely) to rebasing a sequence of Git commits.
|
||||
|
||||
### Versions
|
||||
|
||||
Occasionally, database states are given a name (that takes the form of a UUID).
|
||||
The system as a whole (all replicas) constructs a branch-free sequence of versions and the operations that separate each version from the next.
|
||||
The version with the nil UUID is implicitly the empty database.
|
||||
|
||||
The server stores the operations to change a state from a "parent" version to a "child" version, and provides that information as needed to replicas.
|
||||
Replicas use this information to update their local task databases, and to generate new versions to send to the server.
|
||||
|
||||
Replicas generate a new version to transmit local changes to the server.
|
||||
The changes are represented as a sequence of operations with the state resulting from the final operation corresponding to the version.
|
||||
In order to keep the versions in a single sequence, the server will only accept a proposed version from a replica if its parent version matches the latest version on the server.
|
||||
|
||||
In the non-conflict case (such as with a single replica), then, a replica's synchronization process involves gathering up the operations it has accumulated since its last synchronization; bundling those operations into a version; and sending that version to the server.
|
||||
|
||||
### Replica Invariant
|
||||
|
||||
The replica's [storage](./storage.md) contains the current state in `tasks`, the as-yet un-synchronized operations in `operations`, and the last version at which synchronization occurred in `base_version`.
|
||||
|
||||
The replica's un-synchronized operations are already reflected in its local `tasks`, so the following invariant holds:
|
||||
|
||||
> Applying `operations` to the set of tasks at `base_version` gives a set of tasks identical
|
||||
> to `tasks`.
|
||||
|
||||
### Transformation
|
||||
|
||||
When the latest version on the server contains operations that are not present in the replica, then the states have diverged.
|
||||
For example:
|
||||
|
||||
```text
  o -- version N
 w|\a
  o o
 x|  \b
  o   o
 y|    \c
  o     o -- replica's local state
 z|
  o -- version N+1
```
|
||||
|
||||
(diagram notation: `o` designates a state, lower-case letters designate operations, and versions are presented as if they were numbered sequentially)
|
||||
|
||||
In this situation, the replica must "rebase" the local operations onto the latest version from the server and try again.
|
||||
This process is performed using operational transformation (OT).
|
||||
The result of this transformation is a sequence of operations based on the latest version, and a sequence of operations the replica can apply to its local task database to reach the same state.
|
||||
Continuing the example above, the resulting operations are shown with `'`:
|
||||
|
||||
```text
  o -- version N
 w|\a
  o o
 x|  \b
  o   o
 y|    \c
  o     o -- replica's intermediate local state
 z|     |w'
  o-N+1 o
 a'\    |x'
    o   o
   b'\  |y'
      o o
     c'\|z'
        o -- version N+2
```
|
||||
|
||||
The replica applies w' through z' locally, and sends a' through c' to the server as the operations to generate version N+2.
|
||||
Either path through this graph, a-b-c-w'-x'-y'-z' or a'-b'-c'-w-x-y-z, must generate *precisely* the same final state at version N+2.
|
||||
Careful selection of the operations and the transformation function ensures this.
|
||||
|
||||
See the comments in the source code for the details of how this transformation process is implemented.
|
||||
|
||||
## Synchronization Process
|
||||
|
||||
To perform a synchronization, the replica first requests the child version of `base_version` from the server (`get_child_version`).
|
||||
It applies that version to its local `tasks`, rebases its local `operations` as described above, and updates `base_version`.
|
||||
The replica repeats this process until the server indicates no additional child versions exist.
|
||||
If there are no un-synchronized local operations, the process is complete.
|
||||
|
||||
Otherwise, the replica creates a new version containing its local operations, giving its `base_version` as the parent version, and transmits that to the server (`add_version`).
|
||||
In most cases, this will succeed, but if another replica has created a new version in the interim, then the new version will conflict with that other replica's new version and the server will respond with the new expected parent version.
|
||||
In this case, the process repeats.
|
||||
If the server indicates a conflict twice with the same expected base version, that is an indication that the replica has diverged (something serious has gone wrong).
|
||||
|
||||
## Servers
|
||||
|
||||
A replica depends on periodic synchronization for performant operation.
|
||||
Without synchronization, its list of pending operations would grow indefinitely, and tasks could never be expired.
|
||||
So all replicas, even "singleton" replicas which do not replicate task data with any other replica, must synchronize periodically.
|
||||
|
||||
TaskChampion provides a `LocalServer` for this purpose.
|
||||
It implements the `get_child_version` and `add_version` operations as described, storing data on-disk locally, all within the `task` binary.
|
||||
This is a complex topic, and the section is broken into several chapters, beginning at the lower levels of the implementation and working up.
|
||||
|
|