Add documentation for snapshots

Dustin J. Mitchell 2021-09-28 19:17:20 -04:00
parent 75fd0ff83a
commit eadce9f15a
3 changed files with 126 additions and 8 deletions


@ -19,5 +19,6 @@
* [Tasks](./tasks.md)
* [Synchronization and the Sync Server](./sync.md)
* [Synchronization Model](./sync-model.md)
* [Snapshots](./snapshots.md)
* [Server-Replica Protocol](./sync-protocol.md)
* [Planned Functionality](./plans.md)

docs/src/snapshots.md Normal file

@ -0,0 +1,48 @@
# Snapshots
The basic synchronization model described in the previous page has a few shortcomings:
* servers must store an ever-increasing quantity of versions
* a new replica must download all versions since the beginning in order to derive the current state
Snapshots allow TaskChampion to avoid both of these issues.
A snapshot is a copy of the task database at a specific version.
It is created by a replica, encrypted, and stored on the server.
A new replica can simply download a recent snapshot and apply any additional versions synchronized since that snapshot was made.
Servers can delete and reclaim space used by older versions, as long as newer snapshots are available.
## Snapshot Heuristics
A server implementation must answer a few questions:
* How often should snapshots be made?
* When can versions be deleted?
* When can snapshots be deleted?
A critical invariant is that at least one snapshot must exist for any database that does not have a child of the nil version.
This ensures that a new replica can always derive the latest state.
Aside from that invariant, the server implementation can vary in its answers to these questions, with the following considerations:
Snapshots should be made frequently enough that a new replica can initialize quickly.
Existing replicas will fail to synchronize if they request a child version that has been deleted.
This failure can cause data loss if the replica had local changes.
It's conceivable that replicas may not sync for weeks or months if, for example, they are located on a home computer while the user is on holiday.
## Requesting New Snapshots
The server requests snapshots from replicas, indicating an urgency for the request.
Some replicas, such as those running on PCs or servers, can produce a snapshot even at low urgency.
Other replicas, in more restricted environments such as mobile devices, will only produce a snapshot at high urgency.
This saves resources in these restricted environments.
A snapshot must be made on a replica with no unsynchronized operations.
As such, it only makes sense to request a snapshot in response to a successful AddVersion request.
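The decision a replica makes when it receives a snapshot request can be sketched as follows. This is a minimal illustration, not TaskChampion's implementation: the `Urgency` enum, the `constrained` flag, and the function name are all hypothetical.

```python
from enum import Enum


class Urgency(Enum):
    # Hypothetical encoding of the urgency attached to a snapshot request.
    NONE = 0
    LOW = 1
    HIGH = 2


def should_make_snapshot(requested: Urgency, constrained: bool, unsynced_ops: int) -> bool:
    """Decide whether this replica should produce a snapshot now."""
    # A snapshot must be made on a replica with no unsynchronized operations.
    if unsynced_ops > 0:
        return False
    if requested is Urgency.NONE:
        return False
    # Assumption: constrained replicas (e.g. mobile) only respond to high urgency.
    threshold = Urgency.HIGH if constrained else Urgency.LOW
    return requested.value >= threshold.value
```

Under these assumptions, a desktop replica responds to any request, while a mobile replica waits until the server signals high urgency.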
## Handling Deleted Versions
When a replica requests a child version, the response must distinguish two cases:
1. No such child version exists because the replica is up-to-date.
1. No such child version exists because it has been deleted, and the replica must re-initialize itself.
The details of this logic are covered in the [Server-Replica Protocol](./sync-protocol.md).


@ -7,26 +7,36 @@ The protocol builds on the model presented in the previous chapter, and in parti
## Clients
From the server's perspective, replicas are indistinguishable, so this protocol uses the term "client" to refer generically to all replicas replicating a single task history.
From the server's perspective, replicas accessing the same task history are indistinguishable, so this protocol uses the term "client" to refer generically to all replicas replicating a single task history.
Each client is identified and authenticated with a "client key", known only to the server and to the replicas replicating the task history.
## Server
For each client, the server is responsible for storing the task history, in the form of a branch-free sequence of versions.
It also stores the latest snapshot, if any exists.
* versions: a set of {versionId: UUID, parentVersionId: UUID, historySegment: bytes}
* latestVersionId: UUID
* snapshotVersionId: UUID
* snapshot: bytes
For each client, it stores a set of versions as well as the latest version ID, defaulting to the nil UUID.
Each version has a version ID, a parent version ID, and a history segment (opaque data containing the operations for that version).
The server should maintain the following invariants:
The server should maintain the following invariants for each client:
1. Given a client c, c.latestVersion is nil or exists in the set of versions.
1. Given versions v1 and v2 for a client, with v1.versionId != v2.versionId and v1.parentVersionId != nil, v1.parentVersionId != v2.parentVersionId.
1. latestVersionId is nil or exists in the set of versions.
2. Given versions v1 and v2 for a client, with v1.versionId != v2.versionId and v1.parentVersionId != nil, v1.parentVersionId != v2.parentVersionId.
In other words, versions do not branch.
3. If snapshotVersionId is nil, then there is a version with parentVersionId == nil.
4. If snapshotVersionId is not nil, then there is a version with parentVersionId == snapshotVersionId.
Note that versions form a linked list beginning with the version stored in the client.
Note that versions form a linked list beginning with the latestVersionId stored for the client.
This linked list need not continue back to a version with v.parentVersionId == nil.
It may end at any point when v.parentVersionId is not found in the set of versions.
This observation allows the server to discard older versions.
The third invariant prevents the server from discarding versions if there is no snapshot.
The fourth invariant prevents the server from discarding versions newer than the snapshot.
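As an illustration, the four invariants can be checked mechanically. The sketch below follows the wording of the invariants literally and does not special-case situations the text leaves open, such as a client with no versions yet. The `ClientState` class and `check_invariants` function are hypothetical names, and history segments are left out for brevity.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, List

NIL = uuid.UUID(int=0)


@dataclass
class ClientState:
    # Maps versionId -> parentVersionId; history segments are omitted here.
    versions: Dict[uuid.UUID, uuid.UUID] = field(default_factory=dict)
    latest_version_id: uuid.UUID = NIL
    snapshot_version_id: uuid.UUID = NIL


def check_invariants(c: ClientState) -> List[int]:
    """Return the numbers of the invariants that do not hold."""
    broken = []
    # 1. latestVersionId is nil or exists in the set of versions.
    if c.latest_version_id != NIL and c.latest_version_id not in c.versions:
        broken.append(1)
    # 2. No two versions share a non-nil parent (versions do not branch).
    parents = [p for p in c.versions.values() if p != NIL]
    if len(parents) != len(set(parents)):
        broken.append(2)
    if c.snapshot_version_id == NIL:
        # 3. Without a snapshot, the first version must still be present.
        if NIL not in c.versions.values():
            broken.append(3)
    elif c.snapshot_version_id not in c.versions.values():
        # 4. With a snapshot, the child of the snapshot version must be present.
        broken.append(4)
    return broken
```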
## Transactions
@ -45,6 +55,7 @@ If it already has one or more versions for the client, then it accepts the versi
If the version is accepted, the server generates a new version ID for it.
The version is added to the set of versions for the client, and the client's latest version ID is set to the new version ID.
The new version ID is returned in the response to the client.
The response may also include a request for a snapshot, with associated urgency.
If the version is not accepted, the server makes no changes, but responds to the client with a conflict indication containing the latest version ID.
The client may then "rebase" its operations and try again.
@ -61,7 +72,32 @@ If found, it returns the version's
* parent version ID (matching that in the request), and
* history segment.
If not found, the server returns a negative response.
The response is either a version (success), _not-found_, or _gone_, as determined by the first of the following to apply:
* If a version with parentVersionId equal to the requested parentVersionId exists, it is returned.
* If the requested parentVersionId is the nil UUID ..
* ..and snapshotVersionId is nil, the response is _not-found_ (the client has no versions).
* ..and snapshotVersionId is not nil, the response is _gone_ (the first version has been deleted).
* If a version with versionId equal to the requested parentVersionId exists, the response is _not-found_ (the client is up-to-date).
* Otherwise, the response is _gone_ (the requested version has been deleted).
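The rules above translate directly into a first-match decision procedure. The following is a sketch only: the function name, the string results, and the representation of versions as a dictionary from version ID to parent version ID are assumptions made for illustration.

```python
import uuid
from typing import Dict, Optional, Tuple

NIL = uuid.UUID(int=0)


def get_child_version(
    versions: Dict[uuid.UUID, uuid.UUID],  # versionId -> parentVersionId
    snapshot_version_id: uuid.UUID,
    parent_version_id: uuid.UUID,
) -> Tuple[str, Optional[uuid.UUID]]:
    """Return ("success", childId), ("not-found", None) or ("gone", None)."""
    # A version whose parent is the requested ID exists: return it.
    for version_id, parent in versions.items():
        if parent == parent_version_id:
            return ("success", version_id)
    if parent_version_id == NIL:
        if snapshot_version_id == NIL:
            return ("not-found", None)  # the client has no versions
        return ("gone", None)  # the first version has been deleted
    if parent_version_id in versions:
        return ("not-found", None)  # the client is up-to-date
    return ("gone", None)  # the requested version has been deleted
```

Because the rules are evaluated in order, a replica that is exactly at the latest version receives _not-found_, while one whose base version was discarded receives _gone_.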
### AddSnapshot
The AddSnapshot transaction requests that the server store a new snapshot, generated by the client.
The request contains the following:
* version ID at which the snapshot was made
* snapshot data (opaque to the server)
The server should validate that the snapshot is for an existing version and is newer than any existing snapshot.
It may also validate that the snapshot is for a "recent" version (e.g., one of the last 5 versions).
If a snapshot already exists for the given version, the server may keep or discard the new snapshot but should return a success indication to the client.
The server response is empty.
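One way a server might perform this validation is to walk the version list back from the latest version. The sketch below is an assumption-laden illustration: the function names, the dictionary representation of versions, and the `max_recent` parameter (mirroring the "last 5 versions" example) are hypothetical.

```python
import uuid
from typing import Dict, List

NIL = uuid.UUID(int=0)


def chain_from_latest(versions: Dict[uuid.UUID, uuid.UUID], latest: uuid.UUID) -> List[uuid.UUID]:
    """List version IDs from newest to oldest by following parent links."""
    chain = []
    current = latest
    while current in versions:
        chain.append(current)
        current = versions[current]
    return chain


def validate_add_snapshot(
    versions: Dict[uuid.UUID, uuid.UUID],  # versionId -> parentVersionId
    latest: uuid.UUID,
    snapshot_version_id: uuid.UUID,
    new_id: uuid.UUID,
    max_recent: int = 5,
) -> bool:
    """Return True if the request should receive a success indication."""
    # A snapshot already exists for this version: still report success.
    if snapshot_version_id != NIL and new_id == snapshot_version_id:
        return True
    chain = chain_from_latest(versions, latest)
    if new_id not in chain:
        return False  # not an existing version
    age = chain.index(new_id)
    if age >= max_recent:
        return False  # optional check: not "recent" enough
    # Reject if the existing snapshot is at this version or a newer one.
    if snapshot_version_id in chain and chain.index(snapshot_version_id) <= age:
        return False
    return True
```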
### GetSnapshot
The GetSnapshot transaction requests that the server provide the latest snapshot.
The response contains the snapshot version ID and the snapshot data, if those exist.
## HTTP Representation
@ -79,6 +115,7 @@ The content-type must be `application/vnd.taskchampion.history-segment`.
The success response is a 200 OK with an empty body.
The new version ID appears in the `X-Version-Id` header.
If included, a snapshot request appears in the `X-Snapshot-Request` header with value `urgency=low` or `urgency=high`.
On conflict, the response is a 409 CONFLICT with an empty body.
The expected parent version ID appears in the `X-Parent-Version-Id` header.
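A client might interpret the AddVersion response along these lines. This is a sketch rather than the actual client code: the function name and the tuple it returns are hypothetical, and headers are assumed to arrive as a plain dictionary.

```python
from typing import Dict, Optional, Tuple


def interpret_add_version_response(
    status: int, headers: Dict[str, str]
) -> Tuple[str, str, Optional[str]]:
    """Return (outcome, version_id, snapshot_urgency)."""
    # HTTP header names are case-insensitive, so normalize them first.
    h = {name.lower(): value for name, value in headers.items()}
    if status == 200:
        urgency = None
        request = h.get("x-snapshot-request")
        if request in ("urgency=low", "urgency=high"):
            urgency = request.split("=", 1)[1]
        return ("ok", h["x-version-id"], urgency)
    if status == 409:
        # The header carries the parent version ID the server expected.
        return ("conflict", h["x-parent-version-id"], None)
    raise ValueError(f"unexpected status {status}")
```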
@ -88,8 +125,40 @@ Other error responses (4xx or 5xx) may be returned and should be treated appropr
### GetChildVersion
The request is a `GET` to `<origin>/v1/client/get-child-version/<parentVersionId>`.
The response is 404 NOT FOUND if no such version exists.
Otherwise, the response is a 200 OK.
The response is determined as described above.
The _not-found_ response is 404 NOT FOUND.
The _gone_ response is 410 GONE.
Neither has a response body.
On success, the response is a 200 OK.
The version's history segment is returned in the response body, with content-type `application/vnd.taskchampion.history-segment`.
The version ID appears in the `X-Version-Id` header.
The response body may be encoded, in accordance with any `Accept-Encoding` header in the request.
On failure, a client should treat a 404 NOT FOUND as indicating that it is up-to-date.
Clients should treat a 410 GONE as a synchronization error.
If the client has pending changes to send to the server, based on a now-removed version, then those changes cannot be reconciled and will be lost.
The client should, optionally after consulting the user, download and apply the latest snapshot.
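The client-side handling of these status codes could look roughly like the following. The action names and the function itself are hypothetical and serve only to illustrate the three cases.

```python
from typing import Tuple


def handle_get_child_version_status(status: int, has_pending_changes: bool) -> Tuple[str, bool]:
    """Return (action, changes_will_be_lost) for a GetChildVersion response."""
    if status == 200:
        return ("apply-version", False)
    if status == 404:
        # Nothing newer on the server: the replica is up-to-date.
        return ("up-to-date", False)
    if status == 410:
        # The base version was removed; pending changes cannot be reconciled.
        return ("download-latest-snapshot", has_pending_changes)
    raise ValueError(f"unexpected status {status}")
```

The second element of the result lets a caller decide whether to consult the user before replacing local state.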
### AddSnapshot
The request is a `POST` to `<origin>/v1/client/add-snapshot/<versionId>`.
The request body contains the snapshot data, optionally encoded using any encoding supported by actix-web.
The content-type must be `application/vnd.taskchampion.snapshot`.
If the version is invalid, as described above, the response should be 400 BAD REQUEST.
The server response should be 200 OK on success.
### GetSnapshot
The request is a `GET` to `<origin>/v1/client/snapshot`.
The response is a 200 OK.
The snapshot is returned in the response body, with content-type `application/vnd.taskchampion.snapshot`.
The version ID appears in the `X-Version-Id` header.
The response body may be encoded, in accordance with any `Accept-Encoding` header in the request.
After downloading and decrypting a snapshot, a client must replace its entire local task database with the content of the snapshot.
Any local operations that had not yet been synchronized must be discarded.
After the snapshot is applied, the client should begin the synchronization process again, starting from the snapshot version.
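The steps for applying a snapshot can be sketched as below. The `Replica` class and its fields are hypothetical stand-ins for the replica's local storage, and the snapshot is assumed to have already been decrypted and decoded into a dictionary of tasks.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Replica:
    # Hypothetical in-memory model of a replica's local state.
    tasks: Dict[str, dict] = field(default_factory=dict)
    pending_operations: List[str] = field(default_factory=list)
    base_version_id: str = ""


def apply_snapshot(replica: Replica, snapshot_tasks: Dict[str, dict], snapshot_version_id: str) -> int:
    """Replace local state with the snapshot; return the number of discarded operations."""
    # The entire local task database is replaced, not merged.
    replica.tasks = dict(snapshot_tasks)
    # Unsynchronized local operations must be discarded.
    discarded = len(replica.pending_operations)
    replica.pending_operations.clear()
    # Synchronization restarts from the snapshot version.
    replica.base_version_id = snapshot_version_id
    return discarded
```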