mirror of
https://github.com/GothenburgBitFactory/taskwarrior.git
synced 2025-06-26 10:54:26 +02:00
Add support for cloud sync, specifically GCP (#3223)
This adds generic support for sync to cloud services, with specific support for GCP. Adding others, so long as they support a compare-and-set operation, should be comparatively straightforward. The cloud support includes cleanup of unnecessary data, and should keep total space usage roughly proportional to the number of tasks.

Co-authored-by: ryneeverett <ryneeverett@gmail.com>
parent
6f1c16fecd
commit
9566c929e2
36 changed files with 4012 additions and 401 deletions
|
@ -11,4 +11,7 @@
|
|||
* [Synchronization Model](./sync-model.md)
|
||||
* [Snapshots](./snapshots.md)
|
||||
* [Server-Replica Protocol](./sync-protocol.md)
|
||||
* [Encryption](./encryption.md)
|
||||
* [HTTP Implementation](./http.md)
|
||||
* [Object-Store Implementation](./object-store.md)
|
||||
* [Planned Functionality](./plans.md)
|
||||
|
|
38
taskchampion/docs/src/encryption.md
Normal file
|
@ -0,0 +1,38 @@
|
|||
# Encryption
|
||||
|
||||
The client configuration includes an encryption secret of arbitrary length.
|
||||
This section describes how that information is used to encrypt and decrypt data sent to the server (versions and snapshots).
|
||||
|
||||
Encryption is not used for local (on-disk) sync, but is used for all cases where data is sent from the local host.
|
||||
|
||||
## Key Derivation
|
||||
|
||||
The client derives the 32-byte encryption key from the configured encryption secret using PBKDF2 with HMAC-SHA256 and 100,000 iterations.
|
||||
The salt value depends on the implementation of the protocol, as described in subsequent chapters.
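The derivation above can be sketched with Python's standard library. The function name `derive_key` is hypothetical, and the salt is taken as a parameter because its value depends on the protocol implementation.

```python
import hashlib


def derive_key(secret: bytes, salt: bytes) -> bytes:
    """Derive a 32-byte key with PBKDF2-HMAC-SHA256 and 100,000 iterations.

    `salt` is supplied by the caller; how it is chosen depends on the
    protocol implementation (HTTP or object store).
    """
    return hashlib.pbkdf2_hmac("sha256", secret, salt, 100_000, dklen=32)
```

The same secret and salt always yield the same key, so every replica configured with the same secret derives an identical key.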
|
||||
|
||||
## Encryption
|
||||
|
||||
The client uses [AEAD](https://commondatastorage.googleapis.com/chromium-boringssl-docs/aead.h.html), with algorithm CHACHA20_POLY1305.
|
||||
The client should generate a random nonce, noting that AEAD is _not secure_ if a nonce is used repeatedly for the same key.
|
||||
|
||||
AEAD supports additional authenticated data (AAD) which must be provided for both open and seal operations.
|
||||
In this protocol, the AAD is always 17 bytes of the form:
|
||||
* `app_id` (byte) - always 1
|
||||
* `version_id` (16 bytes) - 16-byte form of the version ID associated with this data
|
||||
* for versions (AddVersion, GetChildVersion), the _parent_ version_id
|
||||
* for snapshots (AddSnapshot, GetSnapshot), the snapshot version_id
|
||||
|
||||
The `app_id` field is for future expansion to handle other, non-task data using this protocol.
|
||||
Including it in the AAD ensures that such data cannot be confused with task data.
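As an illustration, the 17-byte AAD could be assembled as follows. This is a sketch: `build_aad` is a hypothetical helper, and it assumes that the "16-byte form" of a version ID is the big-endian layout exposed by Python's `uuid.UUID.bytes`.

```python
import uuid

APP_ID = 1  # always 1 for task data


def build_aad(version_id: uuid.UUID) -> bytes:
    """Return app_id (1 byte) followed by the 16-byte version ID.

    For versions, pass the parent version ID; for snapshots, pass the
    snapshot version ID.
    """
    return bytes([APP_ID]) + version_id.bytes
```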
|
||||
|
||||
Although the AEAD specification distinguishes ciphertext and tags, for purposes of this specification they are considered concatenated into a single bytestring as in BoringSSL's `EVP_AEAD_CTX_seal`.
|
||||
|
||||
## Representation
|
||||
|
||||
The final byte-stream has the following structure:
|
||||
|
||||
* `version` (byte) - format version (always 1)
|
||||
* `nonce` (12 bytes) - encryption nonce
|
||||
* `ciphertext` (remaining bytes) - ciphertext from sealing operation
|
||||
|
||||
The `version` field identifies this data format, and future formats will have a value other than 1 in this position.
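A minimal sketch of framing and parsing this byte-stream is shown here. The helper names are hypothetical, and the `ciphertext` argument is assumed to be the already-sealed output (ciphertext plus tag) of an AEAD library, which is not shown.

```python
from typing import Tuple

ENVELOPE_VERSION = 1
NONCE_LEN = 12


def pack_envelope(nonce: bytes, ciphertext: bytes) -> bytes:
    """Concatenate version byte, 12-byte nonce, and sealed ciphertext."""
    if len(nonce) != NONCE_LEN:
        raise ValueError("nonce must be exactly 12 bytes")
    return bytes([ENVELOPE_VERSION]) + nonce + ciphertext


def unpack_envelope(data: bytes) -> Tuple[bytes, bytes]:
    """Split an envelope into (nonce, ciphertext), checking the version byte."""
    if len(data) < 1 + NONCE_LEN:
        raise ValueError("envelope is too short")
    if data[0] != ENVELOPE_VERSION:
        raise ValueError("unrecognized envelope version")
    return data[1 : 1 + NONCE_LEN], data[1 + NONCE_LEN :]
```

Rejecting an unknown version byte before attempting decryption lets a client fail with a clear error if it encounters a future format.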
|
65
taskchampion/docs/src/http.md
Normal file
|
@ -0,0 +1,65 @@
|
|||
# HTTP Representation
|
||||
|
||||
The transactions in the sync protocol are realized for an HTTP server at `<origin>` using the HTTP requests and responses described here.
|
||||
The `origin` *should* be an HTTPS endpoint on general principle, but nothing in the functionality or security of the protocol depends on connection encryption.
|
||||
|
||||
The replica identifies itself to the server using a `client_id` in the form of a UUID.
|
||||
This value is passed with every request in the `X-Client-Id` header, in its dashed-hex format.
|
||||
|
||||
The salt used in key derivation is the SHA256 hash of the 16-byte form of the client ID.
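For illustration, the salt computation might look like this in Python. The name `http_salt` is hypothetical, and the sketch assumes the "16-byte form" of the client ID corresponds to `uuid.UUID.bytes`.

```python
import hashlib
import uuid


def http_salt(client_id: uuid.UUID) -> bytes:
    """SHA256 hash of the 16-byte form of the client ID."""
    return hashlib.sha256(client_id.bytes).digest()
```

The resulting 32-byte value is what would be passed as the salt to the PBKDF2 key derivation.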
|
||||
|
||||
## AddVersion
|
||||
|
||||
The request is a `POST` to `<origin>/v1/client/add-version/<parentVersionId>`.
|
||||
The request body contains the history segment, optionally encoded using any encoding supported by actix-web.
|
||||
The content-type must be `application/vnd.taskchampion.history-segment`.
|
||||
|
||||
The success response is a 200 OK with an empty body.
|
||||
The new version ID appears in the `X-Version-Id` header.
|
||||
If included, a snapshot request appears in the `X-Snapshot-Request` header with value `urgency=low` or `urgency=high`.
|
||||
|
||||
On conflict, the response is a 409 CONFLICT with an empty body.
|
||||
The expected parent version ID appears in the `X-Parent-Version-Id` header.
|
||||
|
||||
Other error responses (4xx or 5xx) may be returned and should be handled according to their meanings in the HTTP specification.
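A client might interpret an AddVersion response along these lines. This is a sketch only: `AddVersionResult` and `interpret_add_version` are hypothetical names, and the status code and headers are assumed to come from whatever HTTP library the client uses.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class AddVersionResult:
    accepted: bool
    version_id: Optional[str] = None        # new version ID (on success)
    expected_parent: Optional[str] = None   # expected parent ID (on conflict)
    snapshot_urgency: Optional[str] = None  # "low" or "high", if requested


def interpret_add_version(status: int, headers: Mapping[str, str]) -> AddVersionResult:
    # HTTP header names are case-insensitive, so normalize them first.
    h = {k.lower(): v for k, v in headers.items()}
    if status == 200:
        urgency = None
        request = h.get("x-snapshot-request")
        if request is not None and request.startswith("urgency="):
            urgency = request[len("urgency="):]
        return AddVersionResult(True, version_id=h.get("x-version-id"),
                                snapshot_urgency=urgency)
    if status == 409:
        return AddVersionResult(False, expected_parent=h.get("x-parent-version-id"))
    # Any other 4xx or 5xx response is left to the caller's error handling.
    raise RuntimeError(f"unexpected HTTP status {status}")
```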
|
||||
|
||||
## GetChildVersion
|
||||
|
||||
The request is a `GET` to `<origin>/v1/client/get-child-version/<parentVersionId>`.
|
||||
|
||||
The response is determined as described above.
|
||||
The _not-found_ response is 404 NOT FOUND.
|
||||
The _gone_ response is 410 GONE.
|
||||
Neither has a response body.
|
||||
|
||||
On success, the response is a 200 OK.
|
||||
The version's history segment is returned in the response body, with content-type `application/vnd.taskchampion.history-segment`.
|
||||
The version ID appears in the `X-Version-Id` header.
|
||||
The response body may be encoded, in accordance with any `Accept-Encoding` header in the request.
|
||||
|
||||
On failure, a client should treat a 404 NOT FOUND as indicating that it is up-to-date.
|
||||
Clients should treat a 410 GONE as a synchronization error.
|
||||
If the client has pending changes to send to the server, based on a now-removed version, then those changes cannot be reconciled and will be lost.
|
||||
The client should, optionally after consulting the user, download and apply the latest snapshot.
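The status handling described here can be summarized in a small classifier. The enum and function names are hypothetical, and how a client reacts to each outcome (for example, whether it consults the user before applying a snapshot) is left to the caller.

```python
from enum import Enum


class ChildVersionOutcome(Enum):
    VERSION = "version"        # 200: a child version is in the response body
    UP_TO_DATE = "up-to-date"  # 404: no child exists; the replica is current
    GONE = "gone"              # 410: the version was deleted; a snapshot is needed


def classify_get_child_version(status: int) -> ChildVersionOutcome:
    if status == 200:
        return ChildVersionOutcome.VERSION
    if status == 404:
        return ChildVersionOutcome.UP_TO_DATE
    if status == 410:
        return ChildVersionOutcome.GONE
    raise RuntimeError(f"unexpected HTTP status {status}")
```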
|
||||
|
||||
## AddSnapshot
|
||||
|
||||
The request is a `POST` to `<origin>/v1/client/add-snapshot/<versionId>`.
|
||||
The request body contains the snapshot data, optionally encoded using any encoding supported by actix-web.
|
||||
The content-type must be `application/vnd.taskchampion.snapshot`.
|
||||
|
||||
If the version is invalid, as described above, the response should be 400 BAD REQUEST.
|
||||
The server response should be 200 OK on success.
|
||||
|
||||
## GetSnapshot
|
||||
|
||||
The request is a `GET` to `<origin>/v1/client/snapshot`.
|
||||
|
||||
The response is a 200 OK.
|
||||
The snapshot is returned in the response body, with content-type `application/vnd.taskchampion.snapshot`.
|
||||
The version ID appears in the `X-Version-Id` header.
|
||||
The response body may be encoded, in accordance with any `Accept-Encoding` header in the request.
|
||||
|
||||
After downloading and decrypting a snapshot, a client must replace its entire local task database with the content of the snapshot.
|
||||
Any local operations that had not yet been synchronized must be discarded.
|
||||
After the snapshot is applied, the client should begin the synchronization process again, starting from the snapshot version.
|
9
taskchampion/docs/src/object-store.md
Normal file
|
@ -0,0 +1,9 @@
|
|||
# Object Store Representation
|
||||
|
||||
TaskChampion also supports use of a generic key-value store to synchronize replicas.
|
||||
|
||||
In this case, the salt used in key derivation is a random 16-byte value, stored
|
||||
in the object store and retrieved as needed.
|
||||
|
||||
The details of the mapping from this protocol to keys and values are private to the implementation.
|
||||
Other applications should not access the key-value store directly.
|
|
@ -2,7 +2,7 @@
|
|||
|
||||
The basic synchronization model described in the previous page has a few shortcomings:
|
||||
* servers must store an ever-increasing quantity of versions
|
||||
* a new replica must download all versions since the beginning in order to derive the current state
|
||||
* a new replica must download all versions since the beginning (the nil UUID) in order to derive the current state
|
||||
|
||||
Snapshots allow TaskChampion to avoid both of these issues.
|
||||
A snapshot is a copy of the task database at a specific version.
|
||||
|
@ -37,12 +37,3 @@ This saves resources in these restricted environments.
|
|||
|
||||
A snapshot must be made on a replica with no unsynchronized operations.
|
||||
As such, it only makes sense to request a snapshot in response to a successful AddVersion request.
|
||||
|
||||
## Handling Deleted Versions
|
||||
|
||||
When a replica requests a child version, the response must distinguish two cases:
|
||||
|
||||
1. No such child version exists because the replica is up-to-date.
|
||||
1. No such child version exists because it has been deleted, and the replica must re-initialize itself.
|
||||
|
||||
The details of this logic are covered in the [Server-Replica Protocol](./sync-protocol.md).
|
||||
|
|
|
@ -32,7 +32,10 @@ For those familiar with distributed version control systems, a state is analogou
|
|||
Fundamentally, synchronization involves all replicas agreeing on a single, linear sequence of operations and the state that those operations create.
|
||||
Since the replicas are not connected, each may have additional operations that have been applied locally, but which have not yet been agreed on.
|
||||
The synchronization process uses operational transformation to "linearize" those operations.
|
||||
|
||||
This process is loosely analogous to rebasing a sequence of Git commits.
|
||||
Critically, though, operations cannot merge; in effect, the only option is rebasing.
|
||||
Furthermore, once an operation has been sent to the server it cannot be changed; in effect, the server does not permit "force push".
|
||||
|
||||
### Sync Operations
|
||||
|
||||
|
@ -135,4 +138,4 @@ Without synchronization, its list of pending operations would grow indefinitely,
|
|||
So all replicas, even "singleton" replicas which do not replicate task data with any other replica, must synchronize periodically.
|
||||
|
||||
TaskChampion provides a `LocalServer` for this purpose.
|
||||
It implements the `get_child_version` and `add_version` operations as described, storing data on-disk locally, all within the `ta` binary.
|
||||
It implements the `get_child_version` and `add_version` operations as described, storing data on-disk locally.
|
||||
|
|
|
@ -1,91 +1,42 @@
|
|||
# Server-Replica Protocol
|
||||
|
||||
The server-replica protocol is defined abstractly in terms of request/response transactions from the replica to the server.
|
||||
This is made concrete in an HTTP representation.
|
||||
The server-replica protocol is defined abstractly in terms of request/response transactions.
|
||||
|
||||
The protocol builds on the model presented in the previous chapter, and in particular on the synchronization process.
|
||||
The protocol builds on the model presented in the previous chapters, and in particular on the synchronization process.
|
||||
|
||||
## Clients
|
||||
|
||||
From the server's perspective, replicas accessing the same task history are indistinguishable, so this protocol uses the term "client" to refer generically to all replicas replicating a single task history.
|
||||
|
||||
Each client is identified and authenticated with a "client_id key", known only to the server and to the replicas replicating the task history.
|
||||
From the protocol's perspective, replicas accessing the same task history are indistinguishable, so this protocol uses the term "client" to refer generically to all replicas replicating a single task history.
|
||||
|
||||
## Server
|
||||
|
||||
A server implements the requests and responses described below.
|
||||
Where the logic is implemented depends on the specific implementation of the protocol.
|
||||
|
||||
For each client, the server is responsible for storing the task history, in the form of a branch-free sequence of versions.
|
||||
It also stores the latest snapshot, if any exists.
|
||||
From the server's perspective, snapshots and versions are opaque byte sequences.
|
||||
|
||||
* versions: a set of {versionId: UUID, parentVersionId: UUID, historySegment: bytes}
|
||||
* latestVersionId: UUID
|
||||
* snapshotVersionId: UUID
|
||||
* snapshot: bytes
|
||||
## Version Invariant
|
||||
|
||||
For each client, it stores a set of versions as well as the latest version ID, defaulting to the nil UUID.
|
||||
Each version has a version ID, a parent version ID, and a history segment (opaque data containing the operations for that version).
|
||||
The server should maintain the following invariants for each client:
|
||||
The following invariant must always hold:
|
||||
|
||||
1. latestVersionId is nil or exists in the set of versions.
|
||||
2. Given versions v1 and v2 for a client, with v1.versionId != v2.versionId and v1.parentVersionId != nil, v1.parentVersionId != v2.parentVersionId.
|
||||
In other words, versions do not branch.
|
||||
3. If snapshotVersionId is nil, then there is a version with parentVersionId == nil.
|
||||
4. If snapshotVersionId is not nil, then there is a version with parentVersionId = snapshotVersionId.
|
||||
|
||||
Note that versions form a linked list beginning with the latestVersionId stored for the client.
|
||||
This linked list need not continue back to a version with v.parentVersionId = nil.
|
||||
It may end at any point when v.parentVersionId is not found in the set of Versions.
|
||||
This observation allows the server to discard older versions.
|
||||
The third invariant prevents the server from discarding versions if there is no snapshot.
|
||||
The fourth invariant prevents the server from discarding versions newer than the snapshot.
|
||||
> All versions are linked by parent-child relationships to form a single chain.
|
||||
> That is, each version must have no more than one parent and one child, and no more than one version may have zero parents or zero children.
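One way to check the invariant over a stored set of versions is sketched here. The function name and the representation of a version as a `(version_id, parent_version_id)` pair are assumptions made for illustration. A parent ID that is not itself stored, such as the nil UUID or a discarded version, counts as "no parent".

```python
from typing import Hashable, Iterable, Tuple


def is_single_chain(versions: Iterable[Tuple[Hashable, Hashable]]) -> bool:
    """Return True if the (version_id, parent_version_id) pairs form one chain."""
    pairs = list(versions)
    if not pairs:
        return True
    ids = {vid for vid, _ in pairs}
    if len(ids) != len(pairs):
        return False  # duplicate version IDs
    child_of = {}
    for vid, parent in pairs:
        if parent in child_of:
            return False  # a parent with two children is a branch
        child_of[parent] = vid
    # A version has zero parents when its parent is not in the stored set.
    roots = [vid for vid, parent in pairs if parent not in ids]
    if len(roots) != 1:
        return False
    # Walk from the single root; every stored version must be reachable.
    seen, current = 1, roots[0]
    while current in child_of:
        current = child_of[current]
        seen += 1
    return seen == len(pairs)
```

Walking the chain from its root, rather than only counting parents and children, also rules out a detached cycle of versions.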
|
||||
|
||||
## Data Formats
|
||||
|
||||
### Encryption
|
||||
|
||||
The client configuration includes an encryption secret of arbitrary length and a clientId to identify itself.
|
||||
This section describes how that information is used to encrypt and decrypt data sent to the server (versions and snapshots).
|
||||
|
||||
#### Key Derivation
|
||||
|
||||
The client derives the 32-byte encryption key from the configured encryption secret using PBKDF2 with HMAC-SHA256 and 100,000 iterations.
|
||||
The salt is the SHA256 hash of the 16-byte form of the client ID.
|
||||
|
||||
#### Encryption
|
||||
|
||||
The client uses [AEAD](https://commondatastorage.googleapis.com/chromium-boringssl-docs/aead.h.html), with algorithm CHACHA20_POLY1305.
|
||||
The client should generate a random nonce, noting that AEAD is _not secure_ if a nonce is used repeatedly for the same key.
|
||||
|
||||
AEAD supports additional authenticated data (AAD) which must be provided for both open and seal operations.
|
||||
In this protocol, the AAD is always 17 bytes of the form:
|
||||
* `app_id` (byte) - always 1
|
||||
* `version_id` (16 bytes) - 16-byte form of the version ID associated with this data
|
||||
* for versions (AddVersion, GetChildVersion), the _parent_ version_id
|
||||
* for snapshots (AddSnapshot, GetSnapshot), the snapshot version_id
|
||||
|
||||
The `app_id` field is for future expansion to handle other, non-task data using this protocol.
|
||||
Including it in the AAD ensures that such data cannot be confused with task data.
|
||||
|
||||
Although the AEAD specification distinguishes ciphertext and tags, for purposes of this specification they are considered concatenated into a single bytestring as in BoringSSL's `EVP_AEAD_CTX_seal`.
|
||||
|
||||
#### Representation
|
||||
|
||||
The final byte-stream has the following structure:
|
||||
|
||||
* `version` (byte) - format version (always 1)
|
||||
* `nonce` (12 bytes) - encryption nonce
|
||||
* `ciphertext` (remaining bytes) - ciphertext from sealing operation
|
||||
|
||||
The `version` field identifies this data format, and future formats will have a value other than 1 in this position.
|
||||
Task data sent to the server is encrypted by the client, using the scheme described in the "Encryption" chapter.
|
||||
|
||||
### Version
|
||||
|
||||
The decrypted form of a version is a JSON array containing operations in the order they should be applied.
|
||||
Each operation has the form `{TYPE: DATA}`, for example:
|
||||
|
||||
* `{"Create":{"uuid":"56e0be07-c61f-494c-a54c-bdcfdd52d2a7"}}`
|
||||
* `{"Delete":{"uuid":"56e0be07-c61f-494c-a54c-bdcfdd52d2a7"}}`
|
||||
* `{"Update":{"uuid":"56e0be07-c61f-494c-a54c-bdcfdd52d2a7","property":"prop","value":"v","timestamp":"2021-10-11T12:47:07.188090948Z"}}`
|
||||
* `{"Update":{"uuid":"56e0be07-c61f-494c-a54c-bdcfdd52d2a7","property":"prop","value":null,"timestamp":"2021-10-11T12:47:07.188090948Z"}}` (to delete a property)
|
||||
* `[{"Create":{"uuid":"56e0be07-c61f-494c-a54c-bdcfdd52d2a7"}}]`
|
||||
* `[{"Delete":{"uuid":"56e0be07-c61f-494c-a54c-bdcfdd52d2a7"}}]`
|
||||
* `[{"Update":{"uuid":"56e0be07-c61f-494c-a54c-bdcfdd52d2a7","property":"prop","value":"v","timestamp":"2021-10-11T12:47:07.188090948Z"}}]`
|
||||
* `[{"Update":{"uuid":"56e0be07-c61f-494c-a54c-bdcfdd52d2a7","property":"prop","value":null,"timestamp":"2021-10-11T12:47:07.188090948Z"}}]` (to delete a property)
|
||||
|
||||
Timestamps are in RFC3339 format with a `Z` suffix.
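A decrypted version can be read with any JSON parser. The following sketch uses a hypothetical helper named `parse_version`; it splits each operation into its type and data without interpreting the data further.

```python
import json
from typing import Any, Dict, List, Tuple


def parse_version(decrypted: bytes) -> List[Tuple[str, Dict[str, Any]]]:
    """Parse a decrypted version into (TYPE, DATA) pairs, in application order."""
    operations = json.loads(decrypted)
    if not isinstance(operations, list):
        raise ValueError("a version must be a JSON array of operations")
    parsed = []
    for op in operations:
        # Each operation is an object with exactly one key: {TYPE: DATA}.
        if not isinstance(op, dict) or len(op) != 1:
            raise ValueError("each operation must have the form {TYPE: DATA}")
        ((op_type, data),) = op.items()
        parsed.append((op_type, data))
    return parsed
```

For example, parsing `[{"Create":{"uuid":"56e0be07-c61f-494c-a54c-bdcfdd52d2a7"}}]` gives a single pair whose type is `Create` and whose data holds the task UUID.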
|
||||
|
||||
|
@ -108,24 +59,25 @@ For example (pretty-printed for clarity):
|
|||
|
||||
## Transactions
|
||||
|
||||
All interactions between the client and server are defined in terms of request/response transactions, as described here.
|
||||
|
||||
### AddVersion
|
||||
|
||||
The AddVersion transaction requests that the server add a new version to the client's task history.
|
||||
The request contains the following:
|
||||
|
||||
* parent version ID
|
||||
* history segment
|
||||
* parent version ID, and
|
||||
* encrypted version data.
|
||||
|
||||
The server determines whether the new version is acceptable, atomically with respect to other requests for the same client.
|
||||
If it has no versions for the client, it accepts the version.
|
||||
If it already has one or more versions for the client, then it accepts the version only if the given parent version ID matches its stored latest parent ID.
|
||||
If it already has one or more versions for the client, then it accepts the version only if the given parent version has no children, thereby maintaining the version invariant.
|
||||
|
||||
If the version is accepted, the server generates a new version ID for it.
|
||||
The version is added to the set of versions for the client, and the client's latest version ID is set to the new version ID.
|
||||
The new version ID is returned in the response to the client.
|
||||
The version is added to the chain of versions for the client, and the new version ID is returned in the response to the client.
|
||||
The response may also include a request for a snapshot, with associated urgency.
|
||||
|
||||
If the version is not accepted, the server makes no changes, but responds to the client with a conflict indication containing the latest version ID.
|
||||
If the version is not accepted, the server makes no changes, but responds to the client with a conflict indication containing the ID of the version which has no children.
|
||||
The client may then "rebase" its operations and try again.
|
||||
Note that if a client receives two conflict responses with the same parent version ID, it is an indication that the client's version history has diverged from that on the server.
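The acceptance rule can be illustrated with a toy, single-client, in-memory server. This is a sketch under stated assumptions, not the TaskChampion implementation: the class and method names are hypothetical, and the atomicity the protocol requires is trivially satisfied only because the sketch is single-threaded.

```python
import uuid
from typing import Dict, Optional, Tuple

NIL = uuid.UUID(int=0)


class InMemoryServer:
    """Toy store for one client's version chain."""

    def __init__(self) -> None:
        # Maps parent version ID -> (version ID, encrypted version data).
        self._by_parent: Dict[uuid.UUID, Tuple[uuid.UUID, bytes]] = {}
        self._latest: uuid.UUID = NIL  # the version that has no children

    def add_version(self, parent: uuid.UUID, data: bytes) -> Tuple[bool, uuid.UUID]:
        """Return (True, new_id) if accepted, else (False, childless_version_id)."""
        if self._by_parent and parent != self._latest:
            return False, self._latest  # conflict: the parent already has a child
        new_id = uuid.uuid4()
        self._by_parent[parent] = (new_id, data)
        self._latest = new_id
        return True, new_id

    def get_child_version(self, parent: uuid.UUID) -> Optional[Tuple[uuid.UUID, bytes]]:
        return self._by_parent.get(parent)
```

A replica that receives a conflict would fetch the missing versions with `get_child_version`, rebase its local operations, and retry with the returned version ID as the new parent.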
|
||||
|
||||
|
@ -138,23 +90,17 @@ If found, it returns the version's
|
|||
|
||||
* version ID,
|
||||
* parent version ID (matching that in the request), and
|
||||
* history segment.
|
||||
* encrypted version data.
|
||||
|
||||
The response is either a version (success), _not-found_, or _gone_, as determined by the first of the following to apply:
|
||||
* If a version with parentVersionId equal to the requested parentVersionId exists, it is returned.
|
||||
* If the requested parentVersionId is the nil UUID ..
|
||||
* ..and snapshotVersionId is nil, the response is _not-found_ (the client has no versions).
|
||||
* ..and snapshotVersionId is not nil, the response is _gone_ (the first version has been deleted).
|
||||
* If a version with versionId equal to the requested parentVersionId exists, the response is _not-found_ (the client is up-to-date)
|
||||
* Otherwise, the response is _gone_ (the requested version has been deleted).
|
||||
If not found, it returns an indication that no such version exists.
|
||||
|
||||
### AddSnapshot
|
||||
|
||||
The AddSnapshot transaction requests that the server store a new snapshot, generated by the client.
|
||||
The request contains the following:
|
||||
|
||||
* version ID at which the snapshot was made
|
||||
* snapshot data (opaque to the server)
|
||||
* version ID at which the snapshot was made, and
|
||||
* encrypted snapshot data.
|
||||
|
||||
The server should validate that the snapshot is for an existing version and is newer than any existing snapshot.
|
||||
It may also validate that the snapshot is for a "recent" version (e.g., one of the last 5 versions).
|
||||
|
@ -167,66 +113,3 @@ The server response is empty.
|
|||
The GetSnapshot transaction requests that the server provide the latest snapshot.
|
||||
The response contains the snapshot version ID and the snapshot data, if those exist.
|
||||
|
||||
## HTTP Representation
|
||||
|
||||
The transactions above are realized for an HTTP server at `<origin>` using the HTTP requests and responses described here.
|
||||
The `origin` *should* be an HTTPS endpoint on general principle, but nothing in the functionality or security of the protocol depends on connection encryption.
|
||||
|
||||
The replica identifies itself to the server using a `client_id` in the form of a UUID.
|
||||
This value is passed with every request in the `X-Client-Id` header, in its dashed-hex format.
|
||||
|
||||
### AddVersion
|
||||
|
||||
The request is a `POST` to `<origin>/v1/client/add-version/<parentVersionId>`.
|
||||
The request body contains the history segment, optionally encoded using any encoding supported by actix-web.
|
||||
The content-type must be `application/vnd.taskchampion.history-segment`.
|
||||
|
||||
The success response is a 200 OK with an empty body.
|
||||
The new version ID appears in the `X-Version-Id` header.
|
||||
If included, a snapshot request appears in the `X-Snapshot-Request` header with value `urgency=low` or `urgency=high`.
|
||||
|
||||
On conflict, the response is a 409 CONFLICT with an empty body.
|
||||
The expected parent version ID appears in the `X-Parent-Version-Id` header.
|
||||
|
||||
Other error responses (4xx or 5xx) may be returned and should be treated appropriately to their meanings in the HTTP specification.
|
||||
|
||||
### GetChildVersion
|
||||
|
||||
The request is a `GET` to `<origin>/v1/client/get-child-version/<parentVersionId>`.
|
||||
|
||||
The response is determined as described above.
|
||||
The _not-found_ response is 404 NOT FOUND.
|
||||
The _gone_ response is 410 GONE.
|
||||
Neither has a response body.
|
||||
|
||||
On success, the response is a 200 OK.
|
||||
The version's history segment is returned in the response body, with content-type `application/vnd.taskchampion.history-segment`.
|
||||
The version ID appears in the `X-Version-Id` header.
|
||||
The response body may be encoded, in accordance with any `Accept-Encoding` header in the request.
|
||||
|
||||
On failure, a client should treat a 404 NOT FOUND as indicating that it is up-to-date.
|
||||
Clients should treat a 410 GONE as a synchronization error.
|
||||
If the client has pending changes to send to the server, based on a now-removed version, then those changes cannot be reconciled and will be lost.
|
||||
The client should, optionally after consulting the user, download and apply the latest snapshot.
|
||||
|
||||
### AddSnapshot
|
||||
|
||||
The request is a `POST` to `<origin>/v1/client/add-snapshot/<versionId>`.
|
||||
The request body contains the snapshot data, optionally encoded using any encoding supported by actix-web.
|
||||
The content-type must be `application/vnd.taskchampion.snapshot`.
|
||||
|
||||
If the version is invalid, as described above, the response should be 400 BAD REQUEST.
|
||||
The server response should be 200 OK on success.
|
||||
|
||||
### GetSnapshot
|
||||
|
||||
The request is a `GET` to `<origin>/v1/client/snapshot`.
|
||||
|
||||
The response is a 200 OK.
|
||||
The snapshot is returned in the response body, with content-type `application/vnd.taskchampion.snapshot`.
|
||||
The version ID appears in the `X-Version-Id` header.
|
||||
The response body may be encoded, in accordance with any `Accept-Encoding` header in the request.
|
||||
|
||||
After downloading and decrypting a snapshot, a client must replace its entire local task database with the content of the snapshot.
|
||||
Any local operations that had not yet been synchronized must be discarded.
|
||||
After the snapshot is applied, the client should begin the synchronization process again, starting from the snapshot version.
|
||||
|
|