xo-sync-server-hono

Move SSE to be a middleware on the HTTP-Router. We want to do this so we can completely abstract away the "subscribing" from the routes themselves. We could also do this by putting a "subscribe()" method on the RouteRequest object, but I'm undecided on whether that is the better approach.

Handling subscribers

Each "user" is going to subscribe to many "storageIdentifiers". Having a Record<storageIdentifier, Subscriber[]> would eat up a bunch of memory because the same Subscribers would be repeated across many arrays. Instead, we will maintain an array of storageIdentifiers (string[]) and index those identifiers in a separate Record<Subscriber, Record<storageIdentifier, boolean>>.

When we want to broadcast an event, we iterate through the storageIdentifiers array and find the index. We then iterate over all the subscribers and check if subscriber[storageIdentifier] === true. It's O(N), where N is the number of subscribers. There may be a faster way to do this in practice, as a key lookup per subscriber is slower than simply iterating over an array, but it's a relatively simple approach, IMO. Better ideas are welcome, though.
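A minimal sketch of the layout and broadcast loop described above, to make the O(N) scan concrete. All names here (SubscriberRegistry, send, subscribe, broadcast) are hypothetical. A Map is assumed for the per-subscriber index, because object values cannot be used as keys of a plain Record.

```typescript
// Hypothetical names throughout; a sketch, not the actual implementation.
type StorageIdentifier = string;

interface Subscriber {
  // Called once per event for a storage this subscriber follows.
  send: (storageIdentifier: StorageIdentifier, event: string) => void;
}

class SubscriberRegistry {
  // Every storage identifier that has been subscribed to.
  private storageIdentifiers: StorageIdentifier[] = [];

  // Per-subscriber index of followed identifiers.
  // Assumption: a Map stands in for Record<Subscriber, ...>.
  private subscriptions = new Map<Subscriber, Record<StorageIdentifier, boolean>>();

  subscribe(subscriber: Subscriber, storageIdentifier: StorageIdentifier): void {
    if (!this.storageIdentifiers.includes(storageIdentifier)) {
      this.storageIdentifiers.push(storageIdentifier);
    }
    const index = this.subscriptions.get(subscriber) ?? {};
    index[storageIdentifier] = true;
    this.subscriptions.set(subscriber, index);
  }

  unsubscribe(subscriber: Subscriber): void {
    this.subscriptions.delete(subscriber);
  }

  // Returns how many subscribers received the event.
  broadcast(storageIdentifier: StorageIdentifier, event: string): number {
    // Unknown identifier: nobody can be subscribed, so skip the scan.
    if (!this.storageIdentifiers.includes(storageIdentifier)) return 0;

    let delivered = 0;
    // O(N) over all subscribers, one key lookup each.
    this.subscriptions.forEach((index, subscriber) => {
      if (index[storageIdentifier] === true) {
        subscriber.send(storageIdentifier, event);
        delivered++;
      }
    });
    return delivered;
  }
}
```

Pruning the storageIdentifiers array is left out of this sketch.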

As for maintaining the array of storageIdentifiers, it's probably simplest just to prune it on a routine schedule. We could also look at doing something similar to Redis, where it creates a second instance of the data and swaps between them with the more up-to-date version. Basically, it's so they can multi-thread read vs. write (I think; I haven't looked too much into it, I just briefly saw something about Redis maintaining a shadow version of the cache).

Endpoints

Read from storage

(All will return an SSE response if the Accept header is text/event-stream)
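As a rough illustration of the check implied above, the middleware could decide between a normal response and an SSE response with something like the following. The helper name wantsEventStream is hypothetical and not part of any router API, and quality values (q=) are ignored in this sketch.

```typescript
// Hypothetical helper: true when the Accept header lists text/event-stream.
function wantsEventStream(acceptHeader: string | null | undefined): boolean {
  if (!acceptHeader) return false;
  return acceptHeader
    .split(",")
    // Drop parameters such as ";q=0.9", then normalize whitespace and case.
    .map((part) => part.split(";")[0].trim().toLowerCase())
    .some((mediaType) => mediaType === "text/event-stream");
}
```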

GET /get?id=string&bloomFilter=string

GET /get?items=string[] | { id: string, bloomFilter: string }[]

POST /get
{
  items: string | string[] | { id: string, bloomFilter: string }[]
}
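Since items can arrive in three shapes, the handler will likely want to normalize them first. A sketch of one way to do that follows. ReadItem and normalizeItems are hypothetical names, and treating a bare string id as "no bloom filter supplied" is an assumption.

```typescript
// Hypothetical normalized shape: bloomFilter is absent for bare string ids.
interface ReadItem {
  id: string;
  bloomFilter?: string;
}

type ItemsInput = string | string[] | { id: string; bloomFilter: string }[];

function normalizeItems(items: ItemsInput): ReadItem[] {
  // A single id becomes a one-element list.
  const list: Array<string | { id: string; bloomFilter: string }> =
    typeof items === "string" ? [items] : items;

  return list.map((item) =>
    typeof item === "string"
      ? { id: item }
      : { id: item.id, bloomFilter: item.bloomFilter }
  );
}
```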

Appending to a storage

Storages are append-only; you cannot set or delete a storage.

POST /set/:id
{
  value: string
}

Syncing

To avoid syncing the entire event history every time a client connects, we allow inverse-bloom-filters to be passed into the endpoints on reads. But to create an inverse-bloom-filter, there are negotiation steps that need to be followed to ensure both parties are computing valid inverse-bloom-filters.

--- Inverse Bloom Filters will give false negatives for items, meaning one may return false when an item does exist in the list, unlike a traditional bloom filter, which may return true for an item that is not in the list ---

To allow for negotiation, we provide

GET /bloom?bloomFilter=string&count=number

POST /bloom
{
  items: {
    bloomFilter: string
    count: number
  }
}

Because the larger count must be used for the bloom filter, the client should always compute a bloom filter and send it. If the server has a larger number of events, it will compute its own filter and return that new count + filter to the client.

The response to the request will just be a list of hashes of encrypted data: the items which the client (or server?) is missing.
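The "larger count wins" rule could be sketched roughly as below. BloomOffer, negotiate and computeServerFilter are hypothetical names. Returning the client's offer unchanged when the server does not have more events is an assumption, since that case is not spelled out here.

```typescript
// Hypothetical shape mirroring the /bloom request body.
interface BloomOffer {
  bloomFilter: string;
  count: number;
}

function negotiate(
  clientOffer: BloomOffer,
  serverCount: number,
  // Assumed callback that builds the server's filter for a given count.
  computeServerFilter: (count: number) => string
): BloomOffer {
  if (serverCount > clientOffer.count) {
    // Server has more events: it computes its own filter and returns it
    // together with the new count.
    return { bloomFilter: computeServerFilter(serverCount), count: serverCount };
  }
  // Assumption: otherwise the client's filter and count are used as-is.
  return clientOffer;
}
```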