
Add support for multiple instances of databases backed by DurableObject

Open JayJamieson opened this issue 1 year ago • 6 comments

Is your feature request related to a problem? Please describe. Currently only a single database DurableObject instance is supported, as per the code here and here.

It's not necessarily a bad thing, but it can be limiting in terms of storage capacity and the ability to isolate data between users or use cases.

Describe the solution you'd like As I understand from the documentation, it should be possible to create multiple DurableObject instances, each with separate storage and I/O.

It should be possible to use either idFromName(name: string) or newUniqueId() to create new DurableObject instances with separate storage from one another.

  • idFromName(name: string) could take a URL parameter from a request to create an instance of a DurableObject. The name would then need to be saved somewhere centrally and linked to a user.
    • the name could/should be prefixed with a user identifier to avoid accessing other users' DurableObject instances
  • newUniqueId() could similarly be used as an alternative if the provided name is not URL safe

This would mean implementing additional routes for provisioning, deleting, and listing databases. The existing setup can still be used as a means to provision a "default" DurableObject instance named sql-durable-object to avoid breaking existing functionality.
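To make this concrete, here is a rough sketch of what the worker side could look like. The DATABASE_DURABLE_OBJECT binding matches the existing setup; the helper names, the user-prefix scheme, and the Env shape are just illustrative assumptions, not existing starbasedb code.

interface Env {
  DATABASE_DURABLE_OBJECT: DurableObjectNamespace;
}

// Deterministic per-user instance via idFromName(); prefixing the name with
// the user identifier keeps one user from addressing another user's instance.
function getUserDatabaseStub(env: Env, userId: string, dbName = 'default'): DurableObjectStub {
  const id = env.DATABASE_DURABLE_OBJECT.idFromName(`${userId}:${dbName}`);
  return env.DATABASE_DURABLE_OBJECT.get(id);
}

// Alternative via newUniqueId() when the requested name is not URL safe;
// the generated id would need to be persisted centrally (e.g. in KV) and
// linked back to the user so it can be looked up on later requests.
function provisionDatabase(env: Env): { id: DurableObjectId; stub: DurableObjectStub } {
  const id = env.DATABASE_DURABLE_OBJECT.newUniqueId();
  return { id, stub: env.DATABASE_DURABLE_OBJECT.get(id) };
}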


Additional context I'm not 100% across how Cloudflare works, but I'm happy to help with some of the changes if this aligns with starbasedb.

JayJamieson avatar Nov 04 '24 03:11 JayJamieson

What you present here is great and I think it unlocks a lot of potential moving forward. The main use case you describe is more around making a single StarbaseDB instance have N number of DO storage instances attached to it, one per user. I think that is one valid approach.

A second valid approach is thinking that a single StarbaseDB instance is already narrowed down "per user" (aka, you spin one Starbase up per user you have), and then the connection of N number of DO storage instances attached to it can be for breaking past the limitations that Durable Objects have, such as the 1GB/10GB data limits. Think more along the lines of data sharding as one example, or distributed read databases (DOs) with a single write instance.

So I guess my thoughts are:

A) Each StarbaseDB instance can manage N number of individual user durable objects attached to it
B) Each StarbaseDB is inherently a per-user deployment, and then the N number of attached durable objects is for sharding or data replication

Let me know what your thoughts are. Maybe there's a way to make it versatile enough to cover both cases? Maybe it's a choose your own adventure path? Just some raw thoughts here – but overall I think what you're pitching is very much in line with what I've seen people ask for here!

Brayden avatar Nov 04 '24 14:11 Brayden

So I guess my thoughts are:

A) Each StarbaseDB instance can manage N number of individual user durable objects attached to it
B) Each StarbaseDB is inherently a per-user deployment, and then the N number of attached durable objects is for sharding or data replication

I like this way of thinking; it makes a lot of sense and covers a use case I had in mind. If you want, I can diagram the use cases to get a better idea of what code changes to make. Happy to start on a draft PR after things are worked out ;)

JayJamieson avatar Nov 04 '24 23:11 JayJamieson

Diagram away!

Brayden avatar Nov 04 '24 23:11 Brayden

I've put together a sequence diagram of what I understand and two high-level diagrams of what you and I are thinking about. Let me know if this matches your mental model and if some adjusting needs to be made.

Couple of options for where the lookup information can live; for simplicity I've just used a Cloudflare KV binding. I have only covered the most basic usage from options A and B: each StarbaseDB can manage N number of individual user DOs attached to it, and each StarbaseDB is inherently a per-user deployment.

What I haven't worked out is how to tie individual user durable objects into the Outerbase UI.

sequenceDiagram
    participant C as Worker Router
    participant E as Endpoint /query/raw
    participant A as Auth Worker
    participant K as DATABASE_KV
    participant D as DATABASE_DURABLE_OBJECT
    participant R as userDBStub

    C->>+E: HTTP Request
    E->>+A: Extract userID from token
    A-->>-E: userID
    
    E->>+K: env.DATABASE_KV.get(`${userId}_db`)
    K-->>-E: databaseID
    
    E->>+D: env.DATABASE_DURABLE_OBJECT.idFromName(databaseId)
    D-->>-E: userDBStub
    
    E->>+R: userDBStub.query("select * from sales")
    R-->>-E: query results
    
    E-->>-C: HTTP Response
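To make the diagram concrete, here is a rough TypeScript sketch of the /query/raw flow above. The auth helper, the KV key format, and the stub's fetch interface are assumptions for illustration, not the actual starbasedb internals.

interface Env {
  DATABASE_KV: KVNamespace;
  DATABASE_DURABLE_OBJECT: DurableObjectNamespace;
}

// Placeholder for the Auth Worker step in the diagram.
declare function extractUserIdFromToken(authorizationHeader: string | null): Promise<string>;

async function handleRawQuery(request: Request, env: Env): Promise<Response> {
  // Auth Worker: resolve the user from the bearer token.
  const userId = await extractUserIdFromToken(request.headers.get('Authorization'));

  // DATABASE_KV: map the user to their database's DurableObject name.
  const databaseId = await env.DATABASE_KV.get(`${userId}_db`);
  if (!databaseId) {
    return new Response('No database provisioned for this user', { status: 404 });
  }

  // DATABASE_DURABLE_OBJECT: resolve the user's stub by name.
  const stub = env.DATABASE_DURABLE_OBJECT.get(
    env.DATABASE_DURABLE_OBJECT.idFromName(databaseId)
  );

  // userDBStub: forward the query; shown here as a fetch to the DO since the
  // exact query interface depends on how the DurableObject exposes it.
  const doResponse = await stub.fetch('https://do/query/raw', {
    method: 'POST',
    body: JSON.stringify({ sql: 'select * from sales' }),
  });

  return new Response(await doResponse.text(), {
    headers: { 'Content-Type': 'application/json' },
  });
}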

(single_user diagram attachment)

(multi_user diagram attachment)

JayJamieson avatar Nov 06 '24 09:11 JayJamieson

@JayJamieson Apologies for the time it's taken for me to respond back to this. I've spent a lot of time trying to internalize the approach and how to move forward. But before I go any further.... wow, do you do amazing diagramming work! 😂

Figured instead of assuming on my end, I'd just ask a follow-up question here. With your proposal, do you think both diagrams could work with your approach, or are you weighing going in only one of the two directions? If you think your proposed solution can handle both scenarios, then I think I'm all for it.

The bigger concern will come later, and not necessarily in this effort: how to handle treating a collection of Durable Objects as a "single instance" with sharding – but again... not your problem here!

Brayden avatar Nov 12 '24 02:11 Brayden

@Brayden I would imagine some sort of mode that can be configured at deployment time to enable both use cases.

For single tenant, this request would run against the default DurableObject created with the static sql-durable-object name.

curl --location 'https://starbasedb.YOUR-IDENTIFIER.workers.dev/rest/<your_table_name>' \
--header 'Authorization: Bearer ABC-123' \
--header 'Content-Type: application/x-www-form-urlencoded' \
--data-urlencode 'Content-type=application/json'

To keep things simple, let's think in terms of an Authorization token/API key; that way we don't have to think about user registration/management.

  1. Assume the ABC-321 token is used and multi-DB instance mode is enabled.
  2. The ABC-321 token will be used to perform a lookup for a DurableObject name unique to ABC-321 from KV storage, instead of the static sql-durable-object name.
  3. If no database path parameter is provided, then this is the ABC-321 default database, equivalent to using the static sql-durable-object DurableObject name.
curl --location 'https://starbasedb.YOUR-IDENTIFIER.workers.dev/rest/<your_table_name>' \
--header 'Authorization: Bearer ABC-321' \
--header 'Content-Type: application/x-www-form-urlencoded' \
--data-urlencode 'Content-type=application/json'

Additionally, if a path parameter is provided:

  1. we do a lookup for the same DurableObject name for the token ABC-321,
  2. append <your_db_name> as either a prefix or postfix to the name, and
  3. use that string to get a DurableObject reference.
curl --location 'https://starbasedb.YOUR-IDENTIFIER.workers.dev/rest/<your_db_name>/<your_table_name>' \
--header 'Authorization: Bearer ABC-321' \
--header 'Content-Type: application/x-www-form-urlencoded' \
--data-urlencode 'Content-type=application/json'

Maybe some sort of method like resolveDurableObjectName(config, token) that checks the config for the deployment mode and handles the lookups as necessary.
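Something along these lines — the config shape, mode names, and KV key format here are all assumptions just to illustrate the idea, not a definitive implementation:

type DeploymentMode = 'single' | 'multi';

interface StarbaseConfig {
  mode: DeploymentMode;
}

const DEFAULT_NAME = 'sql-durable-object';

async function resolveDurableObjectName(
  config: StarbaseConfig,
  token: string,
  kv: KVNamespace,
  dbName?: string
): Promise<string> {
  // Single-tenant mode keeps today's behaviour: one static instance name.
  if (config.mode === 'single') {
    return DEFAULT_NAME;
  }

  // Multi-DB mode: look up the DurableObject name registered for this token.
  const baseName = await kv.get(`token:${token}`);
  if (!baseName) {
    throw new Error('No database registered for this token');
  }

  // No path parameter means the token's default database; otherwise append
  // the database name so each named database maps to its own instance.
  return dbName ? `${baseName}:${dbName}` : baseName;
}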

JayJamieson avatar Nov 13 '24 10:11 JayJamieson