spicedb Implement a YugaByteDB datastore


Creating this issue to gauge interest.

Feb 23 '22 20:02 jzelinskie

@jzelinskie It seems almost 10 people could be interested in a YugabyteDB datastore.

As a test, I tried running the PostgreSQL tests against YugabyteDB but hit two issues:

ALTER TABLE namespace_config ADD COLUMN id BIGSERIAL PRIMARY KEY fails because YugabyteDB does not yet support adding another primary key, since sharding is based on the primary key. I guess that could be solved with a new migration script.

After fixing the previous issue, I also get the following error:

?       github.com/authzed/spicedb/internal/datastore/postgres/common   [no test files]
?       github.com/authzed/spicedb/internal/datastore/postgres/migrations       [no test files]
?       github.com/authzed/spicedb/internal/datastore/postgres/version  [no test files]
--- FAIL: TestPostgresDatastoreWithoutCommitTimestamps (0.00s)
    --- FAIL: TestPostgresDatastoreWithoutCommitTimestamps/postgres-13.8 (23.89s)
        postgres.go:113: 
                Error Trace:    /Users/gwenn/work/github/spicedb/internal/testserver/datastore/postgres.go:113
                                                        /Users/gwenn/work/github/spicedb/internal/datastore/postgres/postgres_test.go:200
                                                        /Users/gwenn/work/github/spicedb/pkg/datastore/test/datastore.go:33
                                                        /Users/gwenn/work/github/spicedb/pkg/datastore/test/namespace.go:38
                                                        /Users/gwenn/work/github/spicedb/pkg/datastore/test/datastore.go:72
                Error:          Received unexpected error:
                                unable to compute head revision: multiple or zero head revisions found: [add-rel-by-alive-resource-relation-subject add-unique-datastore-id]
                Test:           TestPostgresDatastoreWithoutCommitTimestamps/postgres-13.8
        --- FAIL: TestPostgresDatastoreWithoutCommitTimestamps/postgres-13.8/TestNamespaceNotFound (5.08s)
            testing.go:1490: test executed panic(nil) or runtime.Goexit: subtest may have called FailNow on a parent test
--- FAIL: TestPostgresDatastore (0.00s)
    --- FAIL: TestPostgresDatastore/postgres-13.8-head- (26.66s)
        postgres.go:113: 
                Error Trace:    /Users/gwenn/work/github/spicedb/internal/testserver/datastore/postgres.go:113
                                                        /Users/gwenn/work/github/spicedb/internal/datastore/postgres/postgres_test.go:75
                                                        /Users/gwenn/work/github/spicedb/pkg/datastore/test/datastore.go:33
                                                        /Users/gwenn/work/github/spicedb/pkg/datastore/test/namespace.go:38
                                                        /Users/gwenn/work/github/spicedb/pkg/datastore/test/datastore.go:72
                Error:          Received unexpected error:
                                unable to compute head revision: multiple or zero head revisions found: [add-rel-by-alive-resource-relation-subject add-unique-datastore-id]
                Test:           TestPostgresDatastore/postgres-13.8-head-
        --- FAIL: TestPostgresDatastore/postgres-13.8-head-/TestNamespaceNotFound (5.28s)
            testing.go:1490: test executed panic(nil) or runtime.Goexit: subtest may have called FailNow on a parent test
FAIL
FAIL    github.com/authzed/spicedb/internal/datastore/postgres  27.739s
FAIL
Error: running "go ./internal/datastore/postgres/..." failed with exit code 1
exit status 1

For that one, I am not sure what the issue is.
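One plausible reading of the "multiple or zero head revisions found" message is that the migration chain forked, so two migrations are both unreplaced. The sketch below illustrates that idea only. It assumes each migration records the version it replaces; the `migration` type, the `headRevisions` function, and the `custom-yugabyte-fix` name are hypothetical and are not taken from SpiceDB's code.

```go
package main

import (
	"fmt"
	"sort"
)

// migration is a hypothetical record: each migration names the version it replaces.
type migration struct {
	version  string
	replaces string // empty for the first migration in the chain
}

// headRevisions returns every version that no other migration replaces.
// A well-formed linear chain has exactly one such version.
func headRevisions(migrations []migration) []string {
	replaced := make(map[string]bool, len(migrations))
	for _, m := range migrations {
		if m.replaces != "" {
			replaced[m.replaces] = true
		}
	}
	var heads []string
	for _, m := range migrations {
		if !replaced[m.version] {
			heads = append(heads, m.version)
		}
	}
	sort.Strings(heads)
	return heads
}

func main() {
	// Two migrations both replace "base", so the chain forks into two heads.
	forked := []migration{
		{version: "base"},
		{version: "add-unique-datastore-id", replaces: "base"},
		{version: "custom-yugabyte-fix", replaces: "base"}, // hypothetical extra migration
	}
	heads := headRevisions(forked)
	if len(heads) != 1 {
		fmt.Println("multiple or zero head revisions found:", heads)
	}
}
```

If that reading is right, a new migration script would need to replace the current head rather than an earlier version, so that exactly one head remains.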

I think creating a YugabyteDB datastore shouldn't be that hard, as the compatibility with PostgreSQL is high.

Oct 25 '23 08:10 shinji62

@shinji62 We don't think implementing Yugabyte would be difficult; however, adding a new datastore needs to be carefully analyzed. Each datastore has its own quirks and limitations, and evolving SpiceDB with N datastores underlying it adds considerable engineering and maintenance overhead. Just deciphering the mysteries of each database's query planner takes a non-trivial amount of time, and then there is making sure performance does not regress as the service on top evolves, migrations, testing with multiple versions, accounting for the specifics of each datastore's client configuration, tuning...

It's not difficult, but it does not come for free.

Oct 25 '23 08:10 vroldanbet

@vroldanbet First, an apology: I wasn't trying to underestimate the work. I fully agree with what you say, it does not come for free. For the "client configuration / tuning" and so on, I would be more than happy to help.

Thanks for the quick answer.

Oct 25 '23 08:10 shinji62

@shinji62 no need to apologize! It was a good opportunity to shed some light on what it takes to add a new datastore. We maintainers haven't done a good job of clarifying what it takes to support a new database technology in SpiceDB.

We are certainly keeping an eye on database technologies that align with SpiceDB requirements (strong consistency, global distribution, horizontal scalability and exposed MVCC semantics). I'm not familiar with how Yugabyte supports these, do you have some insight?

Oct 25 '23 09:10 vroldanbet

@vroldanbet let me respond to your question

Overall, YugabyteDB is a distributed SQL database that provides two APIs, one PostgreSQL-compatible (almost 100%, we use the PostgreSQL codebase) and one Cassandra-compatible, on top of a distributed storage layer that handles automatic sharding, distributed transactions, and so on. YugabyteDB is a CP database with strong HA.

strong consistency: This is one of the core features. As we are a CP database, we have strong consistency and transactional support for both the PostgreSQL API and the Cassandra one.

global distribution: Not sure what you mean by global distribution, but we do have geo-replication and geo-distribution, for example pinning certain data to a region for compliance.

horizontal scalability: YugabyteDB is a shared-nothing, distributed database, so scale-out is one of the core features as well.

exposed MVCC semantics: Not sure what you mean by that.

Oct 26 '23 01:10 shinji62

@shinji62

> strong consistency: This is one of the core features. As we are a CP database, we have strong consistency and transactional support for both the PostgreSQL API and the Cassandra one.

Sounds good. Just to make sure we are talking about the same thing, because those terms tend to be overloaded: I'm referring to the strict serializability isolation level and external consistency. I emphasize this because SpiceDB requires it, and, for example, CockroachDB, being an open-source implementation of Spanner, has some caveats in its isolation/consistency guarantees that are problematic for SpiceDB.

> global distribution: Not sure what you mean by global distribution, but we do have geo-replication and geo-distribution, for example pinning certain data to a region for compliance.

My apologies for using such a vague term. Geo-replication and geo-distribution do not necessarily describe precisely how the system behaves when writes are distributed across the globe.

SpiceDB requires a database capable of providing the strongest levels of isolation and consistency when distributed globally. It means providing the same strong guarantees but with multiple reads and writes distributed worldwide. Single-node architectures like Postgres/MySQL are out of the equation here, and if my recollection of Cassandra is correct, reaching consensus of N nodes distributed around the world will likely be very slow, but I'm not sure that's how YugaByte works. Not that Spanner and CockroachDB are blazing fast in this regard, but they are designed with that use-case in mind.

> exposed MVCC semantics: Not sure what you mean by that.

The database should be able to return query results at a given snapshot. This is fundamental to SpiceDB's bounded staleness, which is the trick to make it scale. While this is not a _hard_ requirement, the bookkeeping needed to layer a snapshotting system on top of the database is additional overhead.
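To make "query results at a given snapshot" concrete, here is a minimal sketch using a toy in-memory multi-version store. Everything in it (the `store` and `version` types, `write`, `readAt`, and the key format) is a hypothetical illustration, not SpiceDB's datastore interface or any database's API. Every write is tagged with a revision, and a read at revision R sees only versions written at or before R.

```go
package main

import "fmt"

// version is one committed value of a key, tagged with the revision that wrote it.
type version struct {
	revision uint64
	value    string
}

// store is a toy in-memory multi-version store (illustrative only).
type store struct {
	head     uint64
	versions map[string][]version // per key, in ascending revision order
}

func newStore() *store {
	return &store{versions: make(map[string][]version)}
}

// write commits a new value and returns the revision at which it became visible.
func (s *store) write(key, value string) uint64 {
	s.head++
	s.versions[key] = append(s.versions[key], version{revision: s.head, value: value})
	return s.head
}

// readAt returns the value of key as of the given revision, ignoring later writes.
func (s *store) readAt(key string, revision uint64) (string, bool) {
	vs := s.versions[key]
	for i := len(vs) - 1; i >= 0; i-- {
		if vs[i].revision <= revision {
			return vs[i].value, true
		}
	}
	return "", false
}

func main() {
	s := newStore()
	r1 := s.write("doc:1#viewer", "alice")
	r2 := s.write("doc:1#viewer", "bob")

	// A read pinned to the older revision still sees the older value.
	oldValue, _ := s.readAt("doc:1#viewer", r1)
	newValue, _ := s.readAt("doc:1#viewer", r2)
	fmt.Println(oldValue, newValue)
}
```

When the database exposes this kind of read natively, the service can serve slightly stale reads at a chosen revision. When it does not, version bookkeeping like the above has to be layered on top, which is the overhead mentioned here.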

Oct 26 '23 09:10 vroldanbet

Thanks @vroldanbet

YugabyteDB supports the same isolation levels as PostgreSQL. As for external consistency (or, I guess, linearizability), we support it for single-row transactions but not for multi-row transactions, which support three isolation levels: Serializable, Snapshot (also known as Repeatable Read), and Read Committed.

> Reaching consensus of N nodes distributed around the world will likely be very slow, but I'm not sure that's how YugaByte works. Not that Spanner and CockroachDB are blazing fast in this regard, but they are designed with that use-case in mind.

I think that is quite similar to CockroachDB or Spanner; it mostly depends on the latency between the nodes.

Oct 30 '23 08:10 shinji62

Thanks for the info @shinji62 🙏🏻

Oct 30 '23 09:10 vroldanbet

Would also like to see this supported. We were looking to use Authzed with Yugabyte as the database provider.

Jan 29 '24 20:01 jsco2t
