spicedb
spicedb copied to clipboard
Ensure the deterministic serialized expression for the caveats
Problem Statement
I just started to try the spiceDB (with PG as the datastore) in my project. While submitting the WriteSchema requests with the same caveat, I found that most of the time the spiceDB will insert new caveat records into the PG even if nothing has been changed regarding the definitions of the caveat in the requests.
After tracing the related implementations, I found that such the inconsistency is caused by the none-deterministic marshaling behavior of the vtprotobuf (as mentioned here in the issue: https://github.com/planetscale/vtprotobuf/issues/82).
Solution Brainstorm
Since the develop group of the vtprotobuf doesn't seem to be active about resolving the marshaling issue, may be we could utilize some simple script to make the adjustment on the generated results, for instance
https://github.com/benjamin99/spicedb/commit/e2040b70bacfe8f0e08ae4f228be9f4f49a34dfe
@benjamin99 We could in theory switch off of the VTMarshal for the caveat definitions to the standard, but strict, proto version.
Can you construct a test exercising this issue and add it to your branch?
Hi @josephschorr sorry for the late reply. Here is my quick adaptation from the original test cases to ensure that the serialization of the compiled caveats are deterministic. https://github.com/benjamin99/spicedb/commit/2acb1f89ffd49aebb59cf714381d675289b255d0. Plz let me know if there are some changes needed to be made 🙏
@benjamin99 Looks good. Can you also add a test into the caveat diff tests here: https://github.com/authzed/spicedb/blob/main/pkg/diff/caveats/diff_test.go
I think we should be able to fix this problem by just switching the caveat deserialization to either be strict, or to "renormalize" the caveats before we compare them, which would ensure that the diff lib does not report them as changed
@josephschorr just updated the test case for "no changes" in the diff package https://github.com/authzed/spicedb/commit/c7a6c4bd6cb2214b47c29fc99dae7fc5ef94be41
Yes, I did think about "renormalize"/"reconstruct" the caveat itself from the serialized expression before making the comparison. But after checking the whole process, I think it's kinda awkward to first translate the CompiledCaveats to the CaveatDefinitions in the translate function, and then try to reconstruct the cel info (which was contained by the CompiledCaveats in the first place) before making comparison in the caveat diff function. Base on this point of view, I think changing the behavior of the serializations to make it more strict/stable would be a more feasible choice?
@benjamin99 Actually, looking at the diff lib, we even have the diff named right now as being a maybe: https://github.com/authzed/spicedb/blob/f42617656e03a0f6c760f602f2df3906e4483664/pkg/diff/caveats/diff.go#L155
So we could, if the bytes are different, deserialize strict and compare; it would just make the diff a bit slower in that case
@josephschorr but base on my observation, it looks like the serialization is so inconsistent that it will get the different results from the same caveat like nine times out of ten. So if we deserialize the caveat definitions iff the there exists some byte difference in the serialization expression, it actually means that we will have to run the deserialization process for almost each and one of the comparison requests?
@benjamin99 Yeah, probably, which is why we didn't do so in the first place... but maybe its fast enough that it doesn't matter? We'd have to benchmark and see
Okey I'll try to give it a try, but it might take some days since it is now the holiday season for the lunar new year here in Taiwan.
@benjamin99 Happy Lunar New Year! (apologies if that's not the correct way to say so)
And no rush - I'll see if I can run some benchmarks as well
@benjamin99 Checking in