foundationdb icon indicating copy to clipboard operation
foundationdb copied to clipboard

[EXPERIMENTAL] Improved Go tuple library

Open Semisol opened this issue 3 months ago • 2 comments

This PR adds an experimental tuple decoder that is optimized for lower allocations and higher performance, support for fixed-length byte strings, along with fixing some panics.

This should be kept experimental until a way to handle the potentially breaking changes is found (possibly combining it with other breaking binding changes in a Go bindings "v2")

Over the original decoder, the new decoder attempts to zero-copy strings, and uses a new Boxed type which requires zero allocations compared to the any type which requires its contents are on the heap.
A neat hack relating to on-stack slice allocation is also used to remove allocations from append that arise from tuples while not having to estimate the length of the tuple for capacity.

The tests are also improved to ensure unpacking works correctly, and benchmarks are added.

This PR was closed and reopened due to separating this update to a separate branch

Benchmarking

To run benchmarks: go test -run=NONE -bench=^BenchmarkTupleUnpack

To-do list

  • [ ] Improve code documentation
  • [ ] Add fuzz tests that were used to discover panics that should have been errors when unpacking tuples
  • [ ] Handle the breaking change to google/uuid (we may be able to just alias tuple.UUID to it)
  • [ ] Make the V2 decoder the primary decoder, and remove the original decoder's code before merge.
  • [ ] Handle the fact that some users may expect Unpack to not depend on the input buffer. This could be "fixed" by copying the input buffer or by making a breaking v2 release (alongside possibly other binding improvements).
  • [ ] Add support for big.Int. The new decoder does not support this right now.
  • [ ] Add support for end-of-tuple type encoding and returning bytes after EOT
  • [ ] Consider adding support for 64-bit identifier type

Code-Reviewer Section

The general pull request guidelines can be found here.

Please check each of the following things and check all boxes before accepting a PR.

  • [x] The PR has a description, explaining both the problem and the solution.
  • [x] The description mentions which forms of testing were done and the testing seems reasonable.
  • [x] Every function/class/actor that was touched is reasonably well documented.

Semisol avatar Aug 28 '25 22:08 Semisol

Thanks for the PR. Curious what is breaking here? Is it the API or something else?

vishesh avatar Sep 04 '25 20:09 vishesh

Thanks for the PR. Curious what is breaking here? Is it the API or something else?

UnpackV2 (which will become Unpack) assumes that the input byte slice will not become deallocated (if it for example points to memory not allocated by Go) or changed (as it tries to zero-copy strings).

The other breaking change is to switch from the tuple.UUID type to google/uuid, as it seems to be the most common convention.

I am currently maintaining an internal fork of the Go bindings to see how it can be improved, as it has a lot of rough edges. I am intending to contribute back the results once it is more stable, ideally as one large break.

Semisol avatar Nov 28 '25 19:11 Semisol