
feat: add initial experimental support for blobs

Open westonpace opened this issue 1 year ago • 1 comments

Blobs are still an experimental feature. You can try them out by using the experimental writer. However, files created this way will not be supported in the future (you will not be able to open the blobs), so this is only recommended for experiments and proofs of concept.

Any column of type large_binary will be written as blobs. Blobs are written outside of the normal stream of data, which helps them avoid compaction. In this PR they are still part of the data file; in future PRs they will be in entirely separate files.

When a table with blobs is read back, the blob column will contain an array of "blob descriptions" (a struct array of path/position/size today; in the future it will be fragment id/blob file id/position/size).

A new dataset method, open_blobs, can be used to exchange an array of blob descriptions for an array of blob objects. These blob objects can be used as file objects.
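
To make the description-to-object exchange concrete, here is a minimal sketch in plain Python. It does not use the Lance API: `BlobDescription`, `open_blob`, and `demo` are hypothetical names made up for illustration, and the only thing taken from the text above is that a description carries a path, a position, and a size. It also reads each range eagerly, which a real blob object may well not do.

```python
import io
import os
import tempfile
from dataclasses import dataclass


@dataclass(frozen=True)
class BlobDescription:
    """Hypothetical stand-in for one row of the blob description struct array."""
    path: str      # file that holds the blob bytes
    position: int  # byte offset of the blob inside that file
    size: int      # blob length in bytes


def open_blob(desc: BlobDescription) -> io.BytesIO:
    """Return a file-like object over the byte range a description points at."""
    with open(desc.path, "rb") as f:
        f.seek(desc.position)
        data = f.read(desc.size)
    if len(data) != desc.size:
        raise ValueError(f"blob range out of bounds: {desc}")
    return io.BytesIO(data)


def demo() -> list:
    """Write two blobs back to back into one file, then read them via descriptions."""
    blobs = [b"hello", b"blob world"]
    descs = []
    with tempfile.NamedTemporaryFile(delete=False) as f:
        path = f.name
        for blob in blobs:
            # Record where this blob starts before appending its bytes.
            descs.append(BlobDescription(path, f.tell(), len(blob)))
            f.write(blob)
    try:
        return [open_blob(d).read() for d in descs]
    finally:
        os.remove(path)
```

The point of the shape is that the table only stores small fixed-size descriptions, while the bytes themselves are fetched on demand from wherever the description points.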

westonpace avatar Apr 19 '24 19:04 westonpace

Codecov Report

Attention: Patch coverage is 65.66456%, with 217 lines in your changes missing coverage. Please review.

Project coverage is 80.76%. Comparing base (b39e8e8) to head (a02f227).

| Files | Patch % | Lines |
|---|---|---|
| rust/lance/src/dataset/blob.rs | 63.59% | 117 Missing and 61 partials :warning: |
| rust/lance/src/dataset/write.rs | 57.14% | 20 Missing and 1 partial :warning: |
| rust/lance-table/src/format/fragment.rs | 61.11% | 6 Missing and 1 partial :warning: |
| rust/lance/src/dataset.rs | 88.88% | 3 Missing and 1 partial :warning: |
| rust/lance-core/src/datatypes/field.rs | 0.00% | 3 Missing :warning: |
| rust/lance-core/src/utils/testing.rs | 76.92% | 1 Missing and 2 partials :warning: |
| rust/lance/src/dataset/fragment/write.rs | 75.00% | 0 Missing and 1 partial :warning: |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2232      +/-   ##
==========================================
- Coverage   80.90%   80.76%   -0.15%     
==========================================
  Files         190      191       +1     
  Lines       55520    56144     +624     
  Branches    55520    56144     +624     
==========================================
+ Hits        44920    45345     +425     
- Misses       8086     8221     +135     
- Partials     2514     2578      +64     

| Flag | Coverage Δ |
|---|---|
| unittests | 80.76% <65.66%> (-0.15%) :arrow_down: |


codecov-commenter avatar Apr 19 '24 23:04 codecov-commenter

@wjones127 thanks for the initial review. This PR has undergone quite a bit of transformation since then. I've moved the blob writing out of the v2 writer and into a dedicated "blob writer". Also, blobs are no longer part of the file format but a part of the table format. I've added "blob files" as a field in fragments.

There is not yet support for cleaning up blobs. I think this will be quite tricky and would like to address it in a different PR.

The challenge will be knowing when blob data has become irrelevant. This is not too bad if entire fragments are deleted. However, partial fragment deletion is trickier and still a very common case. For example, if row 50 is deleted from a fragment and we eventually materialize that deletion, then we will need to be able to remove that row's range of data from the blob file.

Some possible approaches:

  • If a fragment compaction materializes any deletions then we write a trimmed blob file at this time
    • This is what I am leaning towards but it makes compaction slower if there are deletions (which will hopefully not be super common)
  • If a fragment compaction materializes any deletions then we write a "deleted ranges" file. We can later materialize these deleted ranges as part of "blob compaction" or "cleanup"
  • At blob cleanup time we do a full scan of the blob description columns (in all fragments that reference the blob, including older versions) to determine which ranges are no longer needed (slow cleanup process but no involvement at all in compaction)
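
A rough sketch of the first two approaches, in plain Python rather than the actual Lance code. The function names are hypothetical, descriptions are simplified to `(position, size)` tuples, and the second function assumes blobs are laid out in row order so that neighbouring deleted rows form contiguous ranges.

```python
def trim_blob_file(blob_data, descriptions, deleted_rows):
    """Approach 1: rewrite the blob file without the deleted rows' bytes.

    descriptions: list of (position, size) tuples, one per row.
    deleted_rows: set of row indices whose deletion is being materialized.
    Returns (new_blob_data, new_descriptions) for the surviving rows, in order.
    """
    out = bytearray()
    new_descriptions = []
    for row, (position, size) in enumerate(descriptions):
        if row in deleted_rows:
            continue  # drop this blob's bytes entirely
        # The surviving blob now starts at the current end of the new file.
        new_descriptions.append((len(out), size))
        out += blob_data[position:position + size]
    return bytes(out), new_descriptions


def deleted_ranges(descriptions, deleted_rows):
    """Approach 2: record half-open (start, end) byte ranges to reclaim later.

    Assumes blobs are stored in row order, so adjacent deleted rows can be
    merged into a single range.
    """
    ranges = []
    for row in sorted(deleted_rows):
        position, size = descriptions[row]
        if ranges and ranges[-1][1] == position:
            # Extend the previous range instead of starting a new one.
            ranges[-1] = (ranges[-1][0], position + size)
        else:
            ranges.append((position, position + size))
    return ranges
```

The trade-off is visible in the code: `trim_blob_file` copies every surviving byte during compaction, while `deleted_ranges` does almost no work up front and defers the copy to a later blob compaction or cleanup pass.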

westonpace avatar Apr 29 '24 14:04 westonpace

Since this PR does not address compaction / cleanup, I currently have blobs hidden behind an environment variable. Also, if we key off of "large_binary" to detect blobs then this will be a breaking change (since we currently support large_binary), so we probably want it hidden behind an environment variable (with a future warning) for a few releases.
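
For illustration, the gating could look something like the sketch below. This is not the PR's code: the variable name `EXAMPLE_EXPERIMENTAL_BLOBS` and the function `should_write_as_blob` are hypothetical, and the accepted truthy values are an assumption.

```python
import os
import warnings

# Hypothetical variable name; the PR does not say what the variable is called.
BLOB_ENV_VAR = "EXAMPLE_EXPERIMENTAL_BLOBS"


def should_write_as_blob(type_name, environ=None):
    """Decide whether a column of the given Arrow type name is written as blobs.

    Only large_binary columns are candidates. Without the opt-in variable they
    keep today's behaviour, and a FutureWarning announces the upcoming change.
    """
    if type_name != "large_binary":
        return False
    environ = os.environ if environ is None else environ
    if environ.get(BLOB_ENV_VAR, "").strip().lower() in {"1", "true", "yes"}:
        return True
    warnings.warn(
        "large_binary columns will be written as blobs in a future release; "
        f"set {BLOB_ENV_VAR}=1 to opt in now",
        FutureWarning,
        stacklevel=2,
    )
    return False
```

Passing `environ` explicitly keeps the check easy to test without touching the process environment.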

westonpace avatar Apr 29 '24 14:04 westonpace

I'm closing this for now. After some deliberation I think it would be better to do this feature once we have some kind of notion of a primary key.

westonpace avatar May 06 '24 14:05 westonpace