tantivy
tantivy copied to clipboard
iOS compilation failure due to zstd
I'm working on lockbook, a cross platform, secure note taking app. We recently found our build broken while updating our rust version and traced this behavior to zstd. This failure is described here: https://github.com/gyscos/zstd-rs/issues/337
I do see, however that zstd is an optional feature here
https://github.com/quickwit-oss/tantivy/blob/5379c99ea2a4639b659f640d9fd2c43030929161/Cargo.toml#L31
but is used normally here https://github.com/quickwit-oss/tantivy/blob/5379c99ea2a4639b659f640d9fd2c43030929161/sstable/Cargo.toml#L19
Is it the intention for zstd to be truly optional? If so, may I send a PR to make sstable reflect this? This would allow us to remove zstd from our dependency tree and proceed with our upgrade.
Or is it the case that this feature is truly not optional and does need to be solved with zstd itself?
Relevant cargo tree output copied from the zstd issue:
zstd v0.13.3
└── tantivy-sstable v0.3.0
└── tantivy-columnar v0.3.0
└── tantivy v0.22.0
└── lb-rs v0.9.22 (/Users/parth/Documents/lockbook/lockbook/libs/lb/lb-rs)
├── lb-c v0.9.22 (/Users/parth/Documents/lockbook/lockbook/libs/lb/lb-c)
│ └── workspace-ffi v0.9.22 (/Users/parth/Documents/lockbook/lockbook/libs/content/workspace-ffi)
└── workspace v0.9.22 (/Users/parth/Documents/lockbook/lockbook/libs/content/workspace)
└── workspace-ffi v0.9.22 (/Users/parth/Documents/lockbook/lockbook/libs/content/workspace-ffi)
I think zstd could be optional in sstable and should be straightforward to add in a PR. cc @trinity-1686a
sstable's codec is written in such a way that compression is already optional (enabled/disabled based on if it's helping or not with size). It should be rather easy to disable zstd entirely from sstables. Obviously sstables written with zstd enabled would not always be readable by a software built without zstd, but a software with zstd support would be able to read both and wouldn't notice a thing.
https://github.com/quickwit-oss/tantivy/blob/main/sstable/src/delta.rs#L60 is the 1st place you'll see zstd, if you make if block_len > 2048 { into an unconditional false, no zstd code can be reached at encoding time.
https://github.com/quickwit-oss/tantivy/blob/main/sstable/src/block_reader.rs#L88 is the 2nd place you'll see some zstd, making that entire branch return an error (while style verifying if what you get is a compressed block) seems the way to go
Thanks for the insight @trinity-1686a, starting to whip up a PR but want to clarify:
Obviously sstables written with zstd enabled would not always be readable by a software built without zstd
Does this interfere with the guarantee expressed in the changelog here? As part of my PR should I remove the compatibility guarantee and add more details about the change?
That should be okay, if you keep the the serialized format compatible.
i guess this is an odd position. Currently the support for zstd is disabled by default in tantivy, but mandatory in sstable on which tantivy depends. Piggybacking on tantivy feature flag would break things for people not enabling zstd (their old index wouldn't be entirely readable without enabling that feature), but having multiple feature flags is going to confuse people, and rightly so. So we probably need to either put a disclaimer in the patchnote, or enable zstd-compression by default in tantivy
Thanks for the guidance @PSeitz / @trinity-1686a. Looking forward to your feedback ^
Thank you for merging, @PSeitz!
Do you know roughly when we can expect a point release for this change?
Hey guys, last checkin here @PSeitz / @trinity-1686a on when the next expected release is before we explore pointing our repository to our own fork via git =.
The next release is not yet planned, you could just point to the current commit in the main branch in tantivy.
@PSeitz we can't do that for a few reasons:
- tantivy-columnar and other workspace dependencies also need to be published
- we wouldn't be able to publish our crate with
git =
I believe if you guys don't publish our only course of action would be to fork all workspace members and also publish them to crates.
Given how far apart some of the versions on crates are I'm having trouble determining if it's on the order of weeks that we're waiting or months. Any indication here would be very useful for us.
Can you make an PR to update Changelog.md for the new version with all the changes? This is usually what costs most of the time for a new release
I use git cliff as a helper usually (it doesn't include everything sometimes though)
Hey @PSeitz,
grabbed all the commits via git cliff and formatted them based on prior changelog entries, lmk if there's any problems and I'll hop on them.
Hopefully this can enable a ~~0.24.2~~ 0.25.0 release as soon as possible.
Hi @PSeitz,
is bumping version (like this: https://github.com/quickwit-oss/tantivy/pull/2625/files) the next blocker for release?
lmk if you'd like me to send you a PR
for anyone following: I see tantivy 0.24.2 was released, which is close to what we need, but sstable & columnar were not released. For whatever reason I'm not able to patch.crates-io for these dependencies, no matter what I try.
So our last resort if we want to keep using tantivy would be to publish our own copies to crates.
0.24.2 was a bugfix release. The fix for iOS is a breaking release and will be part of 0.25.0
The bugfix had priority over 0.25, but I'll try to make a tantivy release this week, which will include the fixed iOS compilation
tantivy 0.25.0 has been released