lance
lance copied to clipboard
feat: object store registry for custom object store providers
This change lays the groundwork for LanceDB to support custom implementations of object stores, selected by a custom URL scheme.
Ref: https://discord.com/channels/1030247538198061086/1197630564254101618/1208915798425477181
ACTION NEEDED Lance follows the Conventional Commits specification for release automation.
The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification.
For details on the error please inspect the "PR Title Check" action.
Thanks for making a PR. I'm in favor of making the registry extensible. I think a good property of an extensible registry is that the default values are able to be inserted using the extension point. That ensures that third-parties have all the APIs they need to properly extend Lance. I don't think you have to do that refactor immediately in this PR. Instead, I've made some suggestions about some tweaks we would likely need to make to be able to use the registry for our default stores.
Thanks for your comments, @wjones127 !
One thought I had was that the new_store trait member should also be accompanied by a new_commit_handler trait member, which would help make the changes to lance-table neater. However, the CommitHandler trait is defined in lance-table, and the ObjectStoreRegistry is defined in lance-io, so something would have to move (probably CommitHandler to lance-io). Is this an OK move to make as part of this change?
Codecov Report
Attention: Patch coverage is 56.04396% with 40 lines in your changes missing coverage. Please review.
Project coverage is 79.41%. Comparing base (
9535186) to head (8264b10).
Additional details and impacted files
@@ Coverage Diff @@
## main #2513 +/- ##
==========================================
- Coverage 79.44% 79.41% -0.04%
==========================================
Files 221 221
Lines 64683 64744 +61
Branches 64683 64744 +61
==========================================
+ Hits 51388 51415 +27
- Misses 10335 10366 +31
- Partials 2960 2963 +3
| Flag | Coverage Δ | |
|---|---|---|
| unittests | 79.41% <56.04%> (-0.04%) |
:arrow_down: |
Flags with carried forward coverage won't be shown. Click here to find out more.
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
However, the CommitHandler trait is defined in lance-table, and the ObjectStoreRegistry is defined in lance-io, so something would have to move (probably CommitHandler to lance-io).
Given CommitHandler is a table-level concept, I don't think it should move into the low-level IO library. I think then I would prefer ObjectStoreRegistry be pulled out into lance or lance-table.
There is a commit_handler_from_url function that maybe should be put on the registry.
Given
CommitHandleris a table-level concept, I don't think it should move into the low-level IO library. I think then I would preferObjectStoreRegistrybe pulled out intolanceorlance-table.There is a
commit_handler_from_urlfunction that maybe should be put on the registry.
Would you like me to do this as part of this MR or would you like me to follow up in another MR?
edit: personally I'd be more comfortable doing it in a follow-up MR because I think it would have a big knock-on how the default object stores are constructed; I think it would need moving ObjectStore::from_uri_and_params / ObjectStore::new_from_url out of lance_io and into lance
Given
CommitHandleris a table-level concept, I don't think it should move into the low-level IO library. I think then I would preferObjectStoreRegistrybe pulled out intolanceorlance-table. There is acommit_handler_from_urlfunction that maybe should be put on the registry.Would you like me to do this as part of this MR or would you like me to follow up in another MR?
edit: personally I'd be more comfortable doing it in a follow-up MR because I think it would have a big knock-on how the default object stores are constructed; I think it would need moving
ObjectStore::from_uri_and_params/ObjectStore::new_from_urlout oflance_ioand intolance
You can do that in a follow up PR.
Given
CommitHandleris a table-level concept, I don't think it should move into the low-level IO library. I think then I would preferObjectStoreRegistrybe pulled out intolanceorlance-table. There is acommit_handler_from_urlfunction that maybe should be put on the registry.Would you like me to do this as part of this MR or would you like me to follow up in another MR? edit: personally I'd be more comfortable doing it in a follow-up MR because I think it would have a big knock-on how the default object stores are constructed; I think it would need moving
ObjectStore::from_uri_and_params/ObjectStore::new_from_urlout oflance_ioand intolanceYou can do that in a follow up PR.
Awesome. Is there anything else that needs to be taken care of in this MR?