chroma icon indicating copy to clipboard operation
chroma copied to clipboard

[BUG]: Deleting unloaded collection

Open tazarov opened this issue 10 months ago • 10 comments

Ref: #1309

Description of changes

Summarize the changes made by this PR.

  • Improvements & Bug fixes
    • Fixes an issue where collection data dir is only removed if the collection is first loaded into memory. We now use hint_use_collection to load the collection's segments before removal.

Test plan

How are these changes tested?

  • [x] Tests pass locally with pytest for python

Documentation Changes

N/A

tazarov avatar Oct 30 '23 20:10 tazarov

Reviewer Checklist

Please leverage this checklist to ensure your code review is thorough before approving

Testing, Bugs, Errors, Logs, Documentation

  • [ ] Can you think of any use case in which the code does not behave as intended? Have they been tested?
  • [ ] Can you think of any inputs or external events that could break the code? Is user input validated and safe? Have they been tested?
  • [ ] If appropriate, are there adequate property based tests?
  • [ ] If appropriate, are there adequate unit tests?
  • [ ] Should any logging, debugging, tracing information be added or removed?
  • [ ] Are error messages user-friendly?
  • [ ] Have all documentation changes needed been made?
  • [ ] Have all non-obvious changes been commented?

System Compatibility

  • [ ] Are there any potential impacts on other parts of the system or backward compatibility?
  • [ ] Does this change intersect with any items on our roadmap, and if so, is there a plan for fitting them together?

Quality

  • [ ] Is this code of a unexpectedly high quality (Readbility, Modularity, Intuitiveness)

github-actions[bot] avatar Oct 30 '23 20:10 github-actions[bot]

OMG, so many checks 😅

Nayjest avatar Oct 31 '23 18:10 Nayjest

@Nayjest we take code quality very seriously! I think it has to be that way for shipping mission-critical software in open-source.

jeffchuber avatar Nov 01 '23 14:11 jeffchuber

We could do this by changing local_persistent_hnsw to expose a delete that doesn't need to load the segment.

HammadB avatar Nov 01 '23 18:11 HammadB

@HammadB, some random grpc server failure, can you rerun the test and if it fails again I'll keep digging.

tazarov avatar Nov 02 '23 09:11 tazarov

Discussed offline: we can statically get_storage_folder and then delete if not loaded in the local segment_manager

HammadB avatar Nov 29 '23 18:11 HammadB

@HammadB, I think this can be merged now.

tazarov avatar Jan 25 '24 12:01 tazarov

Thanks! I hope that will help, coz I have now alomost terabyte DB on production that works crazily slow and actually stores less than 1KB of vectorized text ))

Nayjest avatar Jan 25 '24 14:01 Nayjest

@HammadB since you approved, can this now be merged?

prevosti avatar Apr 13 '24 22:04 prevosti

Can this issue be resolved please?

rv3183 avatar Apr 20 '24 11:04 rv3183