[Bug]: PersistentClient.delete_collection does not delete persistent folders after the uvicorn server is restarted
What happened?
I have a FastAPI app that checks a config file for changes to its knowledge bases whenever the server restarts. For configurations that have been removed, I want to delete the corresponding collections as well. However, the ./chroma/{uuid} folders are not deleted when delete_collection is called after the uvicorn server has been restarted. I wrote a simple demo to reproduce the problem, as discussed in #1245.
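For context, the startup check amounts to a set difference between the collections that exist and the ones still listed in the config. The sketch below is illustrative only: `removed_collections` and both argument names are hypothetical and not taken from the actual app.

```python
def removed_collections(existing: list[str], configured: list[str]) -> list[str]:
    """Return the names that exist in the store but are no longer configured."""
    # Hypothetical helper: both lists hold plain collection names.
    return sorted(set(existing) - set(configured))


if __name__ == "__main__":
    # "fruit" was dropped from the config, so it is the one to delete.
    print(removed_collections(["fruit", "vegetables"], ["vegetables"]))
```

Each returned name would then be passed to `client.delete_collection(name)`, which is the call that leaves the folder behind in the demo.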
Test demo:
import os

import chromadb
from chromadb.config import Settings
from fastapi import FastAPI

print("chromadb.__version__", chromadb.__version__)


def get_folder_size(start_path: str) -> float:
    """Return the total size of all files under start_path, in megabytes."""
    total_size = 0
    for dirpath, dirnames, filenames in os.walk(start_path):
        for f in filenames:
            fp = os.path.join(dirpath, f)
            # skip if it is symbolic link
            if not os.path.islink(fp):
                total_size += os.path.getsize(fp)
    return total_size / (1024 * 1024)  # convert bytes to megabytes


app = FastAPI()


@app.get("/upsert")
async def upsert():
    """create a collection and upsert docs"""
    script_entry = get_folder_size("./chroma")
    print("on script run", script_entry)
    client = chromadb.PersistentClient(settings=Settings(allow_reset=True))
    client.reset()
    after_reset = get_folder_size("./chroma")
    print("after_reset", after_reset)
    collection = client.get_or_create_collection("fruit")
    collection.upsert(
        documents=["apples", "oranges", "bananas", "pineapples"], ids=["1", "2", "3", "4"]
    )
    # print(collection.query(query_texts=["hawaii"], n_results=1))
    # get the size of the folder called ./chroma
    before_size = get_folder_size("./chroma")
    print("before", before_size)


@app.get("/delete")
async def delete():
    """delete a collection"""
    client = chromadb.PersistentClient(settings=Settings(allow_reset=True))
    client.delete_collection("fruit")
    after_size = get_folder_size("./chroma")
    print("after", after_size)
    # difference
    # print("diff", before_size - after_size)
    client.reset()
    after_reset_end = get_folder_size("./chroma")
    print("after_reset end", after_reset_end)
I started the server with uvicorn main:app --port 8901 --host 0.0.0.0 and called localhost:8901/upsert. The collection was created successfully, and the size of ./chroma grew from 0.140625 MB to 1.7428932189941406 MB:
chromadb.__version__ 0.4.15
INFO: Started server process [7336]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8901 (Press CTRL+C to quit)
on script run 0.0
after_reset 0.140625
before 1.7428932189941406
INFO: 127.0.0.1:55650 - "GET /upsert HTTP/1.1" 200 OK
Then I shut the server down with Ctrl+C and restarted it:
INFO: Shutting down
INFO: Waiting for application shutdown.
INFO: Application shutdown complete.
INFO: Finished server process [7336]
(chroma) E:\PycharmProjects\Test>uvicorn main:app --port 8901 --host 0.0.0.0
chromadb.__version__ 0.4.15
INFO: Started server process [9124]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8901 (Press CTRL+C to quit)
Finally, I called localhost:8901/delete and found that the size of ./chroma did not change (still 1.7428932189941406 MB): neither delete_collection nor the subsequent reset() removed the folder of the fruit collection:
after 1.7428932189941406
after_reset end 1.7428932189941406
INFO: 127.0.0.1:56013 - "GET /delete HTTP/1.1" 200 OK
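Folder size alone does not show which folder survived. A minimal diagnostic sketch, assuming the per-collection folders directly under ./chroma are named by UUID as described above, would list them before and after the delete call. `list_uuid_dirs` is a hypothetical helper that uses only the standard library and is not part of chromadb.

```python
import os
import uuid


def list_uuid_dirs(persist_dir: str) -> list[str]:
    """Return sorted names of subdirectories of persist_dir that parse as UUIDs."""
    if not os.path.isdir(persist_dir):
        return []
    found = []
    for name in os.listdir(persist_dir):
        # Only directories are of interest; skip files such as the SQLite database.
        if not os.path.isdir(os.path.join(persist_dir, name)):
            continue
        try:
            uuid.UUID(name)  # raises ValueError when the name is not a UUID
        except ValueError:
            continue
        found.append(name)
    return sorted(found)


if __name__ == "__main__":
    # Assumes the default ./chroma persist directory used in the demo.
    print("uuid folders:", list_uuid_dirs("./chroma"))
```

Calling it right before and right after `client.delete_collection("fruit")` and comparing the two lists would show whether any folder was actually removed.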
Could anyone help me solve this or provide alternative solutions?
Versions
chromadb 0.4.15, Python 3.10, Windows 10
Relevant log output
No response