chroma
[Feature Request]: Configure path for `.chroma/index` for DuckDB when used without persistence
Describe the problem
I'm running an app on AWS Lambda with Chroma and want to use DuckDB without persistence. Lambda does not allow writes to any directory other than /tmp, so creating `.chroma` under the working directory fails. The failure stack trace is below.
[ERROR] 2023-05-07T14:21:19.196Z f701e6be-197b-4134-91df-e64c75502077 Exception in 'http' protocol.
Traceback (most recent call last):
File "/tmp/sls-py-req/mangum/protocols/http.py", line 97, in run
await app(self.scope, self.receive, self.send)
File "/tmp/sls-py-req/fastapi/applications.py", line 270, in __call__
await super().__call__(scope, receive, send)
File "/tmp/sls-py-req/starlette/applications.py", line 124, in __call__
await self.middleware_stack(scope, receive, send)
File "/tmp/sls-py-req/starlette/middleware/errors.py", line 184, in __call__
raise exc
File "/tmp/sls-py-req/starlette/middleware/errors.py", line 162, in __call__
await self.app(scope, receive, _send)
File "/tmp/sls-py-req/starlette/middleware/exceptions.py", line 75, in __call__
raise exc
File "/tmp/sls-py-req/starlette/middleware/exceptions.py", line 64, in __call__
await self.app(scope, receive, sender)
File "/tmp/sls-py-req/fastapi/middleware/asyncexitstack.py", line 21, in __call__
raise e
File "/tmp/sls-py-req/fastapi/middleware/asyncexitstack.py", line 18, in __call__
await self.app(scope, receive, send)
File "/tmp/sls-py-req/starlette/routing.py", line 680, in __call__
await route.handle(scope, receive, send)
File "/tmp/sls-py-req/starlette/routing.py", line 275, in handle
await self.app(scope, receive, send)
File "/tmp/sls-py-req/starlette/routing.py", line 65, in app
response = await func(request)
File "/tmp/sls-py-req/fastapi/routing.py", line 231, in app
raw_response = await run_endpoint_function(
File "/tmp/sls-py-req/fastapi/routing.py", line 162, in run_endpoint_function
return await run_in_threadpool(dependant.call, **values)
File "/tmp/sls-py-req/starlette/concurrency.py", line 41, in run_in_threadpool
return await anyio.to_thread.run_sync(func, *args)
File "/tmp/sls-py-req/anyio/to_thread.py", line 31, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "/tmp/sls-py-req/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
return await future
File "/tmp/sls-py-req/anyio/_backends/_asyncio.py", line 867, in run
result = context.run(func, *args)
File "/var/task/src/main.py", line 69, in comprehendPolicy
index = getQueryIndex(saveSiteText(policy.policy_link))
File "/var/task/src/main.py", line 58, in getQueryIndex
return VectorstoreIndexCreator().from_loaders([loader])
File "/tmp/sls-py-req/langchain/indexes/vectorstore.py", line 73, in from_loaders
return self.from_documents(docs)
File "/tmp/sls-py-req/langchain/indexes/vectorstore.py", line 78, in from_documents
vectorstore = self.vectorstore_cls.from_documents(
File "/tmp/sls-py-req/langchain/vectorstores/chroma.py", line 412, in from_documents
return cls.from_texts(
File "/tmp/sls-py-req/langchain/vectorstores/chroma.py", line 380, in from_texts
chroma_collection.add_texts(texts=texts, metadatas=metadatas, ids=ids)
File "/tmp/sls-py-req/langchain/vectorstores/chroma.py", line 159, in add_texts
self._collection.add(
File "/tmp/sls-py-req/chromadb/api/models/Collection.py", line 111, in add
self._client._add(ids, self.name, embeddings, metadatas, documents, increment_index)
File "/tmp/sls-py-req/chromadb/api/local.py", line 140, in _add
self._db.add_incremental(collection_uuid, added_uuids, embeddings)
File "/tmp/sls-py-req/chromadb/db/clickhouse.py", line 542, in add_incremental
index.add(uuids, embeddings)
File "/tmp/sls-py-req/chromadb/db/index/hnswlib.py", line 124, in add
self._init_index(dim)
File "/tmp/sls-py-req/chromadb/db/index/hnswlib.py", line 107, in _init_index
self._save()
File "/tmp/sls-py-req/chromadb/db/index/hnswlib.py", line 178, in _save
os.makedirs(f"{self._save_folder}")
File "/var/lang/lib/python3.8/os.py", line 213, in makedirs
makedirs(head, exist_ok=exist_ok)
File "/var/lang/lib/python3.8/os.py", line 223, in makedirs
mkdir(name, mode)
OSError: [Errno 30] Read-only file system: '.chroma'
Describe the proposed solution
Could we get a way to specify the base directory in which `.chroma` is created?
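To make the request concrete, here is a minimal sketch of the behaviour I have in mind. `resolve_index_dir` and its `base_dir` parameter are hypothetical names used for illustration, not part of Chroma's API, and the sketch assumes that falling back to the system temp directory is acceptable when no base directory is given.

```python
import os
import tempfile
from typing import Optional


def resolve_index_dir(base_dir: Optional[str] = None) -> str:
    """Hypothetical helper: return a writable directory for the index files.

    If base_dir is None, fall back to the system temp directory
    (normally /tmp on Linux) instead of the current working directory.
    """
    root = base_dir if base_dir is not None else tempfile.gettempdir()
    index_dir = os.path.join(root, ".chroma", "index")
    # exist_ok=True keeps repeated calls (e.g. warm invocations) from failing.
    os.makedirs(index_dir, exist_ok=True)
    return index_dir
```

With something like this, a Lambda deployment could pass `/tmp` (or nothing at all) and the index folder would never be created under the read-only task directory.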
Alternatives considered
As a workaround, I've switched to using Chroma with persistence (setting persist_directory to /tmp), which avoids creating .chroma in the working directory, and the Lambda function now works. However, I don't need persistence for my use case, and the extra I/O may add unnecessary overhead.
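One more possible stopgap, untested on Lambda: the path in the error is relative (`.chroma`), so it is presumably resolved against the current working directory, and changing into /tmp before building the index might also avoid the error. The sketch below uses only the standard library. `working_directory` is a hypothetical helper name, not something Chroma or LangChain provides.

```python
import contextlib
import os


@contextlib.contextmanager
def working_directory(path):
    """Temporarily change the process working directory to `path`."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        # Always restore the original directory, even if the body raises.
        os.chdir(previous)
```

The index would then be built inside `with working_directory("/tmp"):`. Note that `os.chdir` changes the working directory for the whole process, and the stack trace shows the endpoint running in a worker thread, so concurrent requests could interfere with each other. That is another reason a configurable path would be preferable.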
Importance
would make my life easier
Additional Information
No response