Chroma.fromExistingCollection shows "collection exists" error on server side
I just created a test app and I used Chroma.fromDocuments to index some documents. I'm running a Chroma server in Docker, using the docker-compose setup in the Chroma repository.
Now I'm calling Chroma.fromExistingCollection on subsequent runs, and I notice that the Chroma server shows an error each time. The complete error output is included below. The gist seems to be that the collection already exists -- which I would expect when I call fromExistingCollection!
I should add that the test query I'm running works fine, so I assume I'm generally doing things the right way. For reference, here's the code I'm using, reduced to the parts that seem relevant. I hope somebody can tell me that I'm simply using this wrong.
Here I initialize from documents:
const loader = new DirectoryLoader( ... );
const docs = await loader.load();
const vectorStore = await Chroma.fromDocuments(
  docs,
  new OpenAIEmbeddings(),
  {
    collectionName: 'oli-test',
    url: 'http://localhost:8000'
  }
);
const response = await vectorStore.similaritySearch('convertmd', 2);
Here I use fromExistingCollection:
const vectorStore = await Chroma.fromExistingCollection(
  new OpenAIEmbeddings(),
  {
    collectionName: 'oli-test',
    url: 'http://localhost:8000'
  }
);
const response = await vectorStore.similaritySearch('convertmd', 2);
This is the console output from the server.
2023-04-12 20:18:09 chroma-server-1 | 2023-04-12 19:18:09 ERROR chromadb.server.fastapi Collection with name oli-test already exists
2023-04-12 20:18:09 chroma-server-1 | Traceback (most recent call last):
2023-04-12 20:18:09 chroma-server-1 | File "/usr/local/lib/python3.10/site-packages/anyio/streams/memory.py", line 94, in receive
2023-04-12 20:18:09 chroma-server-1 | return self.receive_nowait()
2023-04-12 20:18:09 chroma-server-1 | File "/usr/local/lib/python3.10/site-packages/anyio/streams/memory.py", line 89, in receive_nowait
2023-04-12 20:18:09 chroma-server-1 | raise WouldBlock
2023-04-12 20:18:09 chroma-server-1 | anyio.WouldBlock
2023-04-12 20:18:09 chroma-server-1 |
2023-04-12 20:18:09 chroma-server-1 | During handling of the above exception, another exception occurred:
2023-04-12 20:18:09 chroma-server-1 |
2023-04-12 20:18:09 chroma-server-1 | Traceback (most recent call last):
2023-04-12 20:18:09 chroma-server-1 | File "/usr/local/lib/python3.10/site-packages/starlette/middleware/base.py", line 43, in call_next
2023-04-12 20:18:09 chroma-server-1 | message = await recv_stream.receive()
2023-04-12 20:18:09 chroma-server-1 | File "/usr/local/lib/python3.10/site-packages/anyio/streams/memory.py", line 114, in receive
2023-04-12 20:18:09 chroma-server-1 | raise EndOfStream
2023-04-12 20:18:09 chroma-server-1 | anyio.EndOfStream
2023-04-12 20:18:09 chroma-server-1 |
2023-04-12 20:18:09 chroma-server-1 | During handling of the above exception, another exception occurred:
2023-04-12 20:18:09 chroma-server-1 |
2023-04-12 20:18:09 chroma-server-1 | Traceback (most recent call last):
2023-04-12 20:18:09 chroma-server-1 | File "/chroma/./chromadb/server/fastapi/__init__.py", line 47, in catch_exceptions_middleware
2023-04-12 20:18:09 chroma-server-1 | return await call_next(request)
2023-04-12 20:18:09 chroma-server-1 | File "/usr/local/lib/python3.10/site-packages/starlette/middleware/base.py", line 46, in call_next
2023-04-12 20:18:09 chroma-server-1 | raise app_exc
2023-04-12 20:18:09 chroma-server-1 | File "/usr/local/lib/python3.10/site-packages/starlette/middleware/base.py", line 36, in coro
2023-04-12 20:18:09 chroma-server-1 | await self.app(scope, request.receive, send_stream.send)
2023-04-12 20:18:09 chroma-server-1 | File "/usr/local/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 75, in __call__
2023-04-12 20:18:09 chroma-server-1 | raise exc
2023-04-12 20:18:09 chroma-server-1 | File "/usr/local/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 64, in __call__
2023-04-12 20:18:09 chroma-server-1 | await self.app(scope, receive, sender)
2023-04-12 20:18:09 chroma-server-1 | File "/usr/local/lib/python3.10/site-packages/fastapi/middleware/asyncexitstack.py", line 21, in __call__
2023-04-12 20:18:09 chroma-server-1 | raise e
2023-04-12 20:18:09 chroma-server-1 | File "/usr/local/lib/python3.10/site-packages/fastapi/middleware/asyncexitstack.py", line 18, in __call__
2023-04-12 20:18:09 chroma-server-1 | await self.app(scope, receive, send)
2023-04-12 20:18:09 chroma-server-1 | File "/usr/local/lib/python3.10/site-packages/starlette/routing.py", line 680, in __call__
2023-04-12 20:18:09 chroma-server-1 | await route.handle(scope, receive, send)
2023-04-12 20:18:09 chroma-server-1 | File "/usr/local/lib/python3.10/site-packages/starlette/routing.py", line 275, in handle
2023-04-12 20:18:09 chroma-server-1 | await self.app(scope, receive, send)
2023-04-12 20:18:09 chroma-server-1 | File "/usr/local/lib/python3.10/site-packages/starlette/routing.py", line 65, in app
2023-04-12 20:18:09 chroma-server-1 | response = await func(request)
2023-04-12 20:18:09 chroma-server-1 | File "/usr/local/lib/python3.10/site-packages/fastapi/routing.py", line 231, in app
2023-04-12 20:18:09 chroma-server-1 | raw_response = await run_endpoint_function(
2023-04-12 20:18:09 chroma-server-1 | File "/usr/local/lib/python3.10/site-packages/fastapi/routing.py", line 162, in run_endpoint_function
2023-04-12 20:18:09 chroma-server-1 | return await run_in_threadpool(dependant.call, **values)
2023-04-12 20:18:09 chroma-server-1 | File "/usr/local/lib/python3.10/site-packages/starlette/concurrency.py", line 41, in run_in_threadpool
2023-04-12 20:18:09 chroma-server-1 | return await anyio.to_thread.run_sync(func, *args)
2023-04-12 20:18:09 chroma-server-1 | File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 31, in run_sync
2023-04-12 20:18:09 chroma-server-1 | return await get_asynclib().run_sync_in_worker_thread(
2023-04-12 20:18:09 chroma-server-1 | File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
2023-04-12 20:18:09 chroma-server-1 | return await future
2023-04-12 20:18:09 chroma-server-1 | File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 867, in run
2023-04-12 20:18:09 chroma-server-1 | result = context.run(func, *args)
2023-04-12 20:18:09 chroma-server-1 | File "/chroma/./chromadb/server/fastapi/__init__.py", line 137, in create_collection
2023-04-12 20:18:09 chroma-server-1 | return self._api.create_collection(
2023-04-12 20:18:09 chroma-server-1 | File "/chroma/./chromadb/api/local.py", line 69, in create_collection
2023-04-12 20:18:09 chroma-server-1 | res = self._db.create_collection(name, metadata, get_or_create)
2023-04-12 20:18:09 chroma-server-1 | File "/chroma/./chromadb/db/clickhouse.py", line 151, in create_collection
2023-04-12 20:18:09 chroma-server-1 | raise ValueError(f"Collection with name {name} already exists")
2023-04-12 20:18:09 chroma-server-1 | ValueError: Collection with name oli-test already exists
2023-04-12 20:18:09 chroma-server-1 | 2023-04-12 19:18:09 INFO uvicorn.access 172.27.0.1:49886 - "POST /api/v1/collections HTTP/1.1" 500
2023-04-12 20:18:10 chroma-server-1 | 2023-04-12 19:18:10 DEBUG chromadb.db.index.hnswlib time to pre process our knn query: 1.1920928955078125e-06
2023-04-12 20:18:10 chroma-server-1 | 2023-04-12 19:18:10 DEBUG chromadb.db.index.hnswlib time to run knn query: 0.00010442733764648438
2023-04-12 20:18:10 chroma-server-1 | 2023-04-12 19:18:10 INFO uvicorn.access 172.27.0.1:49896 - "POST /api/v1/collections/oli-test/query HTTP/1.1" 200
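Reading the traceback above, the failing call is create_collection, which is passed a get_or_create argument and raises the ValueError when the name is already taken. Below is a minimal sketch of that behaviour, assuming the error is only raised when get_or_create is false (I have not verified this against the Chroma source). CollectionStore and its createCollection method are hypothetical stand-ins written for illustration, not Chroma or LangChain APIs.

```javascript
// Hypothetical in-memory stand-in for a collection store; not Chroma's actual code.
class CollectionStore {
  constructor() {
    this.collections = new Map();
  }

  // Mirrors the argument order seen in the traceback: (name, metadata, get_or_create).
  createCollection(name, metadata = null, getOrCreate = false) {
    const existing = this.collections.get(name);
    if (existing !== undefined) {
      if (getOrCreate) {
        return existing; // assumed behaviour: reuse the collection instead of failing
      }
      throw new Error(`Collection with name ${name} already exists`);
    }
    const created = { name, metadata };
    this.collections.set(name, created);
    return created;
  }
}

const store = new CollectionStore();
const first = store.createCollection("oli-test");

// A second plain create with the same name fails, as in the server log.
let strictError = null;
try {
  store.createCollection("oli-test");
} catch (err) {
  strictError = err.message;
}

// With getOrCreate set, the existing collection is returned instead.
const reused = store.createCollection("oli-test", null, true);
console.log(strictError);
console.log(reused === first);
```

If this reading is right, the error would mean the client requested a plain create rather than a get-or-create, and the query that follows still succeeds because the collection is already there.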
I'm having the same error. I reported the issue in the mayo PDF sample repository, because I thought it was specific to that implementation. (https://github.com/mayooear/gpt4-pdf-chatbot-langchain/discussions/217)
Did you make any progress?
This was a test implementation and I ignored the error for now. No other progress.
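In case it helps anyone decide whether the error is safe to ignore, here is a small sketch that pulls the uvicorn access entries out of the server output to confirm which request fails. The sample lines are copied from the log above. The regular expression and the parseAccessLines helper are my own assumptions about the line format, not part of any library.

```javascript
// Sample lines copied from the server output in the original report.
const logLines = [
  '2023-04-12 20:18:09 chroma-server-1 | 2023-04-12 19:18:09 INFO uvicorn.access 172.27.0.1:49886 - "POST /api/v1/collections HTTP/1.1" 500',
  '2023-04-12 20:18:10 chroma-server-1 | 2023-04-12 19:18:10 DEBUG chromadb.db.index.hnswlib time to run knn query: 0.00010442733764648438',
  '2023-04-12 20:18:10 chroma-server-1 | 2023-04-12 19:18:10 INFO uvicorn.access 172.27.0.1:49896 - "POST /api/v1/collections/oli-test/query HTTP/1.1" 200',
];

// Assumed shape: uvicorn.access <client> - "<METHOD> <path> HTTP/x.y" <status>
const ACCESS_PATTERN = /uvicorn\.access \S+ - "(\S+) (\S+) HTTP\/[\d.]+" (\d{3})/;

// Hypothetical helper: returns one { method, path, status } object per access line.
function parseAccessLines(lines) {
  const requests = [];
  for (const line of lines) {
    const match = ACCESS_PATTERN.exec(line);
    if (match) {
      requests.push({ method: match[1], path: match[2], status: Number(match[3]) });
    }
  }
  return requests;
}

const requests = parseAccessLines(logLines);
// Keep only server-side failures (HTTP 5xx).
const failed = requests.filter((r) => r.status >= 500);
console.log(failed);
```

On the lines from this report, only the POST to /api/v1/collections comes back with a 500; the query request returns 200.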
Hi, @oliversturm! I'm here to help the LangChain team manage their backlog and I wanted to let you know that we are marking this issue as stale.
From what I understand, you reported an issue where calling Chroma.fromExistingCollection results in an error stating that the collection already exists, which is unexpected. Another user, @0x6f677548, also reported the same error in a different repository. You mentioned that this was a test implementation and no further progress has been made.
Could you please let us know if this issue is still relevant to the latest version of the LangChain repository? If it is, please comment on this issue to let the LangChain team know. Otherwise, feel free to close the issue yourself or it will be automatically closed in 7 days.
Thank you for your understanding and contribution to the LangChain project!