chroma
Chromadb getting locked.
What happened?
I have indexed 2,101,951 chunks into ChromaDB. Each chunk is 512 tokens. Embeddings used: tiktoken
Below is the code used for indexing. Only one process does the indexing, and it runs continuously with the code below.
def get_or_create_collection(self, client, collection_name):
    """
    Attempts to retrieve a collection. If it does not exist, creates a new one.
    :param client: The database client instance.
    :param collection_name: The name of the collection to retrieve or create.
    :return: The retrieved or newly created collection.
    :raises: Re-raises unexpected exceptions.
    """
    try:
        return client.get_collection(collection_name)
    except Exception as e:
        if "Collection" in str(e) and "does not exist" in str(e):
            logging.warning("Collection %s not found. Creating a new one.", collection_name)
            return client.create_collection(collection_name, metadata={"hnsw:space": "cosine"})
        else:
            logging.error("Unexpected error while getting collection %s: %s", collection_name, str(e))
            raise  # Re-raise the unexpected exception

# Called for every batch that is indexed:
collection = self.get_or_create_collection(client, collection_name)
collection.add(embeddings=embeddings, ids=ids, metadatas=meta_info, documents=chunks)
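The server log further down shows a GET on the collection before every POST to `/add`, which suggests the collection is looked up again for each batch. As a minimal sketch, assuming the handle can safely be reused between batches, the lookup could be done once and cached. `CollectionCache` and its `resolve` callback are hypothetical names for illustration and are not part of chromadb; this would only reduce the number of lookups and is not claimed to fix the locking.

```python
from typing import Any, Callable, Dict


class CollectionCache:
    """Hypothetical helper: resolve each collection once and reuse the handle."""

    def __init__(self, resolve: Callable[[str], Any]) -> None:
        # resolve is any function that returns a collection handle for a name,
        # for example a wrapper around get_or_create_collection above.
        self._resolve = resolve
        self._handles: Dict[str, Any] = {}

    def get(self, name: str) -> Any:
        # Only call the resolver the first time a name is requested.
        if name not in self._handles:
            self._handles[name] = self._resolve(name)
        return self._handles[name]
```

In the indexing code this might be used as `cache = CollectionCache(lambda name: self.get_or_create_collection(client, name))`, followed by `cache.get(collection_name).add(...)` for each batch.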
A second process exposes a REST API for a few use cases. It simply runs queries against the collection into which these documents are indexed.
Collection name - BusinessFocus_document
Tenant - BusinessFocus
Database - default_database
The issue I am facing right now is that ChromaDB is getting deadlocked (it seems to be more of a SQLite issue). I am running the default Chroma setup in Docker. Below are the logs.
INFO: [20-03-2025 11:12:33] 172.18.0.1:42200 - "POST /api/v2/tenants/BusinessFocus/databases/default_database/collections/34bd3d87-4674-4a6b-a304-6fee07e0ba81/add HTTP/1.1" 201
INFO: [20-03-2025 11:12:34] 172.18.0.1:42200 - "GET /api/v2/tenants/BusinessFocus/databases/default_database/collections/BusinessFocus_document HTTP/1.1" 200
INFO: [20-03-2025 11:12:34] 172.18.0.1:42200 - "POST /api/v2/tenants/BusinessFocus/databases/default_database/collections/34bd3d87-4674-4a6b-a304-6fee07e0ba81/add HTTP/1.1" 201
INFO: [20-03-2025 11:12:35] 172.18.0.1:42200 - "GET /api/v2/tenants/BusinessFocus/databases/default_database/collections/BusinessFocus_document HTTP/1.1" 200
INFO: [20-03-2025 11:12:35] 172.18.0.1:42200 - "POST /api/v2/tenants/BusinessFocus/databases/default_database/collections/34bd3d87-4674-4a6b-a304-6fee07e0ba81/add HTTP/1.1" 201
INFO: [20-03-2025 11:12:43] 172.18.0.1:37054 - "GET /api/v2/tenants/BusinessFocus/databases/default_database/collections/BusinessFocus_document HTTP/1.1" 200
INFO: [20-03-2025 11:12:43] 172.18.0.1:37054 - "POST /api/v2/tenants/BusinessFocus/databases/default_database/collections/34bd3d87-4674-4a6b-a304-6fee07e0ba81/add HTTP/1.1" 201
INFO: [20-03-2025 11:12:49] 172.18.0.1:37068 - "GET /api/v2/tenants/BusinessFocus/databases/default_database/collections/BusinessFocus_document HTTP/1.1" 200
INFO: [20-03-2025 11:12:49] 172.18.0.1:37068 - "POST /api/v2/tenants/BusinessFocus/databases/default_database/collections/34bd3d87-4674-4a6b-a304-6fee07e0ba81/add HTTP/1.1" 201
INFO: [20-03-2025 11:12:51] 172.18.0.1:37068 - "GET /api/v2/tenants/BusinessFocus/databases/default_database/collections/BusinessFocus_document HTTP/1.1" 200
INFO: [20-03-2025 11:12:51] 172.18.0.1:37068 - "POST /api/v2/tenants/BusinessFocus/databases/default_database/collections/34bd3d87-4674-4a6b-a304-6fee07e0ba81/add HTTP/1.1" 201
INFO: [20-03-2025 11:12:57] 172.18.0.1:51582 - "GET /api/v2/tenants/BusinessFocus/databases/default_database/collections/BusinessFocus_document HTTP/1.1" 200
INFO: [20-03-2025 11:12:57] 172.18.0.1:51582 - "POST /api/v2/tenants/BusinessFocus/databases/default_database/collections/34bd3d87-4674-4a6b-a304-6fee07e0ba81/add HTTP/1.1" 201
INFO: [20-03-2025 11:13:05] 172.18.0.1:42864 - "GET /api/v2/tenants/BusinessFocus/databases/default_database/collections/BusinessFocus_document HTTP/1.1" 200
INFO: [20-03-2025 11:13:05] 172.18.0.1:42864 - "POST /api/v2/tenants/BusinessFocus/databases/default_database/collections/34bd3d87-4674-4a6b-a304-6fee07e0ba81/add HTTP/1.1" 201
INFO: [20-03-2025 11:13:12] 172.18.0.1:37588 - "GET /api/v2/tenants/BusinessFocus/databases/default_database/collections/BusinessFocus_document HTTP/1.1" 200
INFO: [20-03-2025 11:13:12] 172.18.0.1:37588 - "POST /api/v2/tenants/BusinessFocus/databases/default_database/collections/34bd3d87-4674-4a6b-a304-6fee07e0ba81/add HTTP/1.1" 201
INFO: [20-03-2025 11:17:19] 172.18.0.1:59490 - "GET /api/v2/tenants/BusinessFocus/databases/default_database/collections/BusinessFocus_document HTTP/1.1" 200
INFO: [20-03-2025 11:17:19] 172.18.0.1:59490 - "POST /api/v2/tenants/BusinessFocus/databases/default_database/collections/34bd3d87-4674-4a6b-a304-6fee07e0ba81/add HTTP/1.1" 201
INFO: [20-03-2025 11:17:27] 172.18.0.1:52980 - "GET /api/v2/tenants/BusinessFocus/databases/default_database/collections/BusinessFocus_document HTTP/1.1" 200
INFO: [20-03-2025 11:17:27] 172.18.0.1:52980 - "POST /api/v2/tenants/BusinessFocus/databases/default_database/collections/34bd3d87-4674-4a6b-a304-6fee07e0ba81/add HTTP/1.1" 201
INFO: [20-03-2025 11:17:42] 172.18.0.1:47672 - "GET /api/v2/tenants/BusinessFocus/databases/default_database/collections/BusinessFocus_document HTTP/1.1" 200
INFO: [20-03-2025 11:17:42] 172.18.0.1:47672 - "POST /api/v2/tenants/BusinessFocus/databases/default_database/collections/34bd3d87-4674-4a6b-a304-6fee07e0ba81/add HTTP/1.1" 201
INFO: [20-03-2025 11:17:46] 172.18.0.1:47672 - "GET /api/v2/tenants/BusinessFocus/databases/default_database/collections/BusinessFocus_document HTTP/1.1" 200
INFO: [20-03-2025 11:17:47] 172.18.0.1:47672 - "POST /api/v2/tenants/BusinessFocus/databases/default_database/collections/34bd3d87-4674-4a6b-a304-6fee07e0ba81/add HTTP/1.1" 201
INFO: [20-03-2025 11:17:53] 172.18.0.1:57436 - "GET /api/v2/tenants/BusinessFocus/databases/default_database/collections/BusinessFocus_document HTTP/1.1" 200
INFO: [20-03-2025 11:17:53] 172.18.0.1:57436 - "POST /api/v2/tenants/BusinessFocus/databases/default_database/collections/34bd3d87-4674-4a6b-a304-6fee07e0ba81/add HTTP/1.1" 201
INFO: [20-03-2025 11:18:00] 172.18.0.1:38730 - "GET /api/v2/tenants/Pharma_Regulatory HTTP/1.1" 200
INFO: [20-03-2025 11:18:00] 172.18.0.1:38742 - "GET /api/v2/auth/identity HTTP/1.1" 200
INFO: [20-03-2025 11:18:00] 172.18.0.1:38746 - "GET /api/v2/tenants/Pharma_Regulatory HTTP/1.1" 200
INFO: [20-03-2025 11:18:00] 172.18.0.1:38746 - "GET /api/v2/tenants/Pharma_Regulatory/databases/default_database HTTP/1.1" 200
INFO: [20-03-2025 11:18:00] 172.18.0.1:38742 - "GET /api/v2/tenants/Pharma_Regulatory/databases/default_database/collections/Pharma_Regulatory_document HTTP/1.1" 200
INFO: [20-03-2025 11:22:27] 172.18.0.1:44220 - "GET /api/v2/tenants/BusinessFocus/databases/default_database/collections/BusinessFocus_document HTTP/1.1" 200
ERROR: [20-03-2025 11:41:10] database is locked
Traceback (most recent call last):
File "/usr/local/lib/python3.11/site-packages/anyio/streams/memory.py", line 111, in receive
return self.receive_nowait()
^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/anyio/streams/memory.py", line 106, in receive_nowait
raise WouldBlock
anyio.WouldBlock
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.11/site-packages/anyio/streams/memory.py", line 124, in receive
return receiver.item
^^^^^^^^^^^^^
AttributeError: 'MemoryObjectItemReceiver' object has no attribute 'item'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 157, in call_next
message = await recv_stream.receive()
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/anyio/streams/memory.py", line 126, in receive
raise EndOfStream
anyio.EndOfStream
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/chroma/chromadb/server/fastapi/__init__.py", line 107, in catch_exceptions_middleware
return await call_next(request)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 163, in call_next
raise app_exc
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 149, in coro
await self.app(scope, receive_or_disconnect, send_no_error)
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 185, in __call__
with collapse_excgroups():
File "/usr/local/lib/python3.11/contextlib.py", line 158, in __exit__
self.gen.throw(typ, value, traceback)
File "/usr/local/lib/python3.11/site-packages/starlette/_utils.py", line 82, in collapse_excgroups
raise exc
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 187, in __call__
response = await self.dispatch_func(request, call_next)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/chroma/chromadb/server/fastapi/__init__.py", line 131, in check_http_version_middleware
return await call_next(request)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 163, in call_next
raise app_exc
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 149, in coro
await self.app(scope, receive_or_disconnect, send_no_error)
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 62, in __call__
await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
raise exc
File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
await app(scope, receive, sender)
File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 715, in __call__
await self.middleware_stack(scope, receive, send)
File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 735, in app
await route.handle(scope, receive, send)
File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 288, in handle
await self.app(scope, receive, send)
File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 76, in app
await wrap_app_handling_exceptions(app, request)(scope, receive, send)
File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
raise exc
File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
await app(scope, receive, sender)
File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 73, in app
response = await f(request)
^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/fastapi/routing.py", line 301, in app
raw_response = await run_endpoint_function(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/fastapi/routing.py", line 212, in run_endpoint_function
return await dependant.call(**values)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/chroma/chromadb/telemetry/opentelemetry/__init__.py", line 134, in async_wrapper
return await f(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/chroma/chromadb/server/fastapi/__init__.py", line 807, in add
await to_thread.run_sync(
File "/usr/local/lib/python3.11/site-packages/anyio/to_thread.py", line 56, in run_sync
return await get_async_backend().run_sync_in_worker_thread(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
return await future
^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 943, in run
result = context.run(func, *args)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/chroma/chromadb/server/fastapi/__init__.py", line 789, in process_add
return self._api._add(
^^^^^^^^^^^^^^^
File "/chroma/chromadb/telemetry/opentelemetry/__init__.py", line 150, in wrapper
return f(*args, **kwargs)
^^^^^^^^^^^^^^^^^^
File "/chroma/chromadb/api/segment.py", line 103, in wrapper
return self._rate_limit_enforcer.rate_limit(func)(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/chroma/chromadb/rate_limit/simple_rate_limit/__init__.py", line 23, in wrapper
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/chroma/chromadb/api/segment.py", line 437, in _add
self._producer.submit_embeddings(collection_id, records_to_submit)
File "/chroma/chromadb/telemetry/opentelemetry/__init__.py", line 150, in wrapper
return f(*args, **kwargs)
^^^^^^^^^^^^^^^^^^
File "/chroma/chromadb/db/mixins/embeddings_queue.py", line 238, in submit_embeddings
with self.tx() as cur:
File "/chroma/chromadb/db/impl/sqlite.py", line 55, in __exit__
self._conn.commit()
File "/chroma/chromadb/db/impl/sqlite_pool.py", line 33, in commit
self._conn.commit()
sqlite3.OperationalError: database is locked
INFO: [20-03-2025 11:41:11] 172.18.0.1:44220 - "POST /api/v2/tenants/BusinessFocus/databases/default_database/collections/34bd3d87-4674-4a6b-a304-6fee07e0ba81/add HTTP/1.1" 500
ERROR: [20-03-2025 11:41:11] cannot start a transaction within a transaction
Traceback (most recent call last):
File "/usr/local/lib/python3.11/site-packages/anyio/streams/memory.py", line 111, in receive
return self.receive_nowait()
^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/anyio/streams/memory.py", line 106, in receive_nowait
raise WouldBlock
anyio.WouldBlock
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.11/site-packages/anyio/streams/memory.py", line 124, in receive
return receiver.item
^^^^^^^^^^^^^
AttributeError: 'MemoryObjectItemReceiver' object has no attribute 'item'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 157, in call_next
message = await recv_stream.receive()
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/anyio/streams/memory.py", line 126, in receive
raise EndOfStream
anyio.EndOfStream
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/chroma/chromadb/server/fastapi/__init__.py", line 107, in catch_exceptions_middleware
return await call_next(request)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 163, in call_next
raise app_exc
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 149, in coro
await self.app(scope, receive_or_disconnect, send_no_error)
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 185, in __call__
with collapse_excgroups():
File "/usr/local/lib/python3.11/contextlib.py", line 158, in __exit__
self.gen.throw(typ, value, traceback)
File "/usr/local/lib/python3.11/site-packages/starlette/_utils.py", line 82, in collapse_excgroups
raise exc
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 187, in __call__
response = await self.dispatch_func(request, call_next)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/chroma/chromadb/server/fastapi/__init__.py", line 131, in check_http_version_middleware
return await call_next(request)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 163, in call_next
raise app_exc
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/base.py", line 149, in coro
await self.app(scope, receive_or_disconnect, send_no_error)
File "/usr/local/lib/python3.11/site-packages/starlette/middleware/exceptions.py", line 62, in __call__
await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
raise exc
File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
await app(scope, receive, sender)
File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 715, in __call__
await self.middleware_stack(scope, receive, send)
File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 735, in app
await route.handle(scope, receive, send)
File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 288, in handle
await self.app(scope, receive, send)
File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 76, in app
await wrap_app_handling_exceptions(app, request)(scope, receive, send)
File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
raise exc
File "/usr/local/lib/python3.11/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
await app(scope, receive, sender)
File "/usr/local/lib/python3.11/site-packages/starlette/routing.py", line 73, in app
response = await f(request)
^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/fastapi/routing.py", line 301, in app
raw_response = await run_endpoint_function(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/fastapi/routing.py", line 212, in run_endpoint_function
return await dependant.call(**values)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/chroma/chromadb/telemetry/opentelemetry/__init__.py", line 134, in async_wrapper
return await f(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/chroma/chromadb/server/fastapi/__init__.py", line 694, in get_collection
await to_thread.run_sync(
File "/usr/local/lib/python3.11/site-packages/anyio/to_thread.py", line 56, in run_sync
return await get_async_backend().run_sync_in_worker_thread(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 2441, in run_sync_in_worker_thread
return await future
^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 943, in run
result = context.run(func, *args)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/chroma/chromadb/telemetry/opentelemetry/__init__.py", line 150, in wrapper
return f(*args, **kwargs)
^^^^^^^^^^^^^^^^^^
File "/chroma/chromadb/api/segment.py", line 103, in wrapper
return self._rate_limit_enforcer.rate_limit(func)(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/chroma/chromadb/rate_limit/simple_rate_limit/__init__.py", line 23, in wrapper
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/chroma/chromadb/api/segment.py", line 293, in get_collection
existing = self._sysdb.get_collections(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/chroma/chromadb/telemetry/opentelemetry/__init__.py", line 150, in wrapper
return f(*args, **kwargs)
^^^^^^^^^^^^^^^^^^
File "/chroma/chromadb/db/mixins/sysdb.py", line 444, in get_collections
with self.tx() as cur:
File "/chroma/chromadb/db/impl/sqlite.py", line 41, in __enter__
self._conn.execute("BEGIN;")
File "/chroma/chromadb/db/impl/sqlite_pool.py", line 29, in execute
return self._conn.execute(sql)
^^^^^^^^^^^^^^^^^^^^^^^
sqlite3.OperationalError: cannot start a transaction within a transaction
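The two errors above appear in a fixed order: a COMMIT fails with "database is locked", and the next BEGIN fails with "cannot start a transaction within a transaction". The sketch below reproduces that order with plain `sqlite3`, outside Chroma. It is an illustration under assumptions, not Chroma's code: the table name, the rollback-journal mode, the zero timeout, and the idea that the same connection is reused after the failed commit are all assumptions made for the demo.

```python
import os
import sqlite3
import tempfile


def reproduce_error_sequence() -> list:
    """Return the SQLite error messages in the order they are raised."""
    db_path = os.path.join(tempfile.mkdtemp(), "demo.sqlite3")
    # isolation_level=None: transactions are controlled only by explicit SQL.
    # timeout=0: fail immediately instead of waiting for a lock.
    reader = sqlite3.connect(db_path, timeout=0, isolation_level=None)
    writer = sqlite3.connect(db_path, timeout=0, isolation_level=None)
    errors = []
    try:
        # Assumption for the demo: rollback-journal mode, where readers block commits.
        writer.execute("PRAGMA journal_mode=DELETE")
        writer.execute("CREATE TABLE queue (id INTEGER PRIMARY KEY, payload TEXT)")
        writer.execute("INSERT INTO queue (payload) VALUES ('seed')")

        # The reader opens a transaction and keeps its shared lock.
        reader.execute("BEGIN")
        reader.execute("SELECT * FROM queue").fetchall()

        # The writer can stage an insert, but COMMIT needs an exclusive lock.
        writer.execute("BEGIN")
        writer.execute("INSERT INTO queue (payload) VALUES ('batch-1')")
        try:
            writer.execute("COMMIT")
        except sqlite3.OperationalError as exc:
            errors.append(str(exc))

        # The failed COMMIT leaves the transaction open on this connection,
        # so another BEGIN on the same connection is rejected.
        try:
            writer.execute("BEGIN")
        except sqlite3.OperationalError as exc:
            errors.append(str(exc))
    finally:
        # Release whatever is still open so the temp database is left clean.
        if writer.in_transaction:
            writer.execute("ROLLBACK")
        if reader.in_transaction:
            reader.execute("ROLLBACK")
        reader.close()
        writer.close()
    return errors
```

Under these assumptions, `reproduce_error_sequence()` should return the same two messages seen in the log, in the same order. If the server behaves similarly, a connection whose commit failed would stay inside an open transaction until it is rolled back, which would be consistent with every later request failing until a restart.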
After this error the only option left is to restart the Docker container, but the issue recurs roughly every 2-4 hours.
Versions
AWS Linux - t2.2xlarge
Docker chroma version - 0.5.20
Relevant log output