
[Bug]: Using v1.0.3 client/server with a database created by v0.6.3 client/server causes `chromadb.errors.InternalError`

Open Davidyz opened this issue 9 months ago • 15 comments

What happened?

A collection.get() call using v1.0.3 client and v1.0.3 server on a collection created by v0.6.3 chromadb produced a chromadb.errors.InternalError: Error executing plan: Error sending backfill request to compactor.

This is the relevant code snippet:

await collection.get(
    where={"path": full_path_str},
    include=["metadatas"],
)
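
For context, here is a self-contained sketch of how this call is made (not the project's actual code; the host, port, collection name, and path value below are placeholders):

import asyncio

import chromadb


async def main():
    # Assumes a Chroma server running locally and an existing collection.
    client = await chromadb.AsyncHttpClient(host="localhost", port=8000)
    collection = await client.get_collection("my_collection")
    result = await collection.get(
        where={"path": "/some/file/path"},
        include=["metadatas"],
    )
    print(result["metadatas"])


asyncio.run(main())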

Aside from this error, I'm wondering whether a database created by 0.x.x chromadb is supposed to "just work"? Also, what's the expected outcome if a user has a mismatched chromadb client and server? I'm experiencing errors, and at the same time, I couldn't find many useful guides on what to expect over the upgrade (from 0.x.x to 1.x.x). The Discord channel (migrations) hasn't been updated for a few months. I'm a bit lost at this point because I don't know what to expect when trying to bump the chromadb version for my project.

Versions

Collection created by chromadb v0.6.3 and accessed by chromadb v1.0.3, on Python 3.13, Arch Linux

Relevant log output

Traceback (most recent call last):
  File "/home/davidyz/git/VectorCode/src/vectorcode/main.py", line 85, in async_main
    return_val = await vectorise(final_configs)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/davidyz/git/VectorCode/src/vectorcode/subcommands/vectorise.py", line 192, in vectorise
    await task
  File "/usr/lib/python3.13/asyncio/tasks.py", line 634, in _wait_for_one
    return f.result() if resolve else f
           ~~~~~~~~^^
  File "/home/davidyz/git/VectorCode/src/vectorcode/subcommands/vectorise.py", line 43, in chunked_add
    await collection.get(
    ...<2 lines>...
    )
  File "/home/davidyz/git/VectorCode/.venv/lib/python3.13/site-packages/chromadb/api/models/AsyncCollection.py", line 127, in get
    get_results = await self._client._get(
                  ^^^^^^^^^^^^^^^^^^^^^^^^
    ...<10 lines>...
    )
    ^
  File "/home/davidyz/git/VectorCode/.venv/lib/python3.13/site-packages/chromadb/telemetry/opentelemetry/__init__.py", line 134, in async_wrapper
    return await f(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/davidyz/git/VectorCode/.venv/lib/python3.13/site-packages/chromadb/api/async_fastapi.py", line 471, in _get
    resp_json = await self._make_request(
                ^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<11 lines>...
    )
    ^
  File "/home/davidyz/git/VectorCode/.venv/lib/python3.13/site-packages/chromadb/api/async_fastapi.py", line 149, in _make_request
    BaseHTTPClient._raise_chroma_error(response)
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^
  File "/home/davidyz/git/VectorCode/.venv/lib/python3.13/site-packages/chromadb/api/base_http_client.py", line 96, in _raise_chroma_error
    raise chroma_error
chromadb.errors.InternalError: Error executing plan: Error sending backfill request to compactor

Davidyz avatar Apr 08 '25 11:04 Davidyz

@Davidyz, this is likely an HNSW initialization error. Rust errors are a little harder to trace; I've just added a PR - #4219. I'll come back with a way to further debug this.

tazarov avatar Apr 08 '25 13:04 tazarov

@Davidyz, try to build a new server image with the following dockerfile:

FROM rust:1.81.0 AS builder

ARG RELEASE_MODE=release

WORKDIR /chroma/

ENV PROTOC_ZIP=protoc-25.1-linux-x86_64.zip
RUN curl -OL https://github.com/protocolbuffers/protobuf/releases/download/v25.1/$PROTOC_ZIP \
    && unzip -o $PROTOC_ZIP -d /usr/local bin/protoc \
    && unzip -o $PROTOC_ZIP -d /usr/local 'include/*' \
    && rm -f $PROTOC_ZIP

RUN git clone --branch trayan-04-08-fix_improving_backfil_error_propagation https://github.com/chroma-core/chroma.git .

RUN apt-get update && \
    apt-get install -y python3.11-dev

# Build dependencies first (for better caching)
RUN cargo build --bin chroma --release

#FROM gcr.io/distroless/cc-debian12
FROM debian:bookworm-slim

# Copy the binary from the build stage
COPY --from=builder /chroma/target/release/chroma /usr/local/bin/
RUN apt-get update && \
    apt-get install -y python3.11-dev
EXPOSE 8000

ENTRYPOINT ["chroma"]
CMD ["run","--path","/data","--host","0.0.0.0"]

Build it:

docker build -t chromarust -f Dockerfile .

Then run your server:

docker run -v ./<local_dir>:/data  -p 8000:8000 chromarust

Test it out and let me know what error you get then.

tazarov avatar Apr 08 '25 14:04 tazarov

Hi @tazarov, thanks for the instructions and apologies for the late reply. Here's the error output using the v1.0.3 client (the server is built from the dockerfile provided above):

Traceback (most recent call last):
  File "/home/davidyz/git/VectorCode/src/vectorcode/main.py", line 85, in async_main
    return_val = await vectorise(final_configs)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/davidyz/git/VectorCode/src/vectorcode/subcommands/vectorise.py", line 192, in vectorise
    await task
  File "/usr/lib/python3.13/asyncio/tasks.py", line 634, in _wait_for_one
    return f.result() if resolve else f
           ~~~~~~~~^^
  File "/home/davidyz/git/VectorCode/src/vectorcode/subcommands/vectorise.py", line 43, in chunked_add
    await collection.get(
    ...<2 lines>...
    )
  File "/home/davidyz/git/VectorCode/.venv/lib/python3.13/site-packages/chromadb/api/models/AsyncCollection.py", line 127, in get
    get_results = await self._client._get(
                  ^^^^^^^^^^^^^^^^^^^^^^^^
    ...<10 lines>...
    )
    ^
  File "/home/davidyz/git/VectorCode/.venv/lib/python3.13/site-packages/chromadb/telemetry/opentelemetry/__init__.py", line 134, in async_wrapper
    return await f(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/davidyz/git/VectorCode/.venv/lib/python3.13/site-packages/chromadb/api/async_fastapi.py", line 471, in _get
    resp_json = await self._make_request(
                ^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<11 lines>...
    )
    ^
  File "/home/davidyz/git/VectorCode/.venv/lib/python3.13/site-packages/chromadb/api/async_fastapi.py", line 149, in _make_request
    BaseHTTPClient._raise_chroma_error(response)
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^
  File "/home/davidyz/git/VectorCode/.venv/lib/python3.13/site-packages/chromadb/api/base_http_client.py", line 96, in _raise_chroma_error
    raise chroma_error
chromadb.errors.InternalError: Error executing plan: Error sending backfill request to compactor: Error reading from metadata segment reader

EDIT: this error persists on v1.0.4 client

Davidyz avatar Apr 10 '25 07:04 Davidyz

hey @Davidyz, thanks for providing the logs. I see that now we do get, as anticipated, more feedback from the error messages. Looking at the code, this seems to be related to sqlite3.

Can I bother you to build another image from a new branch where I've added propagation of sqlite3 related errors:

FROM rust:1.81.0 AS builder

ARG RELEASE_MODE=release

WORKDIR /chroma/

ENV PROTOC_ZIP=protoc-25.1-linux-x86_64.zip
RUN curl -OL https://github.com/protocolbuffers/protobuf/releases/download/v25.1/$PROTOC_ZIP \
    && unzip -o $PROTOC_ZIP -d /usr/local bin/protoc \
    && unzip -o $PROTOC_ZIP -d /usr/local 'include/*' \
    && rm -f $PROTOC_ZIP

RUN git clone --branch trayan-04-10-chore_local_compation_manager_error_propagation_for_sqlite https://github.com/chroma-core/chroma.git .

RUN apt-get update && \
    apt-get install -y python3.11-dev

# Build dependencies first (for better caching)
RUN cargo build --bin chroma --release

#FROM gcr.io/distroless/cc-debian12
FROM debian:bookworm-slim

# Copy the binary from the build stage
COPY --from=builder /chroma/target/release/chroma /usr/local/bin/
RUN apt-get update && \
    apt-get install -y python3.11-dev
EXPOSE 8000

ENTRYPOINT ["chroma"]
CMD ["run","--path","/data","--host","0.0.0.0"]

tazarov avatar Apr 10 '25 07:04 tazarov

Hi @tazarov, thanks for the patch. Here's the new error message:

Traceback (most recent call last):
  File "/home/davidyz/git/VectorCode/src/vectorcode/main.py", line 85, in async_main
    return_val = await vectorise(final_configs)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/davidyz/git/VectorCode/src/vectorcode/subcommands/vectorise.py", line 192, in vectorise
    await task
  File "/usr/lib/python3.13/asyncio/tasks.py", line 634, in _wait_for_one
    return f.result() if resolve else f
           ~~~~~~~~^^
  File "/home/davidyz/git/VectorCode/src/vectorcode/subcommands/vectorise.py", line 43, in chunked_add
    await collection.get(
    ...<2 lines>...
    )
  File "/home/davidyz/git/VectorCode/.venv/lib/python3.13/site-packages/chromadb/api/models/AsyncCollection.py", line 127, in get
    get_results = await self._client._get(
                  ^^^^^^^^^^^^^^^^^^^^^^^^
    ...<10 lines>...
    )
    ^
  File "/home/davidyz/git/VectorCode/.venv/lib/python3.13/site-packages/chromadb/telemetry/opentelemetry/__init__.py", line 134, in async_wrapper
    return await f(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/davidyz/git/VectorCode/.venv/lib/python3.13/site-packages/chromadb/api/async_fastapi.py", line 471, in _get
    resp_json = await self._make_request(
                ^^^^^^^^^^^^^^^^^^^^^^^^^
    ...<11 lines>...
    )
    ^
  File "/home/davidyz/git/VectorCode/.venv/lib/python3.13/site-packages/chromadb/api/async_fastapi.py", line 149, in _make_request
    BaseHTTPClient._raise_chroma_error(response)
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^
  File "/home/davidyz/git/VectorCode/.venv/lib/python3.13/site-packages/chromadb/api/base_http_client.py", line 96, in _raise_chroma_error
    raise chroma_error
chromadb.errors.InternalError: Error executing plan: Error sending backfill request to compactor: Error reading from metadata segment reader: error occurred while decoding column 0: mismatched types; Rust type `u64` (as SQL type `INTEGER`) is not compatible with SQL type `BLOB`

This also applies to both 1.0.3 and 1.0.4

Davidyz avatar Apr 10 '25 08:04 Davidyz

hey @Davidyz, thanks so much for the quick turnaround. Once again, I much appreciate you sticking with this, and I can understand your frustration. As you point out, things should "just work", and indeed that is the intent here :). Reality, though, is that bugs do creep in from time to time.

The error above starts to make sense now. My hypothesis at this point is a misbehaving DB migration. Can I bother you for one last bit of feedback:

sqlite3 <persist_dir>/chroma.sqlite3 'select segment_id, hex(seq_id)  from max_seq_id;'

The above command will output something like this:

95f13d36-824b-4256-a9ce-d3dbd63a6a5e|0000000000001452
265719aa-cd49-486f-b6c9-35590ee09bcd|0000000000001388
63a8e0d4-8f05-42c9-a14b-ccf411b28abb|00000000000027DA
31c2b40c-10a1-47e5-bd40-e24903bdb925|00000000000028A4

Also run this:

sqlite3 <persist_dir>/chroma.sqlite3 'select dir,version,filename,hash from migrations;'

Resulting in something like this:

embeddings_queue|1|00001-embeddings.sqlite.sql|d3755dfd232be8e8301f4d7fcfb3a486
embeddings_queue|2|00002-embeddings-queue-config.sqlite.sql|8fbfe4ffb3e57f1d8bfdc58510a82e85
sysdb|1|00001-collections.sqlite.sql|38352d725ad1c16074fac420b22b4633
sysdb|2|00002-segments.sqlite.sql|2913cb6a503055a95f625448037e8912
sysdb|3|00003-collection-dimension.sqlite.sql|42d22d0574d31d419c2a0e7f625c93aa
sysdb|4|00004-tenants-databases.sqlite.sql|048867ce8fcdefe4023c7110e4433591
sysdb|5|00005-remove-topic.sqlite.sql|b1367c826b8fba5f96f27befdc1d42d2
sysdb|6|00006-collection-segment-metadata.sqlite.sql|4eea7468935bf25d4604a0fed2366116
sysdb|7|00007-collection-config.sqlite.sql|1c7e63bba346a42a18b6ab7f1c989bed
sysdb|8|00008-maintenance-log.sqlite.sql|0a0e7e93111a01789addf64961c6127c
sysdb|9|00009-segment-collection-not-null.sqlite.sql|054355aef9e63702bf54ea29e61563f1
metadb|1|00001-embedding-metadata.sqlite.sql|2b4cf52c4bb2676e21d6860a4409f856
metadb|2|00002-embedding-metadata.sqlite.sql|12a570f7121b3a8ce750a2a7c36da20f
metadb|3|00003-full-text-tokenize.sqlite.sql|f97ad6334aeaa8f419f01110b648b97a
metadb|4|00004-metadata-indices.sqlite.sql|fb36603a45ee2cd0254cef3ef86585e8

tazarov avatar Apr 10 '25 08:04 tazarov

@tazarov, here is the output for the commands that you requested:

553d87e7-287f-4473-96f5-5e87fb692677|000000000000022C
embeddings_queue|1|00001-embeddings.sqlite.sql|d3755dfd232be8e8301f4d7fcfb3a486
embeddings_queue|2|00002-embeddings-queue-config.sqlite.sql|8fbfe4ffb3e57f1d8bfdc58510a82e85
sysdb|1|00001-collections.sqlite.sql|38352d725ad1c16074fac420b22b4633
sysdb|2|00002-segments.sqlite.sql|2913cb6a503055a95f625448037e8912
sysdb|3|00003-collection-dimension.sqlite.sql|42d22d0574d31d419c2a0e7f625c93aa
sysdb|4|00004-tenants-databases.sqlite.sql|048867ce8fcdefe4023c7110e4433591
sysdb|5|00005-remove-topic.sqlite.sql|b1367c826b8fba5f96f27befdc1d42d2
sysdb|6|00006-collection-segment-metadata.sqlite.sql|4eea7468935bf25d4604a0fed2366116
sysdb|7|00007-collection-config.sqlite.sql|1c7e63bba346a42a18b6ab7f1c989bed
sysdb|8|00008-maintenance-log.sqlite.sql|0a0e7e93111a01789addf64961c6127c
sysdb|9|00009-segment-collection-not-null.sqlite.sql|054355aef9e63702bf54ea29e61563f1
metadb|1|00001-embedding-metadata.sqlite.sql|2b4cf52c4bb2676e21d6860a4409f856
metadb|2|00002-embedding-metadata.sqlite.sql|12a570f7121b3a8ce750a2a7c36da20f
metadb|3|00003-full-text-tokenize.sqlite.sql|f97ad6334aeaa8f419f01110b648b97a
metadb|4|00004-metadata-indices.sqlite.sql|fb36603a45ee2cd0254cef3ef86585e8
metadb|5|00005-max-seq-id-int.sqlite.sql|0e9de46758761b373ce682925edcc326

Davidyz avatar Apr 10 '25 09:04 Davidyz

Hey @Davidyz, much appreciated. This tells me two things:

  • metadb|5|00005-max-seq-id-int.sqlite.sql|0e9de46758761b373ce682925edcc326 - the 0.6.3 -> 1.0.x migration is recorded as applied.
  • 553d87e7-287f-4473-96f5-5e87fb692677|000000000000022C - the migration did not actually take effect, as the seq_id column is still a blob (a big-endian encoded int).

Good news is that I am able to reproduce your error 100% and the fix for it is relatively simple. Just run:

sqlite3 <persist_dir>/chroma.sqlite3 "delete from migrations where dir ='metadb' and filename='00005-max-seq-id-int.sqlite.sql';" 

Then run your query again.

While the above will solve the issue you are having, it doesn't explain how the migration script failed to take effect. Possibly another error you encountered along the way prevented the migration from applying. I'll keep digging.
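
As a side note, a quick way to check how the seq_id values are actually stored is to ask SQLite for the storage class of each value (a sketch; the path is a placeholder):

import sqlite3

con = sqlite3.connect("<persist_dir>/chroma.sqlite3")  # placeholder path
for segment_id, storage_class in con.execute(
    "SELECT segment_id, typeof(seq_id) FROM max_seq_id;"
):
    # 'integer' means the 00005-max-seq-id-int migration took effect;
    # 'blob' means the row still holds the old big-endian encoding.
    print(segment_id, storage_class)
con.close()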

tazarov avatar Apr 10 '25 09:04 tazarov

Thanks! I'm glad I was able to help!

I'm maintaining a code repository indexing tool that acts as a context provider (MCP tools) for LLM applications, and the current version pins chromadb to 0.6.3. Should I stay at 0.6.3 until this is fixed in a future release of chromadb? (If I unpin chromadb now, it'll work for new users, but for existing users it might break their setup.)

Davidyz avatar Apr 10 '25 09:04 Davidyz

@Davidyz, Chroma 1.0 brings lots of improvements, the main one being performance. However, if you have a large base of existing users on 0.6.3, I would recommend pinning the version for the time being until we figure out the root cause of the failure and possibly fix it.

I will try to reproduce the upgrade error. Were there any particular steps in your upgrade process that might be useful to know? Or did you just unpin and install the latest available?

tazarov avatar Apr 10 '25 10:04 tazarov

Yes, I just unpinned the version number in my pyproject.toml.

One thing that may be unusual is that I deployed chromadb via systemd, which is essentially the same as starting it from the CLI (a native install with pipx, without docker). This has worked perfectly fine for me pre-1.0.0, so I chose to stick with it.

Davidyz avatar Apr 10 '25 10:04 Davidyz

@Davidyz should we keep this open or is everything ironed out for now?

jeffchuber avatar Apr 15 '25 23:04 jeffchuber

@Davidyz should we keep this open or is everything ironed out for now?

This bug (using old database with new client) is still there (at least in the latest release). @tazarov mentioned that he's trying to figure out what happened, so I think it's better to keep this open?

Davidyz avatar Apr 16 '25 09:04 Davidyz

@Davidyz, I tried to reproduce the 0.6.3 -> 1.0.x upgrade with both the server (docker) and the CLI, but in vain. I think there may be either a step or a separate process that holds locks on the sqlite3 file, but even then the whole application startup should generally fail, as the db will be locked.

Perhaps it is worth following your exact process:

  • Start chroma as a service using systemd - assuming a config similar to this: https://cookbook.chromadb.dev/running/systemd-service/#chroma-cli
  • Stop chroma - sudo systemctl stop chroma
  • Upgrade with pipx
  • Start chroma - sudo systemctl start chroma

Correct the above to match your sequence.

tazarov avatar Apr 16 '25 09:04 tazarov

Yes, these steps are what I used, except that I used a user service, which can be created and managed without sudo:

# ~/.config/systemd/user/chromadb.service
[Unit]
Description = Chroma Service
After = network.target

[Service]
Type = simple
WorkingDirectory = /opt/chromadb
ExecStart=/home/davidyz/.local/bin/chroma run --host 127.0.0.1 --port 8000 --path /opt/chromadb/data --log-path /var/log/chromadb.log

[Install]
WantedBy = default.target

and then

systemctl start --user chromadb

Davidyz avatar Apr 19 '25 08:04 Davidyz

I just updated from 0.6 to 1.0.10 using the pip libraries and my program claimed that the existing collections were not present and created new ones. Is there a specific migration step I need to apply? My database is quite large -- du -sh shows 3.4GB total, but chroma.sqlite3 shows 5.0GB in ls; those disparities always confuse me.

I have backups, so I can restore from the original 0.6.3 database if necessary.

What should I do to either troubleshoot or resolve this?

Thanks!

n8ur avatar May 24 '25 16:05 n8ur

@n8ur, sorry to hear about your troubles. Normally there shouldn't be any migration required on your part. Behind the scenes, Chroma applies an auto-migration to your DB (mainly involving sqlite3).

Can you run the following on your migrated DB:

Migrations applied:

sqlite3 <persist_dir>/chroma.sqlite3 "select dir,version,filename,hash from migrations;"

Get some info on collections/segments/embeddings

sqlite3 <persist_dir>/chroma.sqlite3 "SELECT c.name, c.id, s.id, m.seq_id, count(e.id) FROM collections c LEFT JOIN segments s ON c.id = s.collection LEFT JOIN max_seq_id m ON s.id = m.segment_id LEFT JOIN embeddings e ON s.id=e.segment_id;"

Let's also check what's in your embeddings and whether there are some discrepancies:

sqlite3 <persist_dir>/chroma.sqlite3 "SELECT segment_id, count(*) from embeddings GROUP BY segment_id;"

Last but not least, let's have a look at your segments:

sqlite3 <persist_dir>/chroma.sqlite3 "SELECT * from segments;"

As far as the size discrepancy goes, it can be due to sqlite3 and sparse files, e.g. sqlite3 may have pre-allocated some space, which is visible to ls but not to du.
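
If it helps, here is a small sketch (placeholder path) that compares the apparent file size ls reports with the space du actually counts:

import os

st = os.stat("<persist_dir>/chroma.sqlite3")  # placeholder path
apparent = st.st_size           # what ls -l reports
allocated = st.st_blocks * 512  # roughly what du reports (st_blocks is in 512-byte units)
print(f"apparent: {apparent / 1e9:.2f} GB, allocated on disk: {allocated / 1e9:.2f} GB")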

tazarov avatar May 24 '25 18:05 tazarov

Here are the outputs from the commands you suggested, run on the corrupted (?) database. A complication is that my program thought there were no collections defined, so it created new empty ones. That may be reflected in these results. I can copy over the original files and test with those if that would be helpful.

Before I give you the outputs, let me confess that I did not have the full sqlite3 packages installed before -- only the pip dependencies that were installed with chromadb. So I had to install the debian package to run the sqlite commands. Could the lack of those programs have been the problem?

root@timebot-dev:/var/lib/timebot/db/chromadb# sqlite3 ./chroma.sqlite3 "select dir,version,filename,hash from migrations;"
embeddings_queue|1|00001-embeddings.sqlite.sql|d3755dfd232be8e8301f4d7fcfb3a486
embeddings_queue|2|00002-embeddings-queue-config.sqlite.sql|8fbfe4ffb3e57f1d8bfdc58510a82e85
sysdb|1|00001-collections.sqlite.sql|38352d725ad1c16074fac420b22b4633
sysdb|2|00002-segments.sqlite.sql|2913cb6a503055a95f625448037e8912
sysdb|3|00003-collection-dimension.sqlite.sql|42d22d0574d31d419c2a0e7f625c93aa
sysdb|4|00004-tenants-databases.sqlite.sql|048867ce8fcdefe4023c7110e4433591
sysdb|5|00005-remove-topic.sqlite.sql|b1367c826b8fba5f96f27befdc1d42d2
sysdb|6|00006-collection-segment-metadata.sqlite.sql|4eea7468935bf25d4604a0fed2366116
sysdb|7|00007-collection-config.sqlite.sql|1c7e63bba346a42a18b6ab7f1c989bed
sysdb|8|00008-maintenance-log.sqlite.sql|0a0e7e93111a01789addf64961c6127c
sysdb|9|00009-segment-collection-not-null.sqlite.sql|054355aef9e63702bf54ea29e61563f1
metadb|1|00001-embedding-metadata.sqlite.sql|2b4cf52c4bb2676e21d6860a4409f856
metadb|2|00002-embedding-metadata.sqlite.sql|12a570f7121b3a8ce750a2a7c36da20f
metadb|3|00003-full-text-tokenize.sqlite.sql|f97ad6334aeaa8f419f01110b648b97a
metadb|4|00004-metadata-indices.sqlite.sql|fb36603a45ee2cd0254cef3ef86585e8
metadb|5|00005-max-seq-id-int.sqlite.sql|0e9de46758761b373ce682925edcc326

root@timebot-dev:/var/lib/timebot/db# sqlite3 ./chroma.sqlite3 "SELECT c.name, c.id, s.id, m.seq_id, count(e.id) FROM collections c LEFT JOIN segments s ON c.id = s.collection LEFT JOIN max_seq_id m ON s.id = m.segment_id LEFT JOIN embeddings e ON s.id=e.segment_id;"
Error: in prepare, no such table: collections

root@timebot-dev:/var/lib/timebot/db# sqlite3 ./chroma.sqlite3 "SELECT segment_id, count(*) from embeddings GROUP BY segment_id;"
Error: in prepare, no such table: embeddings

root@timebot-dev:/var/lib/timebot/db# sqlite3 ./chroma.sqlite3 "SELECT * from segments;"
Error: in prepare, no such table: segments

Those last commands look like there's something seriously messed up...

Thanks, John

n8ur avatar May 24 '25 18:05 n8ur

@tazarov -- Just a thought. My chromadb init process is a bit complex as it checks for existence of collections and then creates them if not found. Could you suggest a simple python script I could try that would simply open the database and do migration steps with nothing more, reporting back results? That way we could rule out anything weird my program is doing.

n8ur avatar May 25 '25 16:05 n8ur

Before I give you the outputs, let me confess that I did not have the full sqlite3 packages installed before -- only the pip dependencies that were installed with chromadb. So I had to install the debian package to run the sqlite commands. Could the lack of those programs have been the problem?

I have sqlite3 on my system, but still ran into the error in the main issue. I don't think this is a big problem.

Davidyz avatar May 26 '25 02:05 Davidyz

After doing more testing today, I've learned a few things and solved the problem. The database was not being corrupted by the attempt to open it with v1.0.10, but my old code didn't work with the new library for two reasons (that I know of):

First, while the collections were supposedly created using "hnsw:space": "cosine", looking at the tables in chroma.sqlite3 shows that they instead use "hnsw:space": "l2". When opening the database, I passed metadata for the cosine space. In v0.6, that worked fine, but in 1.0.10 it results in an error. Apparently 1.0.10 is more stringent about this. Dropping the metadata param when opening for read solved the problem. But what I don't understand is why the command to create the database in 0.6.0 using cosine was ignored.
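
For anyone hitting the same thing, here is a minimal sketch of the two call patterns (placeholder names and path, not my actual code): pass the space only when creating a collection, and open existing collections without it.

import chromadb

client = chromadb.PersistentClient(path="<persist_dir>")  # placeholder path

# Creation: configuration such as the distance space is passed here.
collection = client.get_or_create_collection(
    name="my_collection",
    metadata={"hnsw:space": "cosine"},
)

# Read-only access: open the collection as-is, without re-asserting any
# configuration metadata against what is already stored.
existing = client.get_collection(name="my_collection")
print(existing.count())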

Second, the old bugbear of list_collections seems to have returned -- with v1.0.10, it seems to return a collection object again. You need something like this:

collections = client.list_collections()
if not collections:
    print("No collections found in the database.")
    return
print(f"\nFound {len(collections)} collection(s):")
for coll_obj in collections:
    collection_name = coll_obj.name
    collection_id = coll_obj.id
    print(f"\n--- Collection: '{collection_name}' (ID: {collection_id}) ---")

This is incredibly frustrating. Was it intended to revert this behavior? If so, that should have been documented!

Anyway, after addressing those issues, my code is working again.

n8ur avatar May 27 '25 20:05 n8ur

Dropping the metadata param when opening for read solved the problem.

Unfortunately, this doesn't fix the error for me. Do your collections contain other metadata fields? I'm using the metadata field to store some other information (paths, etc.), and I wonder whether that might have caused the different behaviours.

Davidyz avatar May 27 '25 23:05 Davidyz

So, as I understand it, the "metadata" field here isn't referring to the metadata stored with embeddings, but rather refers to the structure of the database itself. I was confused about that myself when it first came up.

I have to admit that I've been doing some vibe coding on this project, and GPT-4.1 told me about this, and suggested that the "metadata" field should only be used when creating or adding to a collection, not when querying it in read-only mode. That way your program will use the database as-is without attempting to force a structure.

n8ur avatar May 28 '25 01:05 n8ur

@tazarov -- Just a thought. My chromadb init process is a bit complex as it checks for existence of collections and then creates them if not found. Could you suggest a simple python script I could try that would simply open the database and do migration steps with nothing more, reporting back results? That way we could rule out anything weird my program is doing.

@n8ur, Chroma runs migrations before it makes the database available for you to use; this usually takes a few milliseconds at startup.

The difference between 0.6.x and 1.0.x is that in the former we apply the migrations using Python, whereas in the latter we use a Rust implementation.

Regarding your issue with metadata: we stopped relying on collection metadata for configuration a while ago. In the collection metadata it may appear as l2, but there is a separate config_json_str column that contains the actual collection config.
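
A quick way to see what config a collection actually carries is to read that column directly (a sketch; the path is a placeholder and the exact schema can differ between versions):

import json
import sqlite3

con = sqlite3.connect("<persist_dir>/chroma.sqlite3")  # placeholder path
for name, config_json_str in con.execute(
    "SELECT name, config_json_str FROM collections;"
):
    # config_json_str holds the authoritative collection configuration.
    config = json.loads(config_json_str) if config_json_str else {}
    print(name, json.dumps(config, indent=2))
con.close()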

The lack of these tables is indeed troubling and should not happen even for a failed migration.

root@timebot-dev:/var/lib/timebot/db# sqlite3 ./chroma.sqlite3 "SELECT segment_id, count(*) from embeddings GROUP BY segment_id;"
Error: in prepare, no such table: embeddings

root@timebot-dev:/var/lib/timebot/db# sqlite3 ./chroma.sqlite3 "SELECT * from segments;"
Error: in prepare, no such table: segments

A few follow up questions:

  • Have you seen any errors on initial startup?
  • Is this a reproducible error? e.g. migration from 0.6.x to 1.0.x always causes this?
  • Have you tried on a different system?

tazarov avatar May 29 '25 05:05 tazarov

Hi @tazarov --

  1. With these two changes, I am not seeing errors on startup.

  2. Every time I tried to load the 0.6 database with 1.0.10 I had the same set of issues, resolved with the two changes noted.

  3. I have only tried this on the development system, but with a clean copy of the database copied from the production system (still on 0.6).

Re the sqlite3 messages, I might have made an error when I ran them before. I just reran, being more careful about what directory I was in, and got:

root@timebot-dev:/var/lib/timebot/db# sqlite3 ./chromadb/chroma.sqlite3 "SELECT segment_id, count(*) from embeddings GROUP BY segment_id;"
0fa33bc7-dead-4100-98ec-cedbee3bc6c9|26088
1118c57e-093b-4f34-8425-ece82da38cc9|3021
19c33fdb-3866-4790-9391-3960130a84f3|61575
200f5272-545c-414d-81af-f52f5636996c|887
31b5535a-12f0-4824-b324-b733183a2bc0|25053
548d8264-6d97-4d96-9fc0-f72b5a792175|6
5ab955f0-71f0-498e-9198-23a6c91e03f8|885
5d65132e-d713-4c1d-9a30-1c5e89f76605|170
6ac4344a-3825-4f21-9253-1874da8456fa|25223
91155b2f-cebe-4dd6-a08f-29ad5b637e91|104
e2794000-a1ef-4ca3-87b3-3d7037f0b9b3|109850

root@timebot-dev:/var/lib/timebot/db# sqlite3 ./chromadb/chroma.sqlite3 "SELECT * from segments;"
840bdee2-a526-4af0-b372-09e387dcb698|urn:chroma:segment/vector/hnsw-local-persisted|VECTOR|b3c1eaf1-2ea4-4446-833f-b8e20f79ebaa
e2794000-a1ef-4ca3-87b3-3d7037f0b9b3|urn:chroma:segment/metadata/sqlite|METADATA|b3c1eaf1-2ea4-4446-833f-b8e20f79ebaa
c8c105d5-e8c2-4b0e-b73b-be396910c429|urn:chroma:segment/vector/hnsw-local-persisted|VECTOR|782fc41f-e69f-4020-9191-50cb68db9369
19c33fdb-3866-4790-9391-3960130a84f3|urn:chroma:segment/metadata/sqlite|METADATA|782fc41f-e69f-4020-9191-50cb68db9369
ba19f074-b3f1-44fc-abe5-850f146b3c69|urn:chroma:segment/vector/hnsw-local-persisted|VECTOR|cd08df70-0840-49a3-ac6d-1f389297223b
0fa33bc7-dead-4100-98ec-cedbee3bc6c9|urn:chroma:segment/metadata/sqlite|METADATA|cd08df70-0840-49a3-ac6d-1f389297223b

Sorry for that red herring.

Basically, after I made the two changes to the code the system seems to be working fine for reading from the database.

So to be clear on the metadata: when I open a collection for either reading or writing, should I specify the space as I have been:

collection = client.create_collection(
    name=collection_name,
    embedding_function=embedding_function,
    metadata={"hnsw:space": "cosine", "model": embedding_model_name}
)

Or should I just delete the "metadata" param from all calls?

Thanks! John

n8ur avatar May 29 '25 15:05 n8ur

Hi, yes, you can pass the metadata as you have been. Closing here as completed.

jairad26 avatar Jun 30 '25 21:06 jairad26

@jairad26 Hi, I'm still getting the following error when using the v1.0.15 chromadb server with a database created by 0.6.3:

chromadb.errors.InternalError: Error executing plan: Error sending backfill request to compactor: Error reading from metadata segment reader: error occurred while decoding column 0: mismatched types; Rust type `u64` (as SQL type `INTEGER`) is not compatible with SQL type `BLOB`

when sending a request to the server (for example, collection.count()). The collection exists in the database and was created with 0.6.3:

import chromadb
# the following imports convert a path to a collection name (str)
from vectorcode.cli_utils import expand_path
from vectorcode.common import get_collection_name 


def main():
    client = chromadb.HttpClient()
    collection = client.get_collection(get_collection_name(str(expand_path(".", True))))
    print(collection.count())


main()

Metadata isn't involved in this demo. Could you clarify what I need to do to make this work?

Davidyz avatar Jul 03 '25 04:07 Davidyz