Collection name validation error message should mention lowercase or permit uppercase
When calling create_collection with the name set to OpenAIEmbeddings, I get the following traceback message:
ERROR: Traceback (most recent call last):
  [...]
  File "[...]/chromadb/api/local.py", line 60, in create_collection
    check_index_name(name)
  File "[...]/chromadb/api/local.py", line 36, in check_index_name
    raise ValueError(msg)
ValueError: Expected collection name that (1) contains 3-63 characters, (2) starts and ends with an alphanumeric character, (3) otherwise contains only alphanumeric characters, underscores or hyphens (-), (4) contains no two consecutive periods (..) and (5) is not a valid IPv4 address, got OpenAIEmbeddings
When I change my collection name from OpenAIEmbeddings to openaiembeddings, the error is successfully resolved.
However, the error message does not mention that the collection name must be lowercase, only that it must be alphanumeric. The error message text should either (1) specifically call out that it expects collection names to be lowercase alphanumeric, or (2) start permitting uppercase too :)
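For anyone hitting the same thing, here is a minimal sketch of a check that would behave the way I'm seeing. This is not the actual chromadb code: the function name `is_valid_collection_name` and both patterns are my own assumptions, inferred from the error message and from the fact that lowercasing the name fixes the error.

```python
import re

# Assumed pattern (not taken from chromadb): lowercase letters or digits at
# both ends, with lowercase letters, digits, periods, underscores, or hyphens
# in between. Note there is no A-Z range, which is the unstated rule.
_NAME_PATTERN = re.compile(r"[a-z0-9][a-z0-9._-]*[a-z0-9]")

# Loose "looks like an IPv4 address" pattern: four dot-separated digit groups.
_IPV4_PATTERN = re.compile(r"[0-9]{1,3}(\.[0-9]{1,3}){3}")


def is_valid_collection_name(name: str) -> bool:
    """Hypothetical validator mirroring the five rules in the error message."""
    if not 3 <= len(name) <= 63:                    # rule (1)
        return False
    if _NAME_PATTERN.fullmatch(name) is None:       # rules (2) and (3)
        return False
    if ".." in name:                                # rule (4)
        return False
    if _IPV4_PATTERN.fullmatch(name) is not None:   # rule (5)
        return False
    return True


print(is_valid_collection_name("OpenAIEmbeddings"))  # False
print(is_valid_collection_name("openaiembeddings"))  # True
```

If the real check looks anything like this, either adding `A-Z` to the character classes or adding the word "lowercase" to the message would resolve the mismatch.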
Hi @csvoss, thanks for the report!
You're absolutely right. We're currently working on a refactor that (among other things) will significantly relax and clarify naming standards: under the new approach, any string that would be valid as a URI path segment will be valid as a topic name (and will also be scoped by owner in our future multi-tenant SaaS offering).
A PR including these more relaxed naming conventions will most likely be merged within the week.
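To give a rough idea of what "valid as a URI path segment" could mean in practice, here is a sketch based on the `pchar` grammar from RFC 3986. The function name `is_valid_path_segment` is hypothetical, and the check that actually ships may well differ from this.

```python
import re

# RFC 3986: a non-empty path segment is one or more pchar, where
#   pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
# unreserved = letters, digits, "-", ".", "_", "~"
# sub-delims = "!", "$", "&", "'", "(", ")", "*", "+", ",", ";", "="
_SEGMENT_PATTERN = re.compile(
    r"(?:[A-Za-z0-9\-._~!$&'()*+,;=:@]|%[0-9A-Fa-f]{2})+"
)


def is_valid_path_segment(name: str) -> bool:
    """Hypothetical check: True if name is a non-empty RFC 3986 path segment."""
    return _SEGMENT_PATTERN.fullmatch(name) is not None


print(is_valid_path_segment("OpenAIEmbeddings"))  # True
print(is_valid_path_segment("a/b"))               # False
```

Under a rule like this, uppercase names such as OpenAIEmbeddings would pass, while names containing a slash or a raw space would not.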
The rules have been relaxed: https://github.com/chroma-core/chroma/blob/c3b397dfc469807b2fb77c579c8c5890b707dcc2/chromadb/api/local.py#L35