[Bug]: Retrieving existing collection ignores custom embedding_function when using ChromaVectorDB

Open yonitjio opened this issue 9 months ago • 0 comments

Describe the bug

Retrieving an existing collection ignores the custom embedding_function when using ChromaVectorDB.

Steps to reproduce

Set up a custom embedding function:

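from chromadb.utils import embedding_functions
from autogen.agentchat.contrib.vectordb.chromadb import ChromaVectorDB

# Custom embedding function served by a local OpenAI-compatible endpoint
embedding_function = embedding_functions.OpenAIEmbeddingFunction(
    api_key="__localai__",
    api_base="http://localhost:8080/v1",
    model_name="bert-minilm-embedding",
)

rag_vector_db = ChromaVectorDB(path="./data", embedding_function=embedding_function)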

admin_helper = RetrieveUserProxyAgent(
    name="admin_helper",
    description="Assistant who has extra content retrieval power for solving difficult problems.",
    human_input_mode="NEVER",
    is_termination_msg=termination_msg,
    default_auto_reply="Reply `TERMINATE` if the task is done.",
    retrieve_config={
        "task": "qa",
        "vector_db": rag_vector_db,
        "collection_name": "odoo_table_context",
        "docs_path": "./odoo-table-context.jsonl",
        "get_or_create": True,
    },
    max_consecutive_auto_reply=3,
    code_execution_config=False,
)

Run it twice.

The first run works as expected. [Screenshot: chromadb py - autogen - Visual Studio Code 29_04_2024 08_01_42]

The second run uses the default embedding function instead of the custom one. [Screenshot: chromadb py - autogen - Visual Studio Code 29_04_2024 08_04_25]

Model Used

No response

Expected Behavior

It should always use the provided embedding function.

Screenshots and logs

No response

Additional Information

This line should pass the custom embedding function: https://github.com/microsoft/autogen/blob/a9171211c7533fc11e078899720a3847f45807cc/autogen/agentchat/contrib/vectordb/chromadb.py#L129

self.active_collection = self.client.get_collection(collection_name, embedding_function=self.embedding_function)
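
To make the failure mode concrete, here is a minimal toy model of the behaviour, not the real chromadb or autogen code. All names in it (ToyClient, ToyCollection, ToyVectorDB, forward_on_get, the two embedder functions) are hypothetical. It assumes the client stores only collection names and attaches whatever embedding function the caller passes on each lookup, falling back to a default when none is given, which matches what the two runs above show.

```python
from typing import Callable, List, Set

Embedder = Callable[[List[str]], List[List[float]]]


def default_embedder(texts: List[str]) -> List[List[float]]:
    """Stand-in for the client's built-in default embedding function."""
    return [[0.0] for _ in texts]


def custom_embedder(texts: List[str]) -> List[List[float]]:
    """Stand-in for the user-supplied embedding function."""
    return [[float(len(t))] for t in texts]


class ToyCollection:
    def __init__(self, name: str, embedding_function: Embedder) -> None:
        self.name = name
        self.embedding_function = embedding_function


class ToyClient:
    """Hypothetical client (assumption): persists names only, so the
    embedding function has to be supplied again on every lookup."""

    def __init__(self) -> None:
        self._names: Set[str] = set()

    def create_collection(self, name: str, embedding_function: Embedder = default_embedder) -> ToyCollection:
        self._names.add(name)
        return ToyCollection(name, embedding_function)

    def get_collection(self, name: str, embedding_function: Embedder = default_embedder) -> ToyCollection:
        if name not in self._names:
            raise ValueError(f"Collection {name} does not exist")
        return ToyCollection(name, embedding_function)


class ToyVectorDB:
    """Hypothetical wrapper; forward_on_get switches between the reported
    behaviour (False) and the proposed fix (True)."""

    def __init__(self, client: ToyClient, embedding_function: Embedder, forward_on_get: bool) -> None:
        self.client = client
        self.embedding_function = embedding_function
        self.forward_on_get = forward_on_get

    def get_or_create(self, name: str) -> ToyCollection:
        try:
            if self.forward_on_get:
                # Proposed fix: pass the custom function on the "get" path too.
                return self.client.get_collection(name, embedding_function=self.embedding_function)
            # Reported behaviour: the custom function is dropped here.
            return self.client.get_collection(name)
        except ValueError:
            # The collection does not exist yet, so create it with the custom function.
            return self.client.create_collection(name, embedding_function=self.embedding_function)


client = ToyClient()  # plays the role of the persisted ./data store

buggy = ToyVectorDB(client, custom_embedder, forward_on_get=False)
first_run = buggy.get_or_create("odoo_table_context")   # collection is created
second_run = buggy.get_or_create("odoo_table_context")  # collection is retrieved

fixed = ToyVectorDB(client, custom_embedder, forward_on_get=True)
fixed_run = fixed.get_or_create("odoo_table_context")   # collection is retrieved

print(first_run.embedding_function.__name__)
print(second_run.embedding_function.__name__)
print(fixed_run.embedding_function.__name__)
```

In this sketch only the retrieval path without the forwarded argument ends up with the default embedder, which is the asymmetry between the first and second run described above.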

yonitjio, Apr 29 '24 01:04