[BUG] Cannot create 'Knowledge'
Description
I am following the documentation here: https://docs.crewai.com/concepts/knowledge#text-file-knowledge-source
Steps to Reproduce
my_knowledge = Knowledge(
    collection_name="my_knowledge",
    sources=[source1, source2, source3],
)
I assume my sources are valid: when I tried nonexistent files I got a file-not-found error, and with real files that error goes away.
I have tried with both docling and json source classes.
Expected behavior
I would expect no error.
Screenshots/Code snippets
See above.
Operating System
Ubuntu 20.04
Python Version
3.11
crewAI Version
0.95.0
crewAI Tools Version
0.25.8
Virtual Environment
Poetry
Evidence
    my_knowledge = Knowledge(
                   ^^^^^^^^^^
  File "/home/myproject/.venv/lib/python3.11/site-packages/crewai/knowledge/knowledge.py", line 46, in __init__
    source.add()
  File "/home/myproject/.venv/lib/python3.11/site-packages/crewai/knowledge/source/crew_docling_source.py", line 88, in add
    self._save_documents()
  File "/home/myproject/.venv/lib/python3.11/site-packages/crewai/knowledge/source/base_knowledge_source.py", line 50, in _save_documents
    self.storage.save(self.chunks)
  File "/home/myproject/.venv/lib/python3.11/site-packages/crewai/knowledge/storage/knowledge_storage.py", line 161, in save
    self.collection.upsert(
  File "/home/myproject/.venv/lib/python3.11/site-packages/chromadb/api/models/Collection.py", line 334, in upsert
    upsert_request = self._validate_and_prepare_upsert_request(
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/myproject/.venv/lib/python3.11/site-packages/chromadb/api/models/CollectionCommon.py", line 93, in wrapper
    raise type(e)(msg).with_traceback(e.__traceback__)
          ^^^^^^^^^^^^
TypeError: APIStatusError.__init__() missing 2 required keyword-only arguments: 'response' and 'body'
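A note on the last line of this traceback: the TypeError appears to come from the re-raise in CollectionCommon.py, not from the original failure. That wrapper rebuilds the exception as type(e)(msg), which only works when the exception class can be constructed from a single message argument. The sketch below reproduces the pattern with a hypothetical stand-in class (this APIStatusError is defined locally for illustration and is not the provider's real class). If the same thing is happening here, some API error, possibly from the embedding request, is being masked.

```python
class APIStatusError(Exception):
    """Hypothetical stand-in: an exception whose constructor needs keyword-only args."""

    def __init__(self, message, *, response, body):
        super().__init__(message)
        self.response = response
        self.body = body


def rewrap(e):
    # Same re-raise pattern as the line shown in the traceback:
    # rebuild the exception from a message string alone.
    raise type(e)(str(e)).with_traceback(e.__traceback__)


def demo():
    try:
        try:
            # Illustrative original failure (the message is made up).
            raise APIStatusError("request failed", response=None, body=None)
        except Exception as original:
            rewrap(original)
    except TypeError as masked:
        # The constructor call inside rewrap() fails before anything is re-raised,
        # so the caller sees a TypeError instead of the original error.
        return masked


masked_error = demo()
print(type(masked_error).__name__ + ":", masked_error)
```

In this sketch the original error is still reachable as masked_error.__context__, so inspecting __context__ on the TypeError in a debugger may reveal the real cause.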
Possible Solution
None
Additional context
None. Thanks for the help!
@lorenzejay - Is this similar to the issue you were talking about yesterday?
Some more info, in case it's helpful: my AI provider is Azure OpenAI. The docs say it will use the same provider the agents are configured with, but perhaps I need to provide a specific embedder to the crew? I can try this tomorrow.
If you have any ideas I'd be happy to dig in/test them out and report back. Thanks!
@fbomb111 See this issue: https://github.com/crewAIInc/crewAI/issues/769 You have to add an embedder for a non-OpenAI provider.
@rupakg - Thanks for the reply. I added the embedder, but it doesn't make a difference: the code doesn't fail when it gets to the crew, it fails earlier, when trying to create the knowledge. Here's some pseudocode to demonstrate:
- create embedder config
- create knowledge source (failure is here)
- create crew with embedder and knowledge
Here is how I'm creating the embedder:
def _create_embedder(self):
    endpoint = os.getenv("AZUREAI_API_ENDPOINT")
    api_key = os.getenv("AZUREAI_API_KEY")
    embedder_config = {
        "provider": "azure",
        "config": {
            "endpoint": endpoint,
            "api_key": api_key,
            "model": "text-embedding-3-large",
            "api_version": "2023-03-15-preview"
        }
    }
    return embedder_config
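Since os.getenv returns None when a variable is unset, a missing AZUREAI_API_ENDPOINT or AZUREAI_API_KEY would pass silently into this config. A minimal sketch of a fail-fast variant follows. The helper name build_embedder_config and its env parameter are hypothetical, and the dictionary keys are copied from the snippet above rather than checked against crewAI's expected schema.

```python
import os


def build_embedder_config(env=os.environ):
    """Build the embedder dict, raising early if a required variable is missing."""
    required = ("AZUREAI_API_ENDPOINT", "AZUREAI_API_KEY")
    # Treat both unset and empty values as missing.
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise ValueError("Missing environment variables: " + ", ".join(missing))
    return {
        "provider": "azure",
        "config": {
            "endpoint": env["AZUREAI_API_ENDPOINT"],
            "api_key": env["AZUREAI_API_KEY"],
            "model": "text-embedding-3-large",
            "api_version": "2023-03-15-preview",
        },
    }
```

Passing the environment as a parameter keeps the helper easy to exercise with a plain dict, which rules out a missing variable as the cause before looking further.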
And the knowledge:
def _create_knowledge_sources(self):
    mission_source = CrewDoclingSource(
        file_paths=["mission_guidelines.md"]
    )
    theme_source = JSONKnowledgeSource(
        file_paths=["theme.json"]
    )
    return mission_source, theme_source

def _create_knowledge(self):
    mission_source, theme_source = self._create_knowledge_sources()
    research_source = CrewDoclingSource(
        file_paths=["research_results.md"]
    )
    knowledge = Knowledge(
        collection_name="myknowledge",
        sources=[research_source, mission_source, theme_source]
    )
    return knowledge
And the crew:
embedder = self._create_embedder()
knowledge = self._create_knowledge()
crew = MyCrew(
    knowledge_sources=knowledge,
    embedder=embedder
)
It creates the sources okay, but errors when creating the Knowledge.
I have crewAI version 0.95.0.
My knowledge source is defined as follows (I have tried both .txt and .md files):
company_pr_source = TextFileKnowledgeSource(
    file_paths=[
        "test.md"
    ],
)
I have added the embedder_config to my Agent like so:
return Agent(
    config=self.agents_config['pr_specialist'],
    verbose=True,
    embedder_config={
        "provider": "ollama",
        "config": {
            "model": "nomic-embed-text"
        },
    },
    memory=True,
    knowledge_sources=[company_pr_source]  # Agent-specific knowledge
)
I am seeing an exception while creating the Knowledge collection: "Exception: Failed to create or get collection". It is raised at this line: https://github.com/crewAIInc/crewAI/blob/main/src/crewai/knowledge/storage/knowledge_storage.py#L107
I have tried TextFileKnowledgeSource.
The stack trace is below:
  File "<path>/src/crewai_support_agent/crew.py", line 54, in pr_specialist
    return Agent(
           ^^^^^^
  File "<path>/crewai_support_agent/.venv/lib/python3.12/site-packages/pydantic/main.py", line 214, in __init__
    validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<path>/crewai_support_agent/.venv/lib/python3.12/site-packages/crewai/agent.py", line 140, in post_init_setup
    self._set_knowledge()
  File "<path>/crewai_support_agent/.venv/lib/python3.12/site-packages/crewai/agent.py", line 246, in _set_knowledge
    self._knowledge = Knowledge(
                      ^^^^^^^^^^
  File "<path>crewai_support_agent/.venv/lib/python3.12/site-packages/crewai/knowledge/knowledge.py", line 43, in __init__
    self.storage.initialize_knowledge_storage()
  File "<path>/crewai_support_agent/.venv/lib/python3.12/site-packages/crewai/knowledge/storage/knowledge_storage.py", line 107, in initialize_knowledge_storage
    raise Exception("Failed to create or get collection")
Exception: Failed to create or get collection
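The generic Exception here hides whatever failed underneath. One way to look for the underlying error is to walk the __cause__ / __context__ chain of the caught exception. This assumes the generic exception was raised while another one was being handled, which is not confirmed for this code path. The helper name exception_chain is hypothetical, and the root cause in the demo is made up.

```python
def exception_chain(exc):
    """Return the exception and every earlier exception linked to it, newest first."""
    chain = []
    seen = set()
    while exc is not None and id(exc) not in seen:
        seen.add(id(exc))
        chain.append(exc)
        # Prefer an explicit cause (raise ... from ...), else the implicit context.
        exc = exc.__cause__ if exc.__cause__ is not None else exc.__context__
    return chain


def demo():
    try:
        try:
            # Made-up root cause, for illustration only.
            raise ConnectionError("embedding endpoint unreachable")
        except Exception:
            # Same shape as the raise shown in the stack trace above.
            raise Exception("Failed to create or get collection")
    except Exception as outer:
        return exception_chain(outer)


for err in demo():
    print(type(err).__name__ + ":", err)
```

Wrapping the Agent(...) construction in a try/except and printing the chain this way may show the error that preceded the generic message, if one was recorded.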
PS: If I try to use CrewDoclingSource as the knowledge source, I get a different error, as described in https://github.com/crewAIInc/crewAI/pull/1846
If anyone can help, I would greatly appreciate it.
Since I created this issue, the docs have been updated with the proper way to do this. Apparently you no longer create the Knowledge object directly.