
[BUG] Cannot create 'Knowledge'

Open fbomb111 opened this issue 11 months ago • 2 comments

Description

I am following the documentation here: https://docs.crewai.com/concepts/knowledge#text-file-knowledge-source

Steps to Reproduce

my_knowledge = Knowledge(
	collection_name="my_knowledge",
	sources=[source1, source2, source3]
)

I assume that my sources are valid: when I tried with fake files I got a file-not-found error, and when I use real files that error goes away.

I have tried with both docling and json source classes.

Expected behavior

I would expect no error.

Screenshots/Code snippets

See above.

Operating System

Ubuntu 20.04

Python Version

3.11

crewAI Version

0.95.0

crewAI Tools Version

0.25.8

Virtual Environment

Poetry

Evidence

    my_knowledge = Knowledge(
                   ^^^^^^^^^^
  File "/home/myproject/.venv/lib/python3.11/site-packages/crewai/knowledge/knowledge.py", line 46, in __init__
    source.add()
  File "/home/myproject/.venv/lib/python3.11/site-packages/crewai/knowledge/source/crew_docling_source.py", line 88, in add
    self._save_documents()
  File "/home/myproject/.venv/lib/python3.11/site-packages/crewai/knowledge/source/base_knowledge_source.py", line 50, in _save_documents
    self.storage.save(self.chunks)
  File "/home/myproject/.venv/lib/python3.11/site-packages/crewai/knowledge/storage/knowledge_storage.py", line 161, in save
    self.collection.upsert(
  File "/home/myproject/.venv/lib/python3.11/site-packages/chromadb/api/models/Collection.py", line 334, in upsert
    upsert_request = self._validate_and_prepare_upsert_request(
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/myproject/.venv/lib/python3.11/site-packages/chromadb/api/models/CollectionCommon.py", line 93, in wrapper
    raise type(e)(msg).with_traceback(e.__traceback__)
          ^^^^^^^^^^^^
TypeError: APIStatusError.__init__() missing 2 required keyword-only arguments: 'response' and 'body'
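
The last frame of this traceback re-raises with type(e)(msg), which only works when the exception type can be built from a single message. The TypeError therefore likely hides an earlier error rather than being the real failure. A minimal, standard-library-only sketch of that effect follows. StatusError, add_context, embed_and_upsert and root_cause are hypothetical stand-ins, not crewAI, chromadb or provider SDK code, and the "401 Unauthorized" message is invented for illustration.

```python
import functools


class StatusError(Exception):
    """Stand-in for an SDK error whose constructor needs keyword-only args."""

    def __init__(self, message, *, response, body):
        super().__init__(message)
        self.response = response
        self.body = body


def add_context(func):
    """Re-raise any error with extra text, mirroring the traceback's last frame."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception as e:
            msg = f"{e} in upsert."
            # Rebuilding the exception from one message only works when the
            # exception type accepts a single positional argument.
            raise type(e)(msg).with_traceback(e.__traceback__)

    return wrapper


@add_context
def embed_and_upsert():
    # Hypothetical failure: the provider rejects the embedding request.
    raise StatusError("401 Unauthorized", response=None, body=None)


def root_cause(exc):
    """Follow __cause__/__context__ links back to the first exception raised."""
    while True:
        earlier = exc.__cause__ or exc.__context__
        if earlier is None:
            return exc
        exc = earlier


try:
    embed_and_upsert()
except TypeError as masked:
    masked_message = str(masked)
    original = root_cause(masked)
    print(type(masked).__name__, "masks", type(original).__name__, "-", original)
```

In the real failure, the full traceback (the part above "During handling of the above exception, another exception occurred") should show the underlying provider error in the same way.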

Possible Solution

None

Additional context

None. Thanks for the help!

fbomb111 avatar Jan 06 '25 19:01 fbomb111

@lorenzejay - Is this similar to the issue you were talking about yesterday?

bhancockio avatar Jan 07 '25 18:01 bhancockio

Some more info, in case it's helpful: my AI provider is Azure OpenAI. The docs say knowledge will use the same provider that the agents are configured with, but perhaps I need to provide a specific embedder to the crew? I can try this tomorrow.

If you have any ideas I'd be happy to dig in/test them out and report back. Thanks!

fbomb111 avatar Jan 08 '25 04:01 fbomb111

@fbomb111 See this issue: https://github.com/crewAIInc/crewAI/issues/769 You have to add an embedder for a non-OpenAI provider.

rupakg avatar Jan 10 '25 18:01 rupakg

@rupakg - Thanks for the reply. I added the embedder, but it doesn't make a difference: the code doesn't fail when it gets to the crew, it fails earlier, when trying to create the knowledge. Here's some pseudocode to demonstrate:

  1. create embedder config
  2. create knowledge sources, then the Knowledge object (failure is here, in Knowledge)
  3. create crew with embedder and knowledge

Here is how I'm creating the embedder:

def _create_embedder(self):
	endpoint = os.getenv("AZUREAI_API_ENDPOINT")
	api_key = os.getenv("AZUREAI_API_KEY")
	embedder_config = {
		"provider": "azure",
		"config": {
			"endpoint": endpoint,
			"api_key": api_key,
			"model": "text-embedding-3-large", 
			"api_version": "2023-03-15-preview"
		}
	}
	return embedder_config
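
Because os.getenv returns None for an unset variable, a missing AZUREAI_API_ENDPOINT or AZUREAI_API_KEY would pass silently into the config above. Here is a minimal sketch of a variant that fails early instead. The function name build_azure_embedder_config and its env parameter are hypothetical, and the example values are placeholders. This is only a sanity check, not a confirmed cause of the error in this issue.

```python
import os


def build_azure_embedder_config(env=None):
    """Build the embedder dict, failing early if a required setting is missing."""
    env = os.environ if env is None else env
    required = ("AZUREAI_API_ENDPOINT", "AZUREAI_API_KEY")
    # Treat unset and empty values the same way.
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise ValueError(f"Missing environment variables: {', '.join(missing)}")
    return {
        "provider": "azure",
        "config": {
            "endpoint": env["AZUREAI_API_ENDPOINT"],
            "api_key": env["AZUREAI_API_KEY"],
            "model": "text-embedding-3-large",
            "api_version": "2023-03-15-preview",
        },
    }


# Usage with placeholder values instead of real credentials.
fake_env = {
    "AZUREAI_API_ENDPOINT": "https://example.invalid",
    "AZUREAI_API_KEY": "placeholder",
}
config = build_azure_embedder_config(fake_env)
print(config["provider"], config["config"]["model"])
```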

And the knowledge:


def _create_knowledge_sources(self):
	mission_source = CrewDoclingSource(
		file_paths=["mission_guidelines.md"]
	)

	theme_source = JSONKnowledgeSource(
		file_paths=["theme.json"]
	)
	return mission_source, theme_source
	
def _create_knowledge(self):
	mission_source, theme_source = self._create_knowledge_sources()
	research_source = CrewDoclingSource(
		file_paths=["research_results.md"]
	)

	knowledge = Knowledge(
		collection_name="myknowledge",
		sources=[research_source, mission_source, theme_source]
	)

	return knowledge

And the crew:

embedder = self._create_embedder()
knowledge = self._create_knowledge()
crew = MyCrew(
	knowledge_sources=knowledge,
	embedder=embedder
)

It creates the sources okay, but errors when creating the Knowledge.

fbomb111 avatar Jan 18 '25 19:01 fbomb111

I have crewAI version 0.95.0.

I have my knowledge source as follows: (I have tried both .txt and .md files)

company_pr_source = TextFileKnowledgeSource(
	file_paths=[
		"test.md"
	],
)

I have added the embedder_config to my Agent like so:

return Agent(
	config=self.agents_config['pr_specialist'],
	verbose=True,
	embedder_config={
		"provider": "ollama",
		"config": {
			"model": "nomic-embed-text"
		},
	},
	memory=True,
	knowledge_sources=[company_pr_source]  # Agent-specific knowledge
)

I am seeing an exception while creating the Knowledge collection ("Exception: Failed to create or get collection"). It is raised at this line: https://github.com/crewAIInc/crewAI/blob/main/src/crewai/knowledge/storage/knowledge_storage.py#L107

I have tried TextFileKnowledgeSource.

The stack trace is below:

  File "<path>/src/crewai_support_agent/crew.py", line 54, in pr_specialist
    return Agent(
           ^^^^^^
  File "<path>/crewai_support_agent/.venv/lib/python3.12/site-packages/pydantic/main.py", line 214, in __init__
    validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<path>/crewai_support_agent/.venv/lib/python3.12/site-packages/crewai/agent.py", line 140, in post_init_setup
    self._set_knowledge()
  File "<path>/crewai_support_agent/.venv/lib/python3.12/site-packages/crewai/agent.py", line 246, in _set_knowledge
    self._knowledge = Knowledge(
                      ^^^^^^^^^^
  File "<path>crewai_support_agent/.venv/lib/python3.12/site-packages/crewai/knowledge/knowledge.py", line 43, in __init__
    self.storage.initialize_knowledge_storage()
  File "<path>/crewai_support_agent/.venv/lib/python3.12/site-packages/crewai/knowledge/storage/knowledge_storage.py", line 107, in initialize_knowledge_storage
    raise Exception("Failed to create or get collection")
Exception: Failed to create or get collection

PS: If I try to use CrewDoclingSource as knowledge source, then I get a different error as described in https://github.com/crewAIInc/crewAI/pull/1846

If anyone can help, I would greatly appreciate it.

rupakg avatar Jan 19 '25 21:01 rupakg

Since I created this issue, the docs have been updated with the proper way to do this. Apparently you no longer create the Knowledge object directly.

fbomb111 avatar Jan 26 '25 19:01 fbomb111