Add Elasticsearch integration for RAG storage

Open devin-ai-integration[bot] opened this issue 8 months ago • 2 comments

Elasticsearch Integration for RAG Storage

This PR adds support for using Elasticsearch as an alternative to ChromaDB for retrieval-augmented generation (RAG) storage in CrewAI, so that users can rely on Elasticsearch's search capabilities and scalability for their agents' memory and knowledge.

Changes

  • Added ElasticsearchStorage class for memory storage
  • Added ElasticsearchKnowledgeStorage class for knowledge storage
  • Created a storage factory to make it easy to switch between storage backends
  • Updated memory and knowledge classes to support Elasticsearch
  • Added tests for Elasticsearch integration
  • Added documentation for Elasticsearch integration
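
The storage factory mentioned in the list above is not shown in this description. As a rough illustration of the idea only, a provider-name registry could look like the sketch below. `StorageFactory`, `register`, `create`, and `InMemoryStorage` are hypothetical names chosen for this example and are not taken from the PR.

```python
from typing import Callable, Dict


class InMemoryStorage:
    """Stand-in backend (hypothetical) used only to keep this sketch runnable."""

    def __init__(self, **config):
        self.config = config


class StorageFactory:
    """Maps a provider name to a callable that builds a storage backend."""

    _registry: Dict[str, Callable[..., object]] = {}

    @classmethod
    def register(cls, provider: str, builder: Callable[..., object]) -> None:
        # Provider names are stored lower-cased so lookups are case-insensitive.
        cls._registry[provider.lower()] = builder

    @classmethod
    def create(cls, provider: str, **config) -> object:
        try:
            builder = cls._registry[provider.lower()]
        except KeyError:
            known = ", ".join(sorted(cls._registry)) or "none"
            raise ValueError(
                f"Unknown storage provider {provider!r}; registered: {known}"
            ) from None
        return builder(**config)


# An Elasticsearch-backed class would be registered the same way.
StorageFactory.register("in_memory", InMemoryStorage)
storage = StorageFactory.create("in_memory", host="localhost", port=9200)
```

With a registry like this, switching backends comes down to changing the provider string, which matches the `"provider": "elasticsearch"` style of configuration used in this PR.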

How to Use

Memory Storage

from crewai import Crew

# `agent` and `task` are assumed to be defined elsewhere.
crew = Crew(
    agents=[agent],
    tasks=[task],
    memory_config={
        "provider": "elasticsearch",
        "host": "localhost",  # Optional
        "port": 9200,         # Optional
        "username": "user",   # Optional
        "password": "pass",   # Optional
    },
)
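
The description does not say how the optional keys are turned into a connection. One plausible reading, sketched below, is that missing keys fall back to defaults and that credentials are only passed when both are present. `build_connection_settings` is a hypothetical helper, and the `localhost`/`9200` defaults and the `http` scheme are assumptions. The output keys `hosts` and `basic_auth` follow the keyword arguments of the 8.x `elasticsearch` Python client, which the PR may or may not use.

```python
from typing import Any, Dict


def build_connection_settings(memory_config: Dict[str, Any]) -> Dict[str, Any]:
    """Translate a memory_config dict into client keyword arguments (sketch)."""
    host = memory_config.get("host", "localhost")  # assumed default
    port = memory_config.get("port", 9200)         # assumed default
    # The "http" scheme is an assumption; a secured cluster would use "https".
    settings: Dict[str, Any] = {"hosts": [f"http://{host}:{port}"]}

    username = memory_config.get("username")
    password = memory_config.get("password")
    # Only send credentials when both halves are configured.
    if username is not None and password is not None:
        settings["basic_auth"] = (username, password)
    return settings


settings = build_connection_settings({"provider": "elasticsearch"})
```

The resulting dict could then be unpacked into the client constructor, for example `Elasticsearch(**settings)` with the 8.x client.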

Knowledge Storage

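from crewai.knowledge.knowledge import Knowledge

# `source` is assumed to be a knowledge source defined elsewhere.
knowledge = Knowledge(
    collection_name="test",
    sources=[source],
    storage_provider="elasticsearch",
)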

Testing

The implementation is covered by unit tests and integration tests, which can be run with:

RUN_ELASTICSEARCH_TESTS=true pytest tests/memory/elasticsearch_storage_test.py tests/knowledge/elasticsearch_knowledge_storage_test.py tests/integration/elasticsearch_integration_test.py
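
The `RUN_ELASTICSEARCH_TESTS=true` prefix suggests the tests are skipped unless that variable is set, although the test files themselves are not shown here. A minimal sketch of such a gate follows. It uses the standard library's `unittest.skipUnless`, which pytest also honors. The helper `elasticsearch_tests_enabled` and the test class are hypothetical.

```python
import os
import unittest


def elasticsearch_tests_enabled() -> bool:
    """Return True when the opt-in environment variable is set to "true"."""
    return os.environ.get("RUN_ELASTICSEARCH_TESTS", "").lower() == "true"


@unittest.skipUnless(
    elasticsearch_tests_enabled(),
    "set RUN_ELASTICSEARCH_TESTS=true to run Elasticsearch tests",
)
class ElasticsearchStorageTest(unittest.TestCase):
    def test_placeholder(self):
        # A real test would talk to a running Elasticsearch instance here.
        self.assertTrue(elasticsearch_tests_enabled())
```

Gating on an environment variable keeps the default test run independent of an Elasticsearch server.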

Fixes #2671

Link to Devin run: https://app.devin.ai/sessions/16f5a16622f74eaebce48df6a8a348d5
Requested by: Joe Moura

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

The configuration should also include the Elasticsearch index name, since it determines where the data is stored. If the specified index does not exist, it should be created automatically.

HarikrishnanK9, Apr 23 '25 06:04
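
The create-if-missing behavior requested in the comment above could be sketched as follows. This is an illustration, not code from the PR. `ensure_index` is a hypothetical helper, and the `client` argument is assumed to behave like the 8.x `elasticsearch` Python client, whose `indices.exists(index=...)` and `indices.create(index=..., mappings=...)` calls are used here.

```python
from typing import Any, Dict, Optional


def ensure_index(
    client: Any,
    index_name: str,
    mappings: Optional[Dict[str, Any]] = None,
) -> bool:
    """Create index_name if it is missing; return True when it was created."""
    # exists() returns a truthy response when the index is already present.
    if client.indices.exists(index=index_name):
        return False
    client.indices.create(index=index_name, mappings=mappings)
    return True
```

A storage class could call `ensure_index(client, index_name)` once during initialization, with `index_name` taken from the user's configuration.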

Closing due to inactivity for more than 7 days.