RAG implementation in CrewAI

Open NeethanShetty07 opened this issue 1 year ago • 11 comments

Hi,

Is there an implemented way to link a RAG method with agents?

NeethanShetty07 avatar Feb 13 '24 11:02 NeethanShetty07

You could run off this branch https://github.com/joaomdmoura/crewAI/pull/246 and use a LangChain agent with RAG? :)

ZmeiGorynych avatar Feb 17 '24 12:02 ZmeiGorynych

Similar to issue #18

RAG with tools example here: https://mer.vin/2024/02/crewai-rag-using-tools/

RAG will be one of the first tools on our crew toolkit

-- Joao (@joaomdmoura)

slavakurilyak avatar Feb 18 '24 10:02 slavakurilyak

RAG is now part of crewai-tools

slavakurilyak avatar Feb 21 '24 02:02 slavakurilyak

Hi @slavakurilyak Thanks for the response

NeethanShetty07 avatar Feb 21 '24 04:02 NeethanShetty07

Can we attach a PDF or any other document and do RAG on it using agents?

NeethanShetty07 avatar Feb 21 '24 04:02 NeethanShetty07

Can we attach a PDF or any other document and do RAG on it using agents?

@NeethanShetty07 Yeah, I got it working. Here's a simple example with a .txt file:

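from crewai import Agent, Task
from crewai_tools import TXTSearchTool

# RAG tool that indexes the text file and answers semantic search queries over it
rag_tool = TXTSearchTool(txt='./sample_data/test.txt')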

data_analyst = Agent(
  role='Data Analyst',
  goal='You perfectly know how to analyze any data using provided txt file and searching info via RAG tool',
  backstory='You are data expert',
  verbose=True,
  allow_delegation=False,
  tools=[rag_tool]
)

test_task = Task(
  description="Show me a company name",
  tools=[rag_tool],
  agent=data_analyst
)

test_task.execute()

ilyasudakov avatar Feb 24 '24 09:02 ilyasudakov

But I have an Azure OpenAI key, and TXTSearchTool is asking for an OpenAI key. How do I solve this?

Neethan54 avatar Mar 27 '24 13:03 Neethan54

I am having the exact same problem as Neethan54. Can anyone please let us know how to pass in the Azure OpenAI key for TXTSearchTool?

P7201 avatar Mar 29 '24 04:03 P7201

Bump.

Do tools need OpenAI keys? I'm trying to use PDFSearchTool with Mistral and assumed I'd only need OpenAI keys when using an OpenAI LLM.

SL13PNIR avatar Apr 15 '24 23:04 SL13PNIR

I'm having the same issue as @Neethan54 and @P7201, with Azure OpenAI for PDFSearchTool.

thekizoch avatar May 01 '24 19:05 thekizoch

I'd like to understand the same. How to use Azure OpenAI for PDFSearchTool?

manojselvakumar avatar May 16 '24 18:05 manojselvakumar

Does anyone know how to use Groq and other embeddings with PDFSearchTool?

ItzmeAkash avatar May 28 '24 07:05 ItzmeAkash

It seems embedchain is making changes to their config for embeddings. Does this help get PDFSearchTool working for Azure? Is there an example of how to get the config working with an Azure subscription?

theholymath avatar Jun 17 '24 20:06 theholymath

As I understand it, the search tools do RAG by embedding the document's chunks and the query and comparing them by vector distance, so they require an embedding model. I checked the source code: the CrewAI framework checks whether an embedding_model_config is given and, if not, tries to use OpenAI embeddings by default. To get around this I passed a config dict, as in the example below:

    from crewai_tools import DOCXSearchTool

    rag_tool = DOCXSearchTool(
        docx='C:\\Repos\\myproject\\myfolders\\mydocument.docx',
        # Use a HuggingFace embedding model instead of the OpenAI default
        config={
            "embedding_model": {
                "provider": "huggingface",
                "config": {"model": "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2"},
            }
        },
    )

Later, the CrewAI framework asked me to install the "docx2txt" package, which you can do by running "pip install docx2txt". After that it worked without a problem.

You can swap in another HuggingFace embedding model and it should work.

I also had a similar problem with the memory which I solved similarly: https://github.com/joaomdmoura/crewAI/issues/769#issuecomment-2190041975
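
For readers new to the idea, here is a minimal sketch of the vector-distance comparison that RAG-style search relies on. It is not CrewAI's implementation: the three-dimensional vectors are made-up stand-ins for real embeddings, and the names cosine_similarity, top_chunk, chunks and query are hypothetical. It only shows why an embedding model is needed: both the document chunks and the query must be turned into vectors before they can be compared.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_chunk(query_vec, chunk_vecs):
    """Return the key of the chunk whose vector is most similar to the query."""
    return max(
        chunk_vecs,
        key=lambda name: cosine_similarity(query_vec, chunk_vecs[name]),
    )


# Toy "embeddings" (assumed values); a real embedding model would produce these.
chunks = {
    "company overview": [0.9, 0.1, 0.0],
    "office address": [0.1, 0.9, 0.1],
}
query = [0.8, 0.2, 0.1]  # hypothetical embedding of "Show me a company name"

best = top_chunk(query, chunks)
print(best)
```

With a real model the vectors have hundreds of dimensions, but the comparison step is the same in spirit.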

cbarkinozer avatar Jun 26 '24 20:06 cbarkinozer

@cbarkinozer Thanks for providing the config. It also works with TXTSearchTool:

    # Config fragment; pass it to the tool as TXTSearchTool(..., config=config)
    config = dict(
        llm=dict(
            provider="google",
            config=dict(
                model="gemini-1.5-flash",
            ),
        ),
        embedding_model=dict(
            provider="huggingface",
            config=dict(
                model="sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2",
            ),
        ),
    )
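
Since the same config shape shows up in both comments above, it could be built with a small helper. This is only a sketch: build_rag_config is a hypothetical name, not part of crewai-tools, and the dict layout simply mirrors the "llm" and "embedding_model" keys used in this thread. Provider names other than the ones shown here are not verified.

```python
def build_rag_config(embed_provider, embed_model, llm_provider=None, llm_model=None):
    """Build a tool config dict in the shape used in this thread.

    Hypothetical helper: the keys mirror the examples above and are not
    checked against any library.
    """
    config = {
        "embedding_model": {
            "provider": embed_provider,
            "config": {"model": embed_model},
        }
    }
    # The "llm" section is optional; add it only when both parts are given.
    if llm_provider and llm_model:
        config["llm"] = {
            "provider": llm_provider,
            "config": {"model": llm_model},
        }
    return config


config = build_rag_config(
    "huggingface",
    "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2",
    llm_provider="google",
    llm_model="gemini-1.5-flash",
)
```

The resulting dict would then be passed as the config argument, for example TXTSearchTool(txt='./sample_data/test.txt', config=config).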

suvom24 avatar Jul 22 '24 13:07 suvom24