crewAI
[BUG] RagTool `_run` signature with `**kwargs` causes `args_schema` mismatch and runtime errors
Description
When an agent attempts to use the RagTool, it often fails during the tool invocation step. The root cause appears to be a mismatch between the RagTool._run method's signature, which includes **kwargs: Any, and the args_schema automatically generated by the BaseTool class.
The BaseTool's schema generation correctly identifies named arguments like query: str but ignores catch-all **kwargs. Consequently, the tool's description provided to the agent's LLM only lists query as an argument.
When the LLM decides to use the tool, it provides input like {"query": "some question"}. The CrewAI agent execution framework then tries to map this input to the _run method based on the generated args_schema. Because kwargs is not part of the schema, the framework's internal calling mechanism fails when attempting to invoke _run, leading to runtime errors related to argument mismatches, depending on the exact invocation logic.
Interestingly, calling rag_tool_instance.run(query="...") directly works fine, as Python implicitly handles the absent keyword arguments by setting kwargs to {} within _run. However, the framework's stricter, schema-driven invocation process does not replicate this behavior and fails.
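To illustrate the direct-call behavior described above, here is a minimal standalone sketch. It uses plain Python with no CrewAI imports, and the `_run` function below is a stand-in written for this example, not the real RagTool._run. It shows that a direct call leaves kwargs as an empty dict, and that signature introspection still reports the catch-all parameter as its own kind.

```python
import inspect
from typing import Any


def _run(query: str, **kwargs: Any) -> str:
    # Stand-in for a tool's _run method (assumption: illustration only).
    # When the caller passes only `query`, Python binds kwargs to {}.
    return f"query={query!r}, kwargs={kwargs!r}"


# Direct call with only the named argument succeeds.
direct_result = _run(query="How to rerank?")

# Introspection distinguishes the named argument from the catch-all.
kinds = {
    name: param.kind.name
    for name, param in inspect.signature(_run).parameters.items()
}
```

Any schema generator that walks the signature has to decide what to do with the VAR_KEYWORD entry; that decision is where a direct call and a schema-driven call can diverge.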
Steps to Reproduce
from crewai import LLM, Agent, Task, Crew, Process
from crewai_tools import RagTool
import os
# Satisfy both LiteLLM and Embedchain
os.environ["GEMINI_API_KEY"] = "YOUR_KEY"
os.environ["GOOGLE_API_KEY"] = os.environ["GEMINI_API_KEY"]
embedchain_config = {
    "embedder": {
        "provider": "google",
        "config": {
            "model": "models/text-embedding-004",
            "task_type": "RETRIEVAL_DOCUMENT"
        }
    }
}
rag_tool = RagTool(
    config=embedchain_config,
    summarize=False
)
rag_tool.add("https://www.anthropic.com/news/contextual-retrieval")
#
# 1 - Test if `RagTool.run()` works standalone
#
user_question = "How to rerank?"
relevant_chunks = rag_tool.run(user_question)
print("--- RagTool.run() Result ---")
print(relevant_chunks)
print("----------------------------")
#
# 2 - Test if an Agent configured with `RagTool` works
#
llm = LLM(
    model="gemini/gemini-2.5-flash-preview-04-17",
    temperature=0.1
)
knowledge_assistant = Agent(
    role="Knowledge Assistant",
    goal=(
        "Answer questions accurately using only the provided knowledge "
        "tools. Provide clear and complete explanations. If information "
        "is not found, clearly indicate this."
    ),
    backstory=(
        "I am a dedicated knowledge assistant who always consults reliable "
        "sources before answering. I prize accuracy and clarity, and I "
        "never fabricate information. If my tools don't provide the "
        "necessary data, I will acknowledge this limitation."
    ),
    llm=llm,
    tools=[rag_tool],
    verbose=True,
    allow_delegation=False
)
answer_task = Task(
    description=(
        "Answer the following question:\n'{question}'\n"
        "Use available tools to search the knowledge base. Evaluate sources "
        "after each query and make comprehensive searches to gather all "
        "relevant information for a complete response.\n"
        "Your answer must be:\n"
        "1. Entirely based on information found in the knowledge base.\n"
        "2. Explained clearly and completely.\n"
        "3. If no relevant information is found, explicitly state that you "
        "cannot answer based on the available sources."
    ),
    expected_output=(
        "A complete and accurate response based solely on RAG tool data, "
        "or a clear statement that the information was not found."
    ),
    agent=knowledge_assistant
)
knowledge_crew = Crew(
    agents=[knowledge_assistant],
    tasks=[answer_task],
    process=Process.sequential,
    verbose=True
)
result = knowledge_crew.kickoff(
    inputs={"question": user_question}
)
print("\n--- Knowledge Assistant's Response ---")
print(result.raw)
print("--------------------------------------")
Expected behavior
An Agent configured with RagTool should be able to successfully execute the tool when the LLM provides the required query argument. The tool should query the underlying RAG adapter and return the relevant content without runtime errors related to method signature mismatches during invocation.
Screenshots/Code snippets
Code snippets are included in the 'Steps to Reproduce' section. Error logs/screenshots will be provided in the 'Evidence' section.
Operating System
Ubuntu 24.04
Python Version
3.12
crewAI Version
0.114.0
crewAI Tools Version
0.40.1
Virtual Environment
Venv
Evidence
Possible Solution
The most direct solution is to align the RagTool._run signature with the arguments recognized by the args_schema generation mechanism. This involves removing the **kwargs parameter from _run if it's not essential for the core functionality invoked via the agent framework.
The _before_run method should also be updated similarly if it only receives **kwargs passed down from _run.
In crewai_tools/tools/rag/rag_tool.py:
class RagTool(BaseTool):
    # ... other code ...

    def _run(
        self,
        query: str,
        **kwargs: Any,  # <- REMOVE this parameter
    ) -> Any:
        self._before_run(query=query)  # <- Pass only query
        return f"Relevant Content:\n{self.adapter.query(query)}"

    def _before_run(
        self,
        query: str,
        **kwargs: Any  # <- REMOVE this parameter
    ) -> None:
        pass
This change makes the tool's signature directly match what the framework expects based on the schema, resolving the invocation error.
From a design perspective, this simplification aligns RagTool better with its role as a usable tool. While **kwargs can offer flexibility, its use in a base method signature conflicts with the schema generation in BaseTool, making the tool unusable in the standard agent flow. If specialized RAG tools need additional, specific parameters, they can define them explicitly in their own _run methods and corresponding args_schema, rather than relying on a **kwargs in the base class that breaks the framework's assumptions. This promotes clearer interfaces and adheres more closely to the principle that derived classes should predictably extend, not fundamentally alter, the contract implied by the base schema.
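As a rough sketch of that last point, a specialized tool can name its extra parameter explicitly instead of relying on **kwargs. Everything below is hypothetical and standalone: `StubAdapter`, `SketchRagTool`, `SketchFilteredRagTool`, and the `source` parameter are invented for illustration and are not part of crewai-tools.

```python
from typing import Any


class StubAdapter:
    """Hypothetical stand-in for the RAG adapter; it just echoes the query."""

    def query(self, question: str) -> str:
        return f"chunks for {question!r}"


class SketchRagTool:
    """Illustrative base: _run takes only the explicitly named argument."""

    def __init__(self) -> None:
        self.adapter = StubAdapter()

    def _run(self, query: str) -> Any:
        return f"Relevant Content:\n{self.adapter.query(query)}"


class SketchFilteredRagTool(SketchRagTool):
    """Illustrative specialization: the extra parameter is named, not hidden."""

    def _run(self, query: str, source: str = "all") -> Any:
        # The default keeps the subclass callable exactly like the base class.
        base = super()._run(query)
        return f"{base}\n(source filter: {source})"
```

Because `source` has a default, callers that only know the base contract (`query`) still work, while a schema built from the subclass signature can list `source` with a real type.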
Additional context
This issue might also be relevant for other tools within crewai-tools that might use **kwargs in their _run methods without a corresponding mechanism to include them in the args_schema. The proposed fix simplifies the base RagTool and ensures its compatibility with CrewAI's core agent execution loop.
Hi @mouramax
"Consequently, the tool's description provided to the agent's LLM only lists query as an argument."
I just checked rag_tool.description, and here is what I got in response:
"Tool Name: Knowledge base\nTool Arguments: {'query': {'description': None, 'type': 'str'}, 'kwargs': {'description': None, 'type': 'Any'}}\nTool Description: Tool Name: Knowledge base\nTool Arguments: {'query': {'description': None, 'type': 'str'}, 'kwargs': {'description': None, 'type': 'Any'}}\nTool Description: Tool Name: Knowledge base\nTool Arguments: {'query': {'description': None, 'type': 'str'}, 'kwargs': {'description': None, 'type': 'Any'}}\nTool Description: A knowledge base that can be used to answer questions."
"When the LLM decides to use the tool, it provides input like {"query": "some question"}. The CrewAI agent execution framework then tries to map this input to the _run method based on the generated args_schema. Because kwargs is not part of the schema, the framework's internal calling mechanism fails when attempting to invoke _run, leading to runtime errors related to argument mismatches, depending on the exact invocation logic."
I am not sure I understand this correctly: as far as I can tell, typing kwargs as Any should work completely fine.
If I am reading the image correctly, is kwargs being passed as {}}}?
hey @mouramax nice to see you again (:
thank you for reporting this, you are 1000% right. I'm fixing it right now and will try to prevent any tools with this signature from showing up in the future
Hey @lucasgomide, great to see you again too!
Thanks for your prompt response, and I must admit that, based on what @Vidit-Ostwal presented, it seems kwargs is indeed being displayed in the args_schema. So, in this case, the LLM is just getting confused and doesn't know what to do with 'kwargs': {'description': None, 'type': 'Any'}.
This explains why, after several attempts, the code manages to run. At some point, the LLM tries to do something with kwargs and moves forward. But clearly, everything else applies: it really doesn't make sense to have **kwargs: Any.
Since you're going to handle this issue, I'd like to add that you should consider evaluating whether it's worth detecting the presence of arguments of type Any during the BaseTool's schema generation and logging at least a warning message alerting about the use of the Any type. This kind of error is hard to debug because the error message generated by Pydantic isn't very clear.
The final parsed tool looks like that
CrewStructuredTool(name='Knowledge base', description='Tool Name: Knowledge base
Tool Arguments: {'query': {'description': None, 'type': 'str'}, 'kwargs': {'description': None, 'type': 'Any'}}
Tool Description: A knowledge base that can be used to answer questions.')
We extract the Tool Arguments from the schema plus the run method signature. That would be another discussion point, though.
@lucasgomide
what if we change the _generate_description() method to ignore any parameter with Any or specifically kwargs?
and let the _run method signature remain the same.
In this case this should be the description
CrewStructuredTool(name='Knowledge base', description='Tool Name: Knowledge base
Tool Arguments: {'query': {'description': None, 'type': 'str'}}
Tool Description: A knowledge base that can be used to answer questions.')
@Vidit-Ostwal, just ignoring any parameter of Any type, or specifically kwargs, could lead to some unexpected behavior in certain tools, don't you think?
Look, I'll stick to what I said earlier: if these are going to be ignored at the source, there should at least be a warning message generated. I definitely believe a framework should make an effort to be verbose, especially when the underlying libraries aren't. It makes the whole debugging process so much better.
So, I agree with handling it at the base level, like you suggested. But I'd really love it if this was explicitly communicated, you know, to make things clearer for the user.
I have a somewhat mixed opinion on this.
If the tool is not using kwargs and we don't specify anything, I think it would work completely fine, much like how your rag_tool_instance.run(query="...") call works.
But if a tool specifically uses **kwargs and we completely ignore it on the description side, then it would definitely cause an issue, so yes, the approach I suggested doesn't work.
Definitely, a warning message needs to be delivered to the user!
I believe that if we removed the **kwargs argument from all the tools, this would be resolved.
Anyone who wants to use additional arguments can just extend the BaseTool class and write their own implementation.
Also, if we are making changes, can we try to resolve this issue as well: #2606
hey folks! just merged one of the first PRs to address the issue of Tool being defined with **kwargs
Looking forward to getting more PRs like this in the coming days
Should be available in the next cut also
Amazing job, @lucasgomide!
Awesome! @lucasgomide
I think the bigger plan would be to remove **kwargs from all the tools over time. Let me know if I can collaborate on some, happy to jump in.
@Vidit-Ostwal It is! I'm planning a couple of types of PRs:
- In an ideal world, we'd be able to add runtime validations to prevent a Tool from being defined with **kwargs. For now, I'm just trying to catch it during code reviews.
- Remove **kwargs from Tools and TEST them (including unit tests). I found a couple of tools that were broken.
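One possible shape for such a runtime validation, sketched with a plain Python class. `StrictToolBase` is hypothetical and is not CrewAI's BaseTool; wiring the same check into the real base class may need a different hook.

```python
import inspect


class StrictToolBase:
    """Hypothetical base class that rejects **kwargs in a subclass's _run."""

    def __init_subclass__(cls, **kwargs) -> None:
        super().__init_subclass__(**kwargs)
        run = cls.__dict__.get("_run")
        if run is None:
            return  # subclass inherits _run; nothing new to check
        for param in inspect.signature(run).parameters.values():
            if param.kind is inspect.Parameter.VAR_KEYWORD:
                raise TypeError(
                    f"{cls.__name__}._run must not declare **{param.name}; "
                    "name every argument explicitly."
                )


class GoodTool(StrictToolBase):
    def _run(self, query: str) -> str:
        return query


def define_bad_tool() -> None:
    # Defining this class raises TypeError at class-creation time.
    class BadTool(StrictToolBase):
        def _run(self, query: str, **kwargs) -> str:
            return query
```

The check runs when the class is defined, so a tool with this signature fails at import time rather than in the middle of an agent run.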
Here you'll find a revamped version of the MySQLSearchTool. It ditches the infamous kwargs and comes with better error handling.
@lucasgomide Makes sense! Catching it in reviews works for now; runtime checks would be great later. I think if you can make a planning PR that lists each tool that has **kwargs and needs testing/changes, I can keep committing to it and marking the ones done. That would make collaboration much easier. Let me know!
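To build that list, something like the following sketch could help. It uses the standard `ast` module to find classes whose `_run` method declares **kwargs; the function name and the sample source are made up for illustration, and a real scan would read each tool file's source text instead of the inline string.

```python
import ast


def find_kwargs_run_methods(source: str) -> list[str]:
    """Return names of classes whose _run method declares **kwargs."""
    flagged = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.ClassDef):
            continue
        for item in node.body:
            if (
                isinstance(item, (ast.FunctionDef, ast.AsyncFunctionDef))
                and item.name == "_run"
                and item.args.kwarg is not None
            ):
                flagged.append(node.name)
    return flagged


# Made-up sample source standing in for a tool module.
SAMPLE = '''
class LooseTool:
    def _run(self, query, **kwargs):
        return query

class TightTool:
    def _run(self, query):
        return query
'''

flagged = find_kwargs_run_methods(SAMPLE)
```

Because it parses source instead of importing modules, the scan does not need each tool's optional dependencies to be installed.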