crewAI
Added Advance Configuration Docs for Rag Tool
Added Additional docs for the RAG tool advance configuration fixes #2671
Disclaimer: This review was made by a crew of AI Agents.
Thank you for the contribution to enhance the RAG Tool documentation with an advanced configuration section. This update substantially improves the usability for advanced users by providing a realistic example configuration covering embedding, vector database (Elasticsearch), and chunking parameters. Including the note about the Embedchain adapter and linking to its documentation is a valuable addition that helps users understand the integration points.
Here are some detailed findings and specific suggestions for improvement to elevate the quality and clarity of this documentation update:
Key Findings
- The new "Advance Configuration" section is well-structured and presents the configuration example clearly, making it easier for users to customize the `RagTool`.
- The code snippet uses consistent indentation and formatting, making it readable.
- Linking to the Embedchain documentation improves discoverability and provides a good external reference.
Suggestions for Improvement
- Section Heading Typo
  - Change the heading from `## Advance Configuration` to the grammatically correct `## Advanced Configuration`.
  - This small fix avoids any perception of haste and improves professionalism.
- Sensitive Data Management
  - The example contains `"api_key": "your-key"`, which could unintentionally encourage users to hardcode secrets.
  - Add an inline comment to the example and a clear note just below the code snippet emphasizing secure secrets handling.
  - Example inline comment: `"api_key": "your-key",  # Use environment variables or a secret manager for sensitive keys`
  - And a note below the snippet: "Note: Never hardcode secrets or API keys in configuration files. Use environment variables or secret management services."
- Boolean Values Formatting
  - In the example, `verify_certs` is set to `False`, which follows Python style. If the config is presented as JSON within the MDX/Markdown, use the lowercase JSON boolean for clarity: `"verify_certs": false`.
  - If the example stays a Python dict literal, `False` must remain capitalized, since lowercase `false` is not defined in Python.
- Enhance Context on Embedchain
  - The current sentence, "The internal RAG tool utilizes the Embedchain adapter...", could be improved by briefly describing Embedchain's role:
    "The internal RAG tool utilizes the [Embedchain](https://docs.embedchain.ai/components/introduction) adapter, an extensible framework for retrieval-augmented generation, allowing you to pass any supported configuration options."
  - This helps users unfamiliar with Embedchain understand its purpose quickly.
- Clarify `.yaml` File Reference
  - The statement "Make sure to review the configuration options available in the .yaml file." could confuse users about which YAML file is referenced.
  - Recommendation: "Make sure to review the configuration options available in your project's Embedchain `.yaml` configuration file (see the Embedchain documentation for details)."
  - This clarifies that it's a project-specific configuration file.
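On the boolean formatting point above, a minimal sketch may help show how the two spellings relate: Python's `False` serializes to JSON's `false`, so the right spelling depends on whether the docs present a Python dict or a JSON document. The `vectordb_config` name is illustrative only and is not taken from the PR.

```python
import json

# Illustrative fragment of the vectordb settings, written as a Python dict.
vectordb_config = {"verify_certs": False}

# Serializing with the standard library yields the lowercase JSON spelling.
as_json = json.dumps(vectordb_config)
print(as_json)  # {"verify_certs": false}

# Parsing the JSON text gives back the Python boolean.
assert json.loads(as_json) == vectordb_config
```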
Example of Suggested Revised Snippet
## Advanced Configuration
You can use advanced configuration for the RAG Tool by passing supported options to the `RagTool` class. Here’s an example:
```python
from crewai_tools import RagTool

config = {
    "embedding": {
        "provider": "openai",
        "config": {
            "model": "text-embedding-ada-002"
        }
    },
    "vectordb": {
        "provider": "elasticsearch",
        "config": {
            "collection_name": "my-collection",
            "cloud_id": "deployment-name:xxxx",
            "api_key": "your-key",  # Use environment variables or a secret manager for sensitive keys
            "verify_certs": False  # Python boolean; lowercase `false` is only valid in JSON
        }
    },
    "chunker": {
        "chunk_size": 400,
        "chunk_overlap": 100,
        "length_function": "len",
        "min_chunk_size": 0
    }
}

rag_tool = RagTool(config=config, summarize=True)
```
Note: Never hardcode secrets or API keys in configuration files. Use environment variables or secret management services.
The internal RAG tool utilizes the Embedchain adapter, an extensible framework for retrieval-augmented generation, allowing you to pass any configuration options that are supported.
Make sure to review the configuration options available in your project's Embedchain .yaml configuration file (see the Embedchain documentation for details).
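The note on secrets handling could be illustrated along the following lines. This is only a sketch: the environment variable names `ELASTIC_CLOUD_ID` and `ELASTIC_API_KEY` and the helper `build_vectordb_config` are hypothetical and do not come from the PR or from Embedchain; the remaining keys mirror the example config.

```python
import os
from typing import Any, Dict, Mapping


def build_vectordb_config(env: Mapping[str, str] = os.environ) -> Dict[str, Any]:
    """Build the Elasticsearch section of the config from environment variables.

    The variable names below are assumptions made for illustration.
    A missing variable raises KeyError, so a misconfigured deployment
    fails early instead of running with a placeholder key.
    """
    return {
        "provider": "elasticsearch",
        "config": {
            "collection_name": "my-collection",
            "cloud_id": env["ELASTIC_CLOUD_ID"],  # hypothetical variable name
            "api_key": env["ELASTIC_API_KEY"],    # hypothetical variable name
            "verify_certs": False,
        },
    }
```

The returned dictionary would take the place of the `"vectordb"` entry in `config`, so the key never appears in the documentation example or in version control.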
Conclusion
### Summary Table of Issues & Recommendations
| Issue | Recommendation |
|------------------------------|------------------------------------------------------------|
| Typo in section heading | Change "Advance" → "Advanced" |
| Hardcoded API key example | Add inline comment and note about secrets management |
| Boolean capitalization | Use lowercase `false` if shown as JSON; keep `False` in Python code |
| Embedchain context missing | Add brief description about Embedchain |
| Ambiguous `.yaml` file ref | Specify reference to project’s Embedchain config file |
---
### Overall Assessment
This PR adds substantial value to the documentation, aiding advanced users in customizing the RAG Tool. The inclusion of an example config with embedding, vector DB, and chunking parameters is well done. Linking to the external Embedchain docs helps maintain a modular documentation approach.
With the recommended minor corrections—mostly editorial and security best practices—this contribution will be well polished, professional, and educational.
Thank you for your efforts on improving the crewAI project documentation! Looking forward to the updated version with these suggested enhancements.
@lucasgomide, any idea on this one? The test cases are not failing; I think the run is just taking a long time (>15 mins). Not sure why.
There are several failing tests
tests/crew_test.py::test_crew_with_delegating_agents FAILED [ 35%]
tests/crew_test.py::test_crew_with_delegating_agents_should_not_override_task_tools PASSED [ 35%]
tests/crew_test.py::test_crew_with_delegating_agents_should_not_override_agent_tools PASSED [ 35%]
tests/crew_test.py::test_task_tools_override_agent_tools FAILED [ 35%]
tests/crew_test.py::test_task_tools_override_agent_tools_with_allow_delegation PASSED [ 35%]
tests/crew_test.py::test_crew_verbose_output FAILED [ 35%]
tests/crew_test.py::test_cache_hitting_between_agents FAILED [ 35%]
tests/crew_test.py::test_api_calls_throttling FAILED [ 35%]
tests/crew_test.py::test_crew_kickoff_usage_metrics FAILED
Oh, will check those out.
@Vidit-Ostwal I just noticed something weird with our test suite. I'm looking into it right now.