mgeneratejs
Add Async $text Operator for LLM-Integrated Data Generation with Ollama
This PR introduces a new $text operator to Mongo's mgenerate tool. The operator calls a Large Language Model (LLM) through the Ollama API to generate contextually relevant text from a user-defined prompt. This lets the tool produce more specific and meaningful dummy data for use cases such as:
- Application-Specific Data: Generate tailored data for specific domains (e.g., healthcare job titles).
- Long Text Generation: Produce coherent, context-appropriate long text (e.g., product reviews).
- Regional Contextualization: Generate data with regional relevance (e.g., Indian names).
Key Changes:
- Added a new $text operator in mgenerate with Ollama integration.
- Integrated an LLM via Ollama.
- Converted mgenerate into an asynchronous library to support LLM integration.
- Updated documentation to include usage examples and details for the new $text operator.
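For reviewers unfamiliar with Ollama, the sketch below shows roughly what an async text-generation call could look like. It is not the code in this PR: `generateText`, its `deps` argument, and the prompt suffix are hypothetical names chosen for illustration. It assumes Ollama is reachable at its default local address (`http://localhost:11434`), uses the non-streaming `/api/generate` endpoint, and needs a runtime with a global `fetch` (Node 18+) unless one is injected.

```javascript
// Illustrative sketch only; generateText and deps are hypothetical,
// not part of mgeneratejs or of this PR.
async function generateText(options, deps = {}) {
  const { prompt, maxWordCount } = options;

  // Assumed defaults: local Ollama server and the model named in this PR's example.
  const fetchImpl = deps.fetchImpl || globalThis.fetch;
  const baseUrl = deps.baseUrl || 'http://localhost:11434';
  const model = deps.model || 'mistral-nemo';

  // maxWordCount may arrive as a string ("4"), so coerce it to a number.
  const limit = Number(maxWordCount);
  const hasLimit = Number.isFinite(limit) && limit > 0;
  const fullPrompt = hasLimit
    ? `${prompt}. Answer in at most ${limit} words, with no explanation.`
    : prompt;

  // stream: false asks Ollama for a single JSON object instead of chunks.
  const res = await fetchImpl(`${baseUrl}/api/generate`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ model, prompt: fullPrompt, stream: false }),
  });
  if (!res.ok) {
    throw new Error(`Ollama request failed with status ${res.status}`);
  }

  // The generated text is returned in the "response" field.
  const data = await res.json();
  const text = String(data.response || '').trim();

  // Models do not always respect the word limit, so truncate defensively.
  return hasLimit ? text.split(/\s+/).slice(0, limit).join(' ') : text;
}
```

Injecting `fetchImpl` keeps the function testable without a running Ollama server; in normal use it could be called as `await generateText({ prompt: '...', maxWordCount: '4' })`.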
Example Usage:
{
  "name": "$name",
  "Role": {
    "$text": {
      "prompt": "Rare Designation or job title found in Healthcare",
      "maxWordCount": "4"
    }
  },
  "lastLogin": "$now"
}
Example Output (model: mistral-nemo):
{
  "name": "Virginia Blair",
  "Role": "Medical Assistant",
  "lastLogin": {
    "$date": "2024-07-28T12:53:00.267Z"
  }
}
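Since an LLM call is asynchronous, a template like the one above has to be resolved with operators that return promises. The following is a rough sketch of that idea, not the resolver in this PR: `resolveTemplate` and the stub `operators` table are hypothetical, and the stubs return placeholder values instead of calling mgeneratejs or Ollama.

```javascript
// Illustrative sketch only; these stubs stand in for real operators.
const operators = {
  $name: async () => 'Jane Doe', // placeholder value
  $now: async () => ({ $date: new Date().toISOString() }),
  $text: async (args) => `[text for: ${args.prompt}]`, // stands in for an LLM call
};

function hasOperator(ops, key) {
  return Object.prototype.hasOwnProperty.call(ops, key);
}

// Walk the template and await every operator it contains.
async function resolveTemplate(node, ops = operators) {
  // Short form: "$name" is an operator call without arguments.
  if (typeof node === 'string') {
    return hasOperator(ops, node) ? ops[node]({}) : node;
  }
  if (Array.isArray(node)) {
    return Promise.all(node.map((item) => resolveTemplate(item, ops)));
  }
  if (node !== null && typeof node === 'object') {
    const keys = Object.keys(node);
    // Long form: { "$text": { ...args } } is an operator call with arguments.
    if (keys.length === 1 && hasOperator(ops, keys[0])) {
      return ops[keys[0]](node[keys[0]]);
    }
    // Plain object: resolve all fields concurrently so slow calls overlap.
    const entries = await Promise.all(
      keys.map(async (key) => [key, await resolveTemplate(node[key], ops)])
    );
    return Object.fromEntries(entries);
  }
  // Numbers, booleans, and null pass through unchanged.
  return node;
}
```

Resolving sibling fields with `Promise.all` is one possible design choice: it lets several slow LLM requests in the same document run concurrently instead of one after another.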