ml-commons [FEATURE] Deep Research

Is your feature request related to a problem? The agent framework in ml-commons cannot solve complex tasks that require multiple steps and tools. Execution is also synchronous, which makes it unsuitable for long-running tasks. Tools run sequentially and cannot search the web, so the framework cannot resolve ambiguous tasks that need additional context.

Currently we have three types of agents (doc):

Flow agent
Conversational flow agent
Conversational agent

These existing agent types have limitations in supporting complex logic. Specifically, they lack the ability to:

Implement conditional workflows
Execute tools in parallel
Handle branching and merging of execution paths
Manage dependencies between tasks
Execute asynchronously

This limitation restricts the creation of more sophisticated and efficient workflows within OpenSearch.

With the introduction of deep research features by providers such as OpenAI, Gemini, and Perplexity, there is a need for an agent that can break a task down into simple steps and execute them asynchronously with the provided tools.

It can also help save costs by invoking cheaper and faster LLMs for smaller tasks and by reducing the number of inferences required.

Examples:

Root Cause Analysis (RCA) for an Error:

Identify the source of recurring 500 Internal Server Error logs.
Retrieve related warnings, deployment changes, or traffic anomalies.
Generate a summary with potential causes and recommended actions.

Log Anomaly Detection:

Analyze the last 24 hours of logs to detect spikes.
Identify affected endpoints, services, or patterns.
Correlate findings with recent system changes.

What solution would you like? A new Deep Research Agent in OpenSearch ML Commons that can:

Break down complex tasks into simpler steps
Use appropriate tools dynamically for each step
Execute tasks serially or in parallel, depending on dependencies
Support function calling to leverage external services efficiently (ref: https://docs.anthropic.com/en/docs/build-with-claude/tool-use/overview)
Re-evaluate progress and refine execution based on intermediate results
Handle failures intelligently, including retries, fallbacks, and logging
Execute the task asynchronously and update the status accordingly
Incorporate web search capabilities where external context is required
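
To make the "serially or in parallel, depending on dependencies" and retry points concrete, here is a minimal Python sketch using asyncio. It is not the ml-commons implementation; the names `run_plan`, `run_with_retries`, and the toy tools are hypothetical, and it assumes the dependency graph is acyclic and every dependency names an existing step.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List

# Hypothetical tool signature: receives upstream results by step name, returns text.
Tool = Callable[[Dict[str, str]], Awaitable[str]]


async def run_with_retries(tool: Tool, inputs: Dict[str, str], retries: int = 2) -> str:
    """Call a tool, retrying on any exception up to `retries` extra times."""
    last_error = None
    for _ in range(retries + 1):
        try:
            return await tool(inputs)
        except Exception as exc:  # a real agent would log and classify the failure
            last_error = exc
    raise RuntimeError(f"tool failed after {retries + 1} attempts") from last_error


async def run_plan(steps: Dict[str, Tool], deps: Dict[str, List[str]]) -> Dict[str, str]:
    """Start every step as a task; each task waits only for its own dependencies."""
    tasks: Dict[str, asyncio.Task] = {}

    async def run_step(name: str) -> str:
        # Independent steps overlap; dependent steps block here until upstream finishes.
        upstream = {d: await tasks[d] for d in deps.get(name, [])}
        return await run_with_retries(steps[name], upstream)

    # All tasks are created before any of them runs, so lookups in `tasks` are safe.
    for name in steps:
        tasks[name] = asyncio.create_task(run_step(name))
    return {name: await task for name, task in tasks.items()}


async def demo() -> Dict[str, str]:
    # Toy stand-ins for log search, deployment lookup, and an LLM summary.
    async def fetch_logs(_: Dict[str, str]) -> str:
        return "500 errors in checkout"

    async def fetch_deploys(_: Dict[str, str]) -> str:
        return "deploy at 10:02"

    async def summarize(upstream: Dict[str, str]) -> str:
        return " | ".join(upstream[k] for k in sorted(upstream))

    return await run_plan(
        {"logs": fetch_logs, "deploys": fetch_deploys, "summary": summarize},
        {"summary": ["logs", "deploys"]},
    )
```

Calling `asyncio.run(demo())` runs `logs` and `deploys` concurrently and starts `summary` only after both have finished. A cyclic dependency would deadlock in this sketch, so a real implementation would need to validate the graph first.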

There is already a feature request for a graph agent that tries to address this problem: https://github.com/opensearch-project/ml-commons/issues/3309

However, this feature focuses more on the automatic breakdown of complex tasks and tool execution than on DAG-style execution of tools alone.
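
As a rough illustration of that difference, the Python sketch below shows a loop that re-plans after every step instead of following a fixed graph. It is an assumption-laden toy, not a proposed API: `plan_execute_reflect`, `toy_planner`, and `toy_executor` are hypothetical names, and in a real agent an LLM would stand behind the planner while tools would stand behind the executor.

```python
from typing import Callable, List

# Hypothetical interfaces for illustration only.
Planner = Callable[[str, List[str]], List[str]]  # (task, findings so far) -> remaining steps
Executor = Callable[[str], str]                  # step -> result of running a tool


def plan_execute_reflect(task: str, plan: Planner, execute: Executor,
                         max_rounds: int = 5) -> List[str]:
    """Re-plan after each step so later steps can react to intermediate results."""
    findings: List[str] = []
    for _ in range(max_rounds):
        steps = plan(task, findings)        # reflect: plan again with what is known
        if not steps:                       # an empty plan means the task is done
            break
        findings.append(execute(steps[0]))  # run only the next step
    return findings


def toy_planner(task: str, findings: List[str]) -> List[str]:
    # A fixed script standing in for an LLM; it drops steps already completed.
    all_steps = ["search logs", "check deployments", "summarize"]
    return all_steps[len(findings):]


def toy_executor(step: str) -> str:
    return f"done: {step}"


findings = plan_execute_reflect("explain 500 errors", toy_planner, toy_executor)
```

Because the planner sees the accumulated findings on every round, it can change, add, or drop steps, which a static DAG of tools cannot do. `max_rounds` bounds the loop in case the planner never returns an empty plan.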

Mar 13 '25 23:03 pyek-bot

Suggestion:

Add architecture and workflow diagram
Split into phases for faster delivery

Mar 14 '25 04:03 ylwu-amzn

RFC: https://github.com/opensearch-project/ml-commons/issues/3745 Framework to internally support deep-research

Apr 20 '25 23:04 pyek-bot

We can close this issue: the feature has been merged and is now called the PlanExecuteReflect Agent, which behaves as a deep research agent when given tools.

Aug 14 '25 00:08 pyek-bot
