Planner loops and skips retry path in Researcher sample when Critique is negative (Embabel 0.1.3)
On Embabel 0.1.3, when I adapt the Researcher.kt sample to force a negative Critique, the planner:
- does not choose the expected retry path (`redoResearchWithGpt4` → `mergeReports` → `critiqueMergedReport`), and
- gets stuck repeatedly selecting `researchWithGpt4`.
This happens even though a Critique object is present on the blackboard, which should enable the conditional branch that re-does research.
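For reference, the two condition methods in the sample reduce to a simple pair of predicates over `Critique`. A standalone sketch of the logic I expect the planner to evaluate (Embabel `@Condition` annotations and blackboard plumbing omitted; function names mirror `makesTheGrade`/`rejected` in the full listing):

```kotlin
// Standalone sketch of the sample's condition logic.
data class Critique(
    val accepted: Boolean,
    val reasoning: String,
)

// Mirrors makesTheGrade / REPORT_SATISFACTORY in the sample.
fun reportSatisfactory(critique: Critique): Boolean = critique.accepted

// Mirrors rejected / REPORT_UNSATISFACTORY in the sample.
fun reportUnsatisfactory(critique: Critique): Boolean = !critique.accepted

fun main() {
    val negative = Critique(accepted = false, reasoning = "Insufficient data")
    // With a negative Critique on the blackboard, I expect the
    // unsatisfactory condition to hold and gate redoResearchWithGpt4.
    println(reportSatisfactory(negative))   // false
    println(reportUnsatisfactory(negative)) // true
}
```

So with a negative `Critique` present, `reportUnsatisfactory` should evaluate to true, which is the precondition of `redoResearchWithGpt4`.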
**What I changed (minimal)**

I took `Researcher.kt` and:
- Removed parallel execution.
- Forced the `Critique` to negative.
- Kept everything else minimal to isolate planning/conditions.
**Expected flow**

`redoResearchWithGpt4` → `mergeReports` → `critiqueMergedReport`
**Actual flow**

`researchWithGpt4` → `acceptReport`
(repeats `researchWithGpt4` on subsequent planner decisions)
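The forcing change itself is just a post-hoc override of whatever the critic model returns. Stripped of Embabel types, the modification in `critiqueMergedReport` amounts to:

```kotlin
// Standalone sketch of the forced-negative change in critiqueMergedReport:
// whatever the critic LLM returns is overridden, so REPORT_SATISFACTORY
// can never become true.
data class Critique(
    val accepted: Boolean,
    val reasoning: String,
)

fun forceNegative(critique: Critique): Critique =
    critique.copy(accepted = false, reasoning = "Insufficient data")

fun main() {
    // Even a positive critique from the LLM ends up negative.
    val fromLlm = Critique(accepted = true, reasoning = "Well referenced")
    val forced = forceNegative(fromLlm)
    println(forced.accepted)  // false
    println(forced.reasoning) // Insufficient data
}
```

The full adapted `Researcher.kt` follows.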
import com.embabel.agent.api.annotation.*
import com.embabel.agent.api.common.OperationContext
import com.embabel.agent.api.common.create
import com.embabel.agent.config.models.AnthropicModels
import com.embabel.agent.config.models.OpenAiModels
import com.embabel.agent.core.CoreToolGroups
import com.embabel.agent.domain.io.UserInput
import com.embabel.agent.domain.library.ResearchReport
import com.embabel.agent.prompt.PromptUtils
import com.embabel.agent.prompt.ResponseFormat
import com.embabel.agent.prompt.persona.Persona
import com.embabel.common.ai.model.LlmOptions
import com.embabel.common.ai.prompt.PromptContributor
import com.embabel.common.ai.prompt.PromptContributorConsumer
import com.embabel.common.core.types.Timestamped
import org.slf4j.LoggerFactory
import org.springframework.boot.context.properties.ConfigurationProperties
import java.time.Instant
data class SingleLlmReport(
val report: ResearchReport,
val model: String,
) : Timestamped {
override val timestamp: Instant = Instant.now()
}
data class Critique(
val accepted: Boolean,
val reasoning: String,
)
@ConfigurationProperties(prefix = "embabel.examples.researcher")
class ResearcherProperties(
val responseFormat: ResponseFormat = ResponseFormat.MARKDOWN,
val maxWordCount: Int = 300,
val claudeModelName: String = AnthropicModels.CLAUDE_35_HAIKU,
val openAiModelName: String = OpenAiModels.GPT_41_MINI,
val criticModeName: String = OpenAiModels.GPT_41,
val mergeModelName: String = OpenAiModels.GPT_41_MINI,
personaName: String = "Sherlock",
personaDescription: String = "A resourceful researcher agent that can perform deep web research on a topic. Nothing escapes Sherlock",
personaVoice: String = "Your voice is dry and in the style of Sherlock Holmes. Occasionally you address the user as Watson",
personaObjective: String = "To clarify all points the user has brought up",
) : PromptContributorConsumer {
// Create a Persona instance rather than extending it
val persona = Persona(personaName, personaDescription, personaVoice, personaObjective)
override val promptContributors: List<PromptContributor>
get() = listOf(
responseFormat,
persona,
)
}
enum class Category {
QUESTION,
DISCUSSION,
}
data class Categorization(
val category: Category,
)
/**
 * Researcher agent, adapted from the Embabel Researcher sample to reproduce this issue.
 *
 * Changes from the shipped sample:
 * 1. Parallel execution removed - only the GPT-4 research path remains
 * 2. critiqueMergedReport forces the Critique to negative
 *
 * Intended workflow:
 * - Categorize user input as a question or discussion topic
 * - Research via researchWithGpt4
 * - Merge the report(s) via mergeReports
 * - Self-critique the merged report (always negative in this repro)
 * - On an unsatisfactory critique, redo research via redoResearchWithGpt4
 * - Deliver the final report when satisfactory
 */
@Agent(
description = "Perform deep web research on a topic",
)
class Researcher(
val properties: ResearcherProperties,
) {
private val logger = LoggerFactory.getLogger(Researcher::class.java)
init {
logger.info("Researcher agent initialized: $properties")
}
/**
* Categorizes the user input to determine the appropriate research approach.
* Uses the auto-selected LLM to classify the input.
*
* @param userInput The user's query or topic for research
* @return Categorization of the input as either a QUESTION or DISCUSSION
*/
@Action
fun categorize(
userInput: UserInput,
context: OperationContext,
): Categorization = context.ai()
.withAutoLlm()
.create(
"""
Categorize the following user input:
Topic:
<${userInput.content}>
""".trimIndent()
)
/**
* Performs research using the GPT-4 model.
* This is the only research path in this repro; the parallel Claude path was removed.
*
* @param userInput The user's query or topic
* @param categorization The categorization of the input
* @param context The operation context for accessing tools and services
* @return A research report with the GPT-4 model's findings
*/
// These need a different output binding or only one will run
@Action(
post = [REPORT_SATISFACTORY],
canRerun = true,
outputBinding = "gpt4Report",
toolGroups = [CoreToolGroups.WEB, CoreToolGroups.BROWSER_AUTOMATION]
)
fun researchWithGpt4(
userInput: UserInput,
categorization: Categorization,
context: OperationContext,
): SingleLlmReport {
val r = researchWith(
userInput = userInput,
categorization = categorization,
critique = null,
llm = LlmOptions(properties.openAiModelName),
context = context,
)
return r
}
/**
* Redoes research with GPT-4 after receiving an unsatisfactory critique.
* This demonstrates the agent's ability to improve based on feedback.
*
* @param userInput The user's query or topic
* @param categorization The categorization of the input
* @param critique The critique of the previous report explaining why it was unsatisfactory
* @param context The operation context for accessing tools and services
* @return An improved research report with the GPT-4 model's findings
*/
@Action(
pre = [REPORT_UNSATISFACTORY],
post = [REPORT_SATISFACTORY],
canRerun = true,
outputBinding = "gpt4Report",
toolGroups = [CoreToolGroups.WEB, CoreToolGroups.BROWSER_AUTOMATION]
)
fun redoResearchWithGpt4(
userInput: UserInput,
categorization: Categorization,
critique: Critique,
context: OperationContext,
): SingleLlmReport {
val r = researchWith(
userInput = userInput,
categorization = categorization,
critique = critique,
llm = LlmOptions(properties.openAiModelName),
context = context,
)
return r
}
/**
* Common implementation for research with different models.
* Routes to the appropriate research method based on categorization.
*
* @param userInput The user's query or topic
* @param categorization The categorization of the input
* @param critique Optional critique from a previous attempt
* @param llm The LLM options including model selection
* @param context The operation context for accessing tools and services
* @return A research report with the specified model's findings
*/
private fun researchWith(
userInput: UserInput,
categorization: Categorization,
critique: Critique?,
llm: LlmOptions,
context: OperationContext,
): SingleLlmReport {
val researchReport = when (categorization.category) {
Category.QUESTION -> answerQuestion(userInput, llm, critique, context)
Category.DISCUSSION -> research(userInput, llm, critique, context)
}
return SingleLlmReport(
report = researchReport,
model = llm.criteria.toString(),
)
}
/**
* Generates a research report that answers a specific question.
* Uses web tools to find precise answers with citations.
*
* @param userInput The user's question
* @param llm The LLM options including model selection
* @param critique Optional critique from a previous attempt
* @param context The operation context for accessing tools and services
* @return A research report answering the question
*/
private fun answerQuestion(
userInput: UserInput,
llm: LlmOptions,
critique: Critique?,
context: OperationContext,
): ResearchReport = context.promptRunner(
llm = llm,
promptContributors = properties.promptContributors,
).create(
"""
Use the web and browser tools to answer the given question.
You must try to find the answer on the web, and be definite, not vague.
Write a detailed report in at most ${properties.maxWordCount} words.
If you can answer the question more briefly, do so.
Include a number of links that are relevant to the topic.
Example:
${PromptUtils.jsonExampleOf<ResearchReport>()}
Question:
<${userInput.content}>
${
critique?.reasoning?.let {
"Critique of previous answer:\n<$it>"
}.orEmpty()
}
""".trimIndent()
)
/**
* Generates a research report on a discussion topic.
* Uses web tools to gather information and provide a comprehensive overview.
*
* @param userInput The user's topic for research
* @param llm The LLM options including model selection
* @param critique Optional critique from a previous attempt
* @param context The operation context for accessing tools and services
* @return A research report on the topic
*/
private fun research(
userInput: UserInput,
llm: LlmOptions,
critique: Critique?,
context: OperationContext,
): ResearchReport = context.promptRunner(
llm = llm,
promptContributors = properties.promptContributors,
).create(
"""
Use the web and browser tools to perform deep research on the given topic.
Write a detailed report in ${properties.maxWordCount} words,
including a number of links that are relevant to the topic.
Topic:
<${userInput.content}>
${
critique?.reasoning?.let {
"Critique of previous answer:\n<$it>"
}.orEmpty()
}
""".trimIndent()
)
/**
* Evaluates the quality of the merged research report.
* This implements the self-critique capability of the Embabel model.
*
* @param userInput The user's original query or topic
* @param mergedReport The combined report to evaluate
* @param context The operation context used to call the critic model
* @return A critique with acceptance status and reasoning
*/
@Action(post = [REPORT_SATISFACTORY], canRerun = true)
fun critiqueMergedReport(
userInput: UserInput,
@RequireNameMatch mergedReport: ResearchReport,
context: OperationContext,
): Critique {
val critique: Critique = context.ai().withLlm(properties.criticModeName)
.create(
"""
Is this research report satisfactory? Consider the following question:
<${userInput.content}>
The report is satisfactory if it answers the question with adequate references.
It is possible that the question does not have a clear answer, in which
case the report is satisfactory if it provides a reasonable discussion of the topic.
${mergedReport.infoString(verbose = true)}
""".trimIndent(),
)
// Force the critique to negative to exercise the retry path
return critique.copy(accepted = false, reasoning = "Insufficient data")
}
/**
* Combines the research report(s) into a single, improved report.
* In this repro only the GPT-4 report remains, since the parallel Claude path was removed.
*
* @param userInput The user's original query or topic
* @param gpt4Report The research report from the GPT-4 model
* @return A merged research report
*/
@Action(
post = [REPORT_SATISFACTORY],
outputBinding = "mergedReport",
canRerun = true,
)
fun mergeReports(
userInput: UserInput,
@RequireNameMatch gpt4Report: SingleLlmReport,
context: OperationContext,
): ResearchReport {
val reports = listOf(
gpt4Report
)
return context.promptRunner(
llm = LlmOptions(properties.criticModeName),
promptContributors = properties.promptContributors,
).create(
"""
Merge the following research reports into a single report taking the best of each.
Consider the user direction: <${userInput.content}>
${reports.joinToString("\n\n") { "Report from ${it.model}\n${it.report.infoString(verbose = true)}" }}
""".trimIndent()
)
}
/**
* Condition that determines if a report is satisfactory.
* Used to control workflow progression.
*
* @param critique The critique of the report
* @return True if the report is accepted as satisfactory
*/
@Condition(name = REPORT_SATISFACTORY)
fun makesTheGrade(
critique: Critique,
): Boolean = critique.accepted
/**
* Condition that determines if a report is unsatisfactory.
* Used to trigger rework of research.
*
* @param critique The critique of the report
* @return True if the report is rejected as unsatisfactory
*/
// TODO should be able to use !
@Condition(name = REPORT_UNSATISFACTORY)
fun rejected(
critique: Critique,
): Boolean = !critique.accepted
/**
* Final action that accepts the research report as the agent's output.
* This marks the successful completion of the research task.
*
* @param mergedReport The final merged research report
* @param critique The positive critique confirming the report is satisfactory
* @return The final research report
*/
@AchievesGoal(
description = "Completes a research or question answering task, producing a research report",
)
// TODO this won't complete without the output binding to a new thing.
// This makes some sense but seems a bit surprising
@Action(pre = [REPORT_SATISFACTORY], outputBinding = "finalResearchReport")
fun acceptReport(
@RequireNameMatch mergedReport: ResearchReport,
critique: Critique,
) = mergedReport
companion object {
/** Condition name for when a report is satisfactory */
const val REPORT_SATISFACTORY = "reportSatisfactory"
/** Condition name for when a report is unsatisfactory */
const val REPORT_UNSATISFACTORY = "reportUnsatisfactory"
}
}