'TASK' microagents
What problem or use case are you trying to solve?
Microagents of type 'TASK' have been started in the codebase, but are not actually usable or fully implemented yet.
Describe the UX of the solution you'd like
Re: https://github.com/All-Hands-AI/OpenHands/issues/6713#issuecomment-2727572940
One way for the LLM to keep track of the task may be:
- prompt it to plan its task or restate the user's task as a plan
- have it create a .md file in which it records the plan
- with checkboxes to check when each item is done
- (maybe) with a review step at the end.
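To make the idea concrete, here is a minimal sketch of what such a checklist file could look like and how items might be checked off. It is only an illustration: the function names (`render_plan`, `check_off`, `remaining`) and the wording of the review step are hypothetical, not anything that exists in OpenHands.

```python
import re


def render_plan(title: str, steps: list[str], review: bool = True) -> str:
    """Render a plan as a Markdown checklist, optionally ending with a review step."""
    # The review item text is an assumption made for this sketch.
    items = list(steps) + (["Review the completed work"] if review else [])
    lines = [f"# {title}", ""] + [f"- [ ] {item}" for item in items]
    return "\n".join(lines) + "\n"


def check_off(markdown: str, step: str) -> str:
    """Mark the first unchecked item whose text equals `step` as done."""
    pattern = re.compile(r"^- \[ \] " + re.escape(step) + r"$", flags=re.MULTILINE)
    # A callable replacement avoids backslash handling in the step text.
    return pattern.sub(lambda _match: f"- [x] {step}", markdown, count=1)


def remaining(markdown: str) -> list[str]:
    """Return the text of every item that is still unchecked."""
    return re.findall(r"^- \[ \] (.+)$", markdown, flags=re.MULTILINE)


plan = render_plan("Migrate project", ["Update dependencies", "Run tests"])
plan = check_off(plan, "Update dependencies")
print(plan)
```

In practice the LLM would write and edit the file itself; the helpers only show the file shape the prompt would be asking for.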
Do you have thoughts on the technical implementation?
- express it in a microagent with type 'TASK' to include in the prompt
Describe alternatives you've considered
- we've all done it in some form; people report success at least for some types of long-running tasks where the steps are clear beforehand (please see linked comment above, for example)
Additional context
It just seems to me one of the natural next steps after this issue, so just recording it here
- https://github.com/All-Hands-AI/OpenHands/pull/6909
I was helping a friend migrate a project and used a migration file with steps to run OpenHands through. Something that makes it easy to check off steps and come back to later seemed helpful.
A potential fix has been generated and a draft PR #7363 has been created. Please review the changes.
One way for the LLM to keep track of the task may be:
- prompt it to plan its task or restate the user's task as a plan
- have it create a .md file in which it records the plan
- with checkboxes to check when each item is done
- (maybe) with a review step at the end.
I think another way might be to define a custom tool for the agent, e.g. workflow_tracker, with multiple commands: rephrase_plan to convert the workflow into a structured form, tick to mark a subtask as done, and perhaps list_status to print the status of all tasks in the whole workflow 🤔 Given that the LLM is trained to call tools, this might be more reliable than nudging it to use a .md file to keep track of task states.
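A rough sketch of what such a tool's state could look like, assuming the three commands named above. The class name, signatures, and output format are hypothetical and not an existing OpenHands API. In a real tool, rephrase_plan would presumably be filled in by the LLM; here a simple line split stands in for it.

```python
from dataclasses import dataclass, field


@dataclass
class WorkflowTracker:
    """Hypothetical in-memory state behind a workflow_tracker tool."""

    subtasks: list[str] = field(default_factory=list)
    done: set[int] = field(default_factory=set)

    def rephrase_plan(self, workflow: str) -> list[str]:
        """Store the workflow as subtasks, one per non-empty line (stand-in logic)."""
        cleaned = (line.strip(" -*\t") for line in workflow.splitlines())
        self.subtasks = [line for line in cleaned if line]
        self.done = set()  # a new plan resets all progress
        return self.subtasks

    def tick(self, index: int) -> None:
        """Mark the subtask at `index` (0-based) as done."""
        if not 0 <= index < len(self.subtasks):
            raise IndexError(f"no subtask with index {index}")
        self.done.add(index)

    def list_status(self) -> str:
        """Return one line per subtask with a checkbox-style marker."""
        return "\n".join(
            f"[{'x' if i in self.done else ' '}] {i}: {task}"
            for i, task in enumerate(self.subtasks)
        )


tracker = WorkflowTracker()
tracker.rephrase_plan("- build\n- test\n- deploy")
tracker.tick(0)
print(tracker.list_status())
```

Each method would map to one tool command, so the LLM only ever passes a small structured argument (a workflow string or an index) rather than editing a file.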
This is definitely a possibility!
We could also try both and see what works.
This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 7 days.
At a very high level, is the goal here that a user can ask OpenHands to generate a plan for a given prompt, OpenHands will generate it as an .md file in the workspace, and then check off tasks as it goes? I imagine we'd want a UI feature to show the agent checking off tasks as it goes.
Yes, with the observation that, following the Planning PR, it's not just a one-time plan for a single prompt, but
- successive plans, possibly, until the user ends the Planning and starts execution mode.
Perhaps I should note also the consequence:
- the user will be able to stop executing and return control to the planner, so our prompts need to keep the LLM on track for re-planning. LLMs are quite capable of doing this themselves these days, unlike a year ago 😄, but we still need user-friendliness.
This issue is stale because it has been open for 40 days with no activity. Remove the stale label or leave a comment, otherwise it will be closed in 10 days.
Sorry, I didn't notice that this has the planning agent issue as its parent issue.
I think from another point of view, maybe we could consider this solved by the task tool.
@enyst Thanks for flagging this. Based on the original description, it does feel like the Task tool addresses this. Safe to close, or do you see any additional scope we should capture?
You're right, I think it addresses it!
I had a comment above about re-planning, that I felt wasn't quite covered ("our prompts need to keep the LLM on track for re-planning"), but I take that back! OpenHands-GPT-5 replans on the fly with the Task tool, and without, it doesn't even blink.
Maybe we can think more about user-directed (re)planning in the Planning Agent issue, not here.
I think we'll know more once both the task tool and planning have spent some time out there.
For now, I feel there was an element of... hardcoded workflow in my view on this issue, and GPT-5 blew it out of the water. No need to hardcode workflows when the LLM is capable of ... just doing it!
(aka the bitter lesson under another name 😂)