LiteWebAgent
LiteWebAgent copied to clipboard
[roadmap] architecture design and roadmap
Decouple agent action planning/generation and agent action grounding.
- Agent action planning/generation updates the plan based on the goal and current progress, described in natural language.
- The simplest version of agent action planning/ generation, as seen in https://github.com/PathOnAI/LiteWebAgent/blob/main/litewebagent/agents/DemoAgent.py, involves saving the results of action generation and action execution. These results are stored as a list of messages, which is then passed to the LLM (Language Learning Model). The LLM uses this information to decide on the next action or to determine if it should stop.
- https://github.com/PathOnAI/LiteWebAgent/blob/main/litewebagent/agents/HighLevelPlanningAgent.py
- https://github.com/PathOnAI/LiteWebAgent/blob/main/litewebagent/agents/ContextAwarePlanningAgent.py
- https://github.com/PathOnAI/LiteWebAgent/blob/main/litewebagent/agents/MCTSAgent.py
- https://github.com/PathOnAI/LiteWebAgent/blob/main/litewebagent/agents/MCTSrAgent.py
- Then, agent action grounding translates the natural language into Playwright code, which we execute.
- we currently reuse browsergym action grounding mechanics: https://github.com/PathOnAI/LiteWebAgent/blob/main/litewebagent/webagent.py
- we allow using different combination of features in the action grounding
action function agent
https://www.overleaf.com/project/66b7d45f9adf624e2702f75e