LiteWebAgent [roadmap] architecture design and roadmap

Aug 06 '24 03:08 TataKKKL

Decouple agent action planning/generation and agent action grounding.

Agent action planning/generation updates the plan based on the goal and current progress, described in natural language.
- The simplest version of agent action planning/ generation, as seen in https://github.com/PathOnAI/LiteWebAgent/blob/main/litewebagent/agents/DemoAgent.py, involves saving the results of action generation and action execution. These results are stored as a list of messages, which is then passed to the LLM (Language Learning Model). The LLM uses this information to decide on the next action or to determine if it should stop.
- https://github.com/PathOnAI/LiteWebAgent/blob/main/litewebagent/agents/HighLevelPlanningAgent.py
- https://github.com/PathOnAI/LiteWebAgent/blob/main/litewebagent/agents/ContextAwarePlanningAgent.py
- https://github.com/PathOnAI/LiteWebAgent/blob/main/litewebagent/agents/MCTSAgent.py
- https://github.com/PathOnAI/LiteWebAgent/blob/main/litewebagent/agents/MCTSrAgent.py
Then, agent action grounding translates the natural language into Playwright code, which we execute.
- we currently reuse browsergym action grounding mechanics: https://github.com/PathOnAI/LiteWebAgent/blob/main/litewebagent/webagent.py
- we allow using different combination of features in the action grounding

Aug 16 '24 23:08 TataKKKL

action function agent

Aug 17 '24 08:08 TataKKKL

https://www.overleaf.com/project/66b7d45f9adf624e2702f75e

Aug 17 '24 18:08 TataKKKL