webarena
Code repo for "WebArena: A Realistic Web Environment for Building Autonomous Agents"
Thank you for creating this cool dataset! I would like to use the RL algorithms in TorchRL, but that requires WebArena to be written in terms of `TensorDict`. TorchRL supports...
The Docker image for the Reddit environment, postmill-populated-exposed-withimg.tar, is built using PHP 8.1.17, which has a number of known security vulnerabilities: https://www.tenable.com/plugins/nessus/179317 . Could you update the image to use php...
* High-level guide on setting up the environment, resetting, etc.
* APIs of the environment
* Main utility functions for future extensions
* Walk-through on adding a new environment...
Support palm-2 inference
Addresses an eval bug that causes false positives. **Bug:** Before, the URL `ec2-3-131-244-37.us-east-2.compute.amazonaws.com:7780/admin/reports/report_product/viewedasdf` would get a score of 1.0 for the reference URL `ec2-3-131-244-37.us-east-2.compute.amazonaws.com:7780/admin/reports/report_product/viewed`. **Edit**. This commit requires the URLs...
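The kind of false positive described above can be sketched as follows. This is only an illustration, assuming the old check behaved like a substring test; the actual evaluator code is not shown here, and `loose_match`, `exact_match`, and `normalize` are hypothetical names introduced for this example.

```python
from typing import Tuple
from urllib.parse import urlparse

REF = "ec2-3-131-244-37.us-east-2.compute.amazonaws.com:7780/admin/reports/report_product/viewed"
BAD = REF + "asdf"  # the false-positive URL from the report


def normalize(url: str) -> Tuple[str, str]:
    """Split a URL into (host:port, path), ignoring a trailing slash."""
    # urlparse only recognizes the host when a scheme is present.
    if "://" not in url:
        url = "http://" + url
    parsed = urlparse(url)
    return parsed.netloc.lower(), parsed.path.rstrip("/")


def loose_match(ref: str, pred: str) -> bool:
    """Assumed buggy behavior: the reference only has to appear inside the prediction."""
    return ref in pred


def exact_match(ref: str, pred: str) -> bool:
    """Stricter check: host and path must be identical after normalization."""
    return normalize(ref) == normalize(pred)


print(loose_match(REF, BAD))  # the loose check accepts the wrong URL
print(exact_match(REF, BAD))  # the strict check rejects it
```

Comparing parsed components rather than raw strings also lets the check tolerate harmless differences such as a trailing slash.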
I find the environment frustratingly slow: it takes around 10 seconds for a single step transition or for just calling `env.reset()` once. Profiling tells us most of the time is...
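A measurement like the one mentioned above could be reproduced with the standard-library profiler. The sketch below uses a hypothetical `StandInEnv` class so that it runs on its own; to profile the real environment, pass its `reset` (or a lambda that calls `step`) to `profile_call` instead.

```python
import cProfile
import io
import pstats
import time


class StandInEnv:
    """Hypothetical stand-in, not the real WebArena environment."""

    def reset(self):
        time.sleep(0.01)  # placeholder for the slow work
        return {}


def profile_call(fn, top: int = 10) -> str:
    """Run fn under cProfile and return the top entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    fn()
    profiler.disable()
    buf = io.StringIO()
    stats = pstats.Stats(profiler, stream=buf)
    stats.sort_stats("cumulative").print_stats(top)
    return buf.getvalue()


report = profile_call(StandInEnv().reset)
print(report)
```

Sorting by cumulative time puts the call chains that dominate a slow `reset` at the top of the report.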
Implement the RCI method proposed in [this](https://arxiv.org/abs/2303.17491) paper. Demo trajectories on task config 300 (`config_files/300.json`) are given here: [gdoc](https://docs.google.com/document/d/1qlwlUy0mc8cTFHdiiCmHxlEydhSmTHYFY2HTUxfhGbI/edit?usp=sharing). RCI consists of the following steps:
* Explicit RCI
  1. Plan: generate...
Change "canlled" to "canceled"; change "cancelled" to "canceled" to reflect spelling in shopping demo site.