Thibault LSDC issues

Results: 10 issues of Thibault LSDC

Dev branch for the ToolUseAgent

Comes in combination with this bgym PR: https://github.com/ServiceNow/BrowserGym/pull/340

## Description by Korbit AI

### What change is being made?

Introduce a new `ToolUseAgent` and supporting benchmark data, and replace existing...

Adding a simple debug agent to manually test actions

## Description by Korbit AI

### What change is being made?

Add a new debug agent in `debug_agent.py` for manual testing of actions within the browser gym environment.

### Why...

Update test_study.py

## Description by Korbit AI

### What change is being made?

Add a pytest skip marker to the `test_launch_parallel_study` test case, with a reason indicating the use case is not...

Make a bash file to install each benchmark

Test needs rework to not use potentially outdated models

https://github.com/ServiceNow/AgentLab/blob/a228d4105047bb27fcc24e61d626e476a586572f/tests/llm/test_tracking.py#L45-L52

Labels: invalid

Prepare all SequentialStudies

If a `SequentialStudies` run breaks in the middle of a study, the studies after it cannot be relaunched, because their files were never created by a `.prepare()` call.

Labels: bug, help wanted
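
The "Prepare all SequentialStudies" issue above can be illustrated with a minimal sketch. It assumes a toy `ToyStudy` class and a `run_sequential` helper, both hypothetical names invented here and not AgentLab's actual API. The idea shown is only the ordering: call `prepare()` on every study before running any of them, so that the files for later studies exist even if an earlier one crashes.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import List, Optional


@dataclass
class ToyStudy:
    """Hypothetical stand-in for a study; not AgentLab's real class."""

    name: str
    fail: bool = False
    study_dir: Optional[Path] = None

    def prepare(self, root: Path) -> None:
        # Create the on-disk files that a later relaunch would rely on.
        self.study_dir = root / self.name
        self.study_dir.mkdir(parents=True, exist_ok=True)
        (self.study_dir / "status.txt").write_text("prepared")

    def run(self) -> None:
        # Simulate a crash in the middle of the sequence.
        if self.fail:
            raise RuntimeError(f"{self.name} crashed")
        (self.study_dir / "status.txt").write_text("done")


def run_sequential(studies: List[ToyStudy], root: Path) -> None:
    # Prepare every study up front, before any of them runs.
    for study in studies:
        study.prepare(root)
    # Run in order; a crash here leaves later studies prepared on disk.
    for study in studies:
        study.run()


if __name__ == "__main__":
    root = Path(tempfile.mkdtemp())
    studies = [ToyStudy("a"), ToyStudy("b", fail=True), ToyStudy("c")]
    try:
        run_sequential(studies, root)
    except RuntimeError as err:
        print(f"sequence stopped: {err}")
    # Study "c" never ran, but its files exist and it could be relaunched.
    print((root / "c" / "status.txt").read_text())
```

Separating the two loops is the whole design choice in this sketch: preparation is cheap and has no dependency on earlier results, so doing it eagerly costs little and keeps every study relaunchable.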

adding resample benchmark objects

adding get_version method to Benchmark

PR Template for benchmark creation

adding get_version method