ART

Agent Reinforcement Trainer: train multi-step agents for real-world tasks using GRPO. Give your agents on-the-job training. Reinforcement learning for Qwen2.5, Qwen3, Llama, and more!

Results 78 ART issues

### Changes

* Add section on docs in CONTRIBUTING.md

## Proposal

Hello OpenPipe Team! Thanks for creating such a cool project for agent RL. I recently saw the [LangGraph integration announcement](https://github.com/OpenPipe/ART/pulls?q=is%3Apr+langgraph+is%3Aclosed) and tested it, and it works great :)...

I tried the [2048](https://github.com/OpenPipe/ART/tree/main/examples/2048) example, and only one step produced scores in the end (step 27); the others always return `{'scores': []}`

```bash
[RULER] Pretty-printed LLM choice JSON: {'scores': []}
Swallowed exception: Skipping tuning...
```

ERROR:asyncio:Exception in callback _log_task_completion(error_callback=>)() at /home/ubuntu/.venv/lib/python3.12/site-packages/vllm/engine/async_llm_engine.py:46>
Traceback (most recent call last):
  File "/home/ubuntu/.venv/lib/python3.12/site-packages/vllm/engine/async_llm_engine.py", line 56, in _log_task_completion
    return_value = task.result()
                   ^^^^^^^^^^^^^
  File "/home/ubuntu/.local/share/uv/python/cpython-3.12.11-linux-x86_64-gnu/lib/python3.12/asyncio/futures.py", line 202, in result
    raise self._exception.with_traceback(self._exception_tb)
  File...

We implemented our trainer based on https://github.com/OpenPipe/ART/blob/5a60fa017ab876910bbec61add43f81ef4103eb9/dev/art-e/art_e/train.py, using essentially the same code skeleton for local vLLM hosting. However, during the experiment, one weird issue arose when setting `groups_per_step` to 50 or 40:...

Currently it's impossible to launch several `LocalBackend(in_process=False)` sessions simultaneously in different processes. The current `LocalBackend` implementation runs `"pkill -9 model-service"`, which kills every "model-service" process **systemwide**. As a result, if...
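To illustrate the difference between killing by name and killing by handle, here is a minimal sketch of per-process cleanup. It is not ART's implementation: `start_service` and `stop_service` are hypothetical helpers, and the child process is a stand-in `sleep`, not the real model service. The idea is that each session keeps the `subprocess.Popen` handle of the child it spawned and terminates only that child, so sessions in other processes are left alone.

```python
import subprocess
import sys


def start_service() -> subprocess.Popen:
    """Start a stand-in child process (hypothetical placeholder for a model service)."""
    return subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])


def stop_service(proc: subprocess.Popen, timeout: float = 5.0) -> int:
    """Stop only the given child, identified by its handle rather than by name."""
    if proc.poll() is None:  # still running
        proc.terminate()  # polite request first
        try:
            proc.wait(timeout=timeout)
        except subprocess.TimeoutExpired:
            proc.kill()  # force-kill this one child only
            proc.wait()
    return proc.returncode
```

With this pattern, stopping one session's child has no effect on a second child started elsewhere, which a name-based `pkill` cannot guarantee.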

Several tasks benefit from an initial SFT step before RL. It would be good for ART to support that directly so that folks can create programmatic pipelines that can...