ART

Agent Reinforcement Trainer: train multi-step agents for real-world tasks using GRPO. Give your agents on-the-job training. Reinforcement learning for Qwen2.5, Qwen3, Llama, and more!
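For readers unfamiliar with GRPO, its core idea is to score each rollout relative to the other rollouts sampled for the same task, rather than against a learned value function. The sketch below is illustrative only and is not taken from ART's code: the function name `group_relative_advantages`, the `eps` constant, and the use of the population standard deviation are all assumptions.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of rollouts for the same task.

    Hypothetical helper: each advantage is the reward's distance from the
    group mean, divided by the group's (population) standard deviation.
    `eps` avoids division by zero when every rollout gets the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four rollouts of the same task with made-up rewards.
rewards = [1.0, 0.0, 0.5, 0.5]
advantages = group_relative_advantages(rewards)
```

Rollouts that beat the group average get a positive advantage and are reinforced; those below it get a negative one. A group with identical rewards yields all-zero advantages and so contributes no learning signal.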

Results 78 ART issues

## Paper Reference

[GEPA: Reflective Prompt Evolution Can Outperform Reinforcement Learning](https://arxiv.org/abs/2507.19457)

## Summary

This paper presents GEPA (Genetic-Pareto), which demonstrates that reflective prompt evolution can outperform traditional...

The project looks interesting, and I did see the docs section, but I would be much more interested in a practical video that showcases the project's power :-)) So if possible do...

## Problem

When running test scripts with `enforce_eager=True` specified, the logs still show `enforce_eager=False` and CUDA graphs are still being captured. This makes startup slower and leads to a slower feedback...

Unsloth does not yet support the vLLM V1 engine or multi-device training. A realistic solution is to decouple vLLM for inference and the Unsloth model for training so that we...

I've already spoken a little with @bradhilton about this one! Posting it here in case others experience it too, and to keep track. It happens only on multi-GPU setups. Using Qwen/Qwen3-0.6B on 2x...

Arctic Inference's Suffix Decoding (AISD) is a speculative-decoding variant that caches repeating suffixes and bulk-verifies them, speeding up raw decoding by 2×–6× and delivering roughly 2×–4× end-to-end speed-ups in vLLM-based...

Was happy to read about ART and excited to use it, but Gemma 3 is unfortunately the only model I want to train, because of its superior multilingual capabilities. So...

I often have to restart a run, either to fix something in my reward function, in response to an OOM or crash that broke training, etc. When I do, by...