ART

Agent Reinforcement Trainer: train multi-step agents for real-world tasks using GRPO. Give your agents on-the-job training. Reinforcement learning for Qwen2.5, Qwen3, Llama, and more!

Results 78 ART issues

Traceback (most recent call last):
  File "/usr/lib/python3.10/multiprocessing/queues.py", line 244, in _feed
    obj = _ForkingPickler.dumps(obj)
  File "/usr/lib/python3.10/multiprocessing/reduction.py", line 51, in dumps
    cls(buf, protocol).dump(obj)
TypeError: cannot pickle 'SSLContext' object
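A minimal sketch of what this traceback usually means, independent of ART: `multiprocessing` queues pickle every object put on them, and an `ssl.SSLContext` cannot be pickled. One common workaround is to send plain configuration across the process boundary and rebuild the context on the receiving side. The `make_context` helper and the `config` keys below are hypothetical names for illustration, not part of ART.

```python
import pickle
import ssl


def can_pickle(obj) -> bool:
    """Return True if obj survives pickle.dumps, False on TypeError."""
    try:
        pickle.dumps(obj)
        return True
    except TypeError:
        return False


def make_context(config: dict) -> ssl.SSLContext:
    """Hypothetical helper: rebuild an SSLContext from picklable settings."""
    ctx = ssl.create_default_context()
    ctx.minimum_version = ssl.TLSVersion[config["minimum_version"]]
    return ctx


# The context itself cannot cross a multiprocessing queue...
context_ok = can_pickle(ssl.create_default_context())

# ...but a plain dict describing it can, and the receiver rebuilds it.
config = {"minimum_version": "TLSv1_2"}
config_ok = can_pickle(config)
rebuilt = make_context(pickle.loads(pickle.dumps(config)))
```

If an object that holds an HTTP client or similar is being sent to a worker, the same idea applies: keep the context out of the pickled state and create it lazily in the process that uses it.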

It is currently possible to exclude large objects like trajectories when pulling models from S3 through the `LocalBackend`. We should do the same for the `SkyPilotBackend`.

A RuntimeError occurs when running 2048.ipynb at this link: https://colab.research.google.com/github/openpipe/art/blob/main/examples/2048/2048.ipynb ``` loading model from .art/2048-multi-turn/models/agent-002/0010 ==((====))== Unsloth 2025.5.1: Fast Qwen2 patching. Transformers: 4.51.3. vLLM: 0.8.5.post1. \\ /| Tesla T4. Num...

One question about the ART framework: are there plans to support asynchronous generation/rollout and training, like https://github.com/inclusionAI/AReaL?tab=readme-ov-file (paper: https://arxiv.org/pdf/2505.24298)? Essentially, it is a non-blocking rollout mechanism so that the ready-to-use rollout...
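To make the idea of non-blocking rollouts concrete, here is a toy sketch using only `asyncio` from the standard library. It is not how ART or AReaL implement it: `rollout_worker`, `trainer`, and the dictionary fields are hypothetical stand-ins, and the "training" step only groups rollouts into batches as soon as enough are ready.

```python
import asyncio


async def rollout_worker(queue: asyncio.Queue, n_rollouts: int) -> None:
    """Produce rollouts without waiting for the trainer to finish a step."""
    for i in range(n_rollouts):
        await asyncio.sleep(0)  # stand-in for generation latency
        await queue.put({"rollout_id": i, "reward": float(i)})
    await queue.put(None)  # sentinel: no more rollouts


async def trainer(queue: asyncio.Queue, batch_size: int) -> list:
    """Consume rollouts as they arrive and group them into training batches."""
    batches, batch = [], []
    while True:
        item = await queue.get()
        if item is None:
            break
        batch.append(item)
        if len(batch) == batch_size:
            batches.append(batch)  # a real trainer would take a gradient step here
            batch = []
    if batch:
        batches.append(batch)  # final partial batch
    return batches


async def main(n_rollouts: int = 5, batch_size: int = 2) -> list:
    queue = asyncio.Queue(maxsize=4)  # bounded so generation cannot run far ahead
    producer = asyncio.create_task(rollout_worker(queue, n_rollouts))
    batches = await trainer(queue, batch_size)
    await producer
    return batches


batches = asyncio.run(main())
```

The bounded queue is the design choice worth noting: it caps how stale a rollout can be relative to the policy that will train on it.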

I'm currently working with the ART library on Kaggle and trying to utilize both of the available T4 GPUs. Specifically, I’m experimenting with the Tic-Tac-Toe example and have attempted to...

I'm examining the training implementation in src/art/unsloth/service.py and have a question about the gradient computation approach. Currently, the code processes samples individually: `for offset in range(0, packed_tensors["tokens"].shape[0]): # Process single...`
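The issue text is truncated, so the exact question is unknown. As general background, a per-sample loop of this shape is often a form of gradient accumulation: for a mean loss, summing each sample's gradient scaled by 1/N gives the same result as one full-batch gradient, while only one sample is held in memory at a time. The sketch below checks this for a toy one-parameter model in pure Python; it is an illustration and does not reproduce the code in `service.py`.

```python
import math


def sample_grad(w: float, x: float, target: float) -> float:
    """d/dw of the squared error (w*x - target)**2 for a single sample."""
    return 2.0 * (w * x - target) * x


def batch_grad(w: float, xs: list, targets: list) -> float:
    """Gradient of the mean squared error over the whole batch at once."""
    n = len(xs)
    return sum(sample_grad(w, x, t) for x, t in zip(xs, targets)) / n


def accumulated_grad(w: float, xs: list, targets: list) -> float:
    """Same gradient, built one sample at a time with 1/N scaling."""
    n = len(xs)
    grad = 0.0
    for offset in range(n):  # mirrors the per-sample loop shape
        grad += sample_grad(w, xs[offset], targets[offset]) / n
    return grad


xs = [1.0, 2.0, 3.0, 4.0]
targets = [2.0, 4.1, 5.9, 8.2]
w = 0.5

full = batch_grad(w, xs, targets)
accum = accumulated_grad(w, xs, targets)
same = math.isclose(full, accum, rel_tol=1e-12)
```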

Currently there is a function `TrainableModel.delete_checkpoints(best_checkpoint_metric)` which removes all checkpoints except the best and the latest. Unfortunately, there is no straightforward way to load the weights according to the best...
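One way to picture the missing piece is a helper that, given logged metrics per training step, returns the step of the best checkpoint. The sketch below is an assumption-laden illustration: `best_step`, the `history` layout, and the metric name `val/reward` are hypothetical and are not ART's API or storage format.

```python
def best_step(history: dict, metric: str, higher_is_better: bool = True) -> int:
    """Return the step whose logged value for `metric` is best."""
    candidates = {
        step: metrics[metric]
        for step, metrics in history.items()
        if metric in metrics
    }
    if not candidates:
        raise ValueError(f"no step logged the metric {metric!r}")
    pick = max if higher_is_better else min
    return pick(candidates, key=candidates.get)


# Hypothetical per-step metrics, keyed by checkpoint step.
history = {
    1: {"val/reward": 0.2},
    2: {"val/reward": 0.7},
    3: {"val/reward": 0.5},
}

best = best_step(history, "val/reward")
latest = max(history)
kept = {best, latest}  # the two checkpoints the issue says survive deletion
```

With the best step known, loading its weights reduces to resolving that step to a checkpoint path, which is the part the issue asks the library to expose.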

Version: openpipe-art 0.4.4. Notice that in the base class `art.backend.Backend` the function `close()` is async. On the other hand, for the `art.local.backend.LocalBackend` backend, the function `close()` is **not async**. As...
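Until the two signatures agree, calling code can guard against the mismatch by awaiting the result of `close()` only when it is awaitable. This is a generic sketch: `SyncBackend`, `AsyncBackend`, and `close_backend` are hypothetical stand-ins written for this example, not ART classes.

```python
import asyncio
import inspect


class SyncBackend:
    """Stand-in for a backend whose close() is a plain function."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True


class AsyncBackend:
    """Stand-in for a backend whose close() is a coroutine function."""

    def __init__(self) -> None:
        self.closed = False

    async def close(self) -> None:
        self.closed = True


async def close_backend(backend) -> None:
    """Close a backend whether its close() is sync or async."""
    result = backend.close()
    if inspect.isawaitable(result):
        await result


async def main(backends) -> None:
    for backend in backends:
        await close_backend(backend)


sync_backend, async_backend = SyncBackend(), AsyncBackend()
asyncio.run(main([sync_backend, async_backend]))
```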

Hi, do you happen to have any examples/notebooks of using ART to train a qwen model that can be dropped into langgraph's create_react_agent?

To a first approximation, it's not obvious how to think about choosing a batch size and learning rate. Small batches reduce inference overhead on the GPUs and generally reduce iteration...

documentation