One
It's a work in progress now 😃
Thanks! Will fix in the next revision.
Thanks for your recommendation! We welcome PRs and are happy to collaborate! I see that many Reasoning Gym tasks are trained with RL, and HRM indeed supports RL. It may...
Different GPU hardware requires different versions of CUDA, PyTorch, etc. Following most repositories, such as FlashAttention and Transformers, we don't pin versions unless strictly necessary, to minimize incompatibilities during installation.
Since batched inference with ACT is complex and requires dynamically scheduling multiple sequences, in this repository we provide the simplest version that runs to the maximum number of steps, as...
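To illustrate the "run to the maximum number of steps" idea, here is a minimal sketch in plain Python. It is not the repository's code: `run_fixed_steps`, `step_fn`, and the float "state" are hypothetical placeholders, and the only point is that every sequence in the batch takes the same number of steps, so no per-sequence scheduling is needed.

```python
from typing import Callable, List

def run_fixed_steps(
    states: List[float],
    step_fn: Callable[[float], float],
    max_steps: int,
) -> List[float]:
    """Apply step_fn to every sequence exactly max_steps times.

    Hypothetical stand-in for fixed-step batched inference: no sequence
    halts early, so the batch never has to be re-scheduled.
    """
    for _ in range(max_steps):
        # All sequences advance together on every step.
        states = [step_fn(s) for s in states]
    return states

# Toy usage: each "step" just adds 1 to the state.
final_states = run_fixed_steps([0.0, 1.0], lambda s: s + 1.0, max_steps=3)
```

A dynamically scheduled version would instead drop or replace sequences once they halt, which is where the extra complexity comes from.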
Thanks for the PR! 99.3% seems amazing; it's much higher than reported in our paper. Is that result from `evaluate.py`? I can't find the result of the evaluation cell in the...
C-RLFT is weighted C-SFT. You can use the "weight" field in the training data format to set up weights for different conditions.
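As a rough illustration of what "weighted SFT" means, here is a minimal sketch in plain Python. It is not the training code: `weighted_sft_loss` is a hypothetical helper, the per-token log-probabilities are assumed to be given, and the normalization (dividing by the weighted token count) is an assumption made for this example.

```python
import math
from typing import List, Sequence

def weighted_sft_loss(
    token_logprobs: Sequence[Sequence[float]],
    weights: Sequence[float],
) -> float:
    """Weighted average negative log-likelihood over examples.

    token_logprobs[i] holds the log-probabilities of the target tokens of
    example i; weights[i] is that example's weight (e.g. set per condition).
    """
    total = 0.0
    norm = 0.0
    for logps, w in zip(token_logprobs, weights):
        total += w * -sum(logps)   # weighted NLL of this example
        norm += w * len(logps)     # weighted token count (assumed normalizer)
    return total / norm if norm > 0 else 0.0

# Toy usage: the second example has weight 0, so it does not affect the loss.
loss = weighted_sft_loss(
    [[math.log(0.5)], [math.log(0.1), math.log(0.1)]],
    [1.0, 0.0],
)
```

With all weights equal to 1 this reduces to the ordinary token-averaged SFT loss; different weights per condition change how much each example contributes.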
Hi @tridao, have you had a chance to take another look at this PR? This is a much-needed feature for FA3 when used with `torch.compile`.
We've run the handcrafted Sudoku set from Sudoku-Bench (https://github.com/SakanaAI/Sudoku-Bench). The released 1000-example checkpoint achieved 92%. The following is the complete solution process for Sudoku-Bench. [sudoku-nikoli.pdf](https://github.com/user-attachments/files/21619760/sudoku-nikoli.pdf)