Add Reasoning-Gym Experiments

Open Miserlou opened this issue 5 months ago • 3 comments

Hello, team! Congratulations on producing such an excellent and novel architecture.

I'm a contributor to a project called Reasoning Gym, which has difficulty-adjustable dataset generators for more than 100 types of reasoning tasks (math, logic, games, etc.).

We're very interested to see if HRM can solve some of the harder tasks that other LLMs struggle with, and would like to run some experiments with our dataset generators and your model.

Is this an 'active' GitHub project - would you accept PRs if I added support for Reasoning-Gym to this repo?

Thanks so much, Rich

Miserlou • Jul 27 '25 14:07

Thanks for your recommendation! We welcome PRs and are happy to collaborate! I see that many Reasoning Gym tasks are trained with RL, and HRM indeed supports RL. It may require some tuning and handling of sparse rewards.

imoneoi • Jul 29 '25 06:07

Please post results once complete. ETA?

reh3376 • Aug 05 '25 00:08

I looked into this just a little bit. Some tasks I think could maybe work with the current HRM implementation:

- n_queens
- pool_matrix
- rearc
- rectangle_count (if modified)
- rotate_matrix
- rotten_oranges
- shortest_path
- sokoban
- spiral_matrix
- survo
- tsumego
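
As a rough illustration of why grid-style tasks like these look like the easiest fit, here is a minimal sketch of turning a rotate_matrix-style example into a fixed-length token sequence. It assumes the model consumes flattened, padded integer grids with input and label sequences of equal length; the helper names, the `PAD` id, and the digit-to-token mapping are hypothetical and are not taken from the HRM or Reasoning Gym code.

```python
import random

PAD = 0  # hypothetical padding token id; digits 0-9 map to ids 1-10


def make_rotate_example(n, rng):
    """Build an n x n digit grid and its 90-degree clockwise rotation."""
    grid = [[rng.randint(0, 9) for _ in range(n)] for _ in range(n)]
    # Reverse the row order, then transpose: a clockwise quarter turn.
    rotated = [list(row) for row in zip(*grid[::-1])]
    return grid, rotated


def encode(grid, max_side):
    """Flatten a grid row by row into max_side * max_side token ids.

    Cells outside the grid are filled with PAD so every example
    has the same sequence length.
    """
    tokens = []
    for r in range(max_side):
        for c in range(max_side):
            inside = r < len(grid) and c < len(grid[r])
            tokens.append(grid[r][c] + 1 if inside else PAD)
    return tokens


rng = random.Random(0)
grid, rotated = make_rotate_example(3, rng)
inputs = encode(grid, max_side=4)     # sequence fed to the model (assumed format)
labels = encode(rotated, max_side=4)  # target sequence of the same length
```

Tasks whose question and answer both fit this shape need no natural-language prompt, whereas something like rubiks_cube would need extra state encoding and prior knowledge that this sketch does not cover.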

The one I'm most interested in right now is rubiks_cube, but that requires some knowledge that the model doesn't have. In fact, most of them require some kind of language understanding, base knowledge and instruction tuning.

I wonder what the roadmap is for the greater HRM project. Is there a plan to do a big general pretrain with a tokenizer and make a base model that we can instruction-tune and interact with the way we do with the LLMs we know today? Or is HRM just for small, single-purpose models that operate on structured dataframe problems?

Miserlou • Aug 05 '25 08:08