
Fix: Replace nn.Buffer with register_buffer

Open Tar-ive opened this issue 5 months ago • 1 comment

https://github.com/sapientinc/HRM/issues/28

The core issue was that the original HRM source code uses a PyTorch API that is not available in every PyTorch release, which made it incompatible with the version of PyTorch installed in our environment. The error message AttributeError: module 'torch.nn' has no attribute 'Buffer' tells us that the code references an attribute that the installed torch build does not provide.

The Cause: Version-Dependent PyTorch Usage

In PyTorch, a "buffer" is a tensor that is part of a model's state (like the weights) but is not a parameter updated by the optimizer during training (for example, the running mean of a normalization layer).
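For illustration (this is a generic sketch, not code from the HRM repository), here is how a buffer differs from a parameter in a small module: the buffer appears in the model's state_dict and moves with .to(device), but it is not returned by .parameters(), so the optimizer never updates it.

# Generic example module (hypothetical, not part of HRM)
import torch
import torch.nn as nn

class Demo(nn.Module):
    def __init__(self):
        super().__init__()
        # Parameter: trainable, updated by the optimizer
        self.weight = nn.Parameter(torch.randn(4, 4))
        # Buffer: part of the model state, but not trainable
        self.register_buffer("running_mean", torch.zeros(4))

m = Demo()
print([n for n, _ in m.named_parameters()])  # ['weight']
print([n for n, _ in m.named_buffers()])     # ['running_mean']
print(sorted(m.state_dict().keys()))         # ['running_mean', 'weight']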

The original developer wrote code like this:

self.weights = nn.Buffer(...)

The problem is that nn.Buffer only exists in newer PyTorch releases. On older versions, including the one installed in our environment, torch.nn has no such attribute, which is exactly what the AttributeError says, so code written this way only runs on the specific PyTorch versions that ship nn.Buffer.
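A quick way to see which API the installed PyTorch provides (a small sketch; the exact release that introduced nn.Buffer is not important here):

import torch
import torch.nn as nn

print(torch.__version__)
print(hasattr(nn, "Buffer"))                  # False on the install that raised the error
print(hasattr(nn.Module, "register_buffer"))  # True on any supported PyTorch version

register_buffer, by contrast, has been part of nn.Module for a long time, which is why switching to it removes the version dependency.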

The Change: Using register_buffer()

The portable way to create and register a buffer in a PyTorch model, which works across PyTorch versions, is the self.register_buffer() method.

We fixed the code by changing lines like the one above to the following pattern:

# 1. Create the tensor you want to be a buffer
weights_tensor = trunc_normal_init_(...)

# 2. Register it as a buffer using the correct method
self.register_buffer('weights', weights_tensor)
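One detail worth noting (a sketch with a hypothetical module, not HRM's actual class): after self.register_buffer('weights', weights_tensor), the tensor is still accessible as self.weights, so downstream code that reads that attribute keeps working unchanged.

import torch
import torch.nn as nn

class TinyEmbedding(nn.Module):  # hypothetical example, not from the repository
    def __init__(self, num, dim):
        super().__init__()
        self.register_buffer("weights", torch.zeros(num, dim))

    def forward(self, idx):
        # Same attribute access as before the fix
        return self.weights[idx]

m = TinyEmbedding(10, 8)
print(m(torch.tensor([0, 3])).shape)  # torch.Size([2, 8])
print("weights" in m.state_dict())    # True: saved and restored with the model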
We had to apply the same fix in three different files, because the original code uses this pattern throughout the repository:

models/sparse_embedding.py

models/layers.py

models/hrm/hrm_act_v1.py

By making these changes, the code no longer depends on nn.Buffer, so it runs on a wider range of PyTorch versions and training proceeds without errors.
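An alternative approach, if one wanted to keep nn.Buffer on PyTorch versions that provide it, would be a small compatibility helper. The following is only a sketch (the helper name set_buffer is hypothetical and not part of HRM or PyTorch):

import torch
import torch.nn as nn

def set_buffer(module, name, tensor, persistent=True):
    if hasattr(nn, "Buffer"):
        # Newer PyTorch: assigning an nn.Buffer to a module attribute registers it
        setattr(module, name, nn.Buffer(tensor, persistent=persistent))
    else:
        # Older PyTorch: register_buffer is the portable equivalent
        module.register_buffer(name, tensor, persistent=persistent)

For this issue, though, simply using register_buffer everywhere is the smaller and more portable change.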

Tar-ive avatar Aug 05 '25 05:08 Tar-ive

I've applied it here: https://github.com/deepstupid/hrem/commit/45a3b2c732317f4136711874471372f2dd53536a

automenta avatar Aug 25 '25 14:08 automenta