Fix: Replace nn.Buffer with register_buffer
https://github.com/sapientinc/HRM/issues/28
The core issue was that the original HRM source code depends on a PyTorch API that only exists in recent releases. The error message AttributeError: module 'torch.nn' has no attribute 'Buffer' told us that the installed PyTorch has no nn.Buffer: that class was only added in PyTorch 2.4, so the code fails on any earlier version.
The Cause: A Version-Dependent PyTorch API In PyTorch, a "buffer" is a tensor that is part of a model's state (like weights) but is not a parameter that gets updated during training (e.g., a running mean in a normalization layer).
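The buffer/parameter distinction can be sketched in a few lines (the RunningMean module name here is illustrative, not from the HRM code):

```python
import torch
import torch.nn as nn

class RunningMean(nn.Module):
    """Illustrative module: tracks a running mean as a buffer."""
    def __init__(self, dim):
        super().__init__()
        # A buffer: saved in state_dict and moved by .to(),
        # but never updated by the optimizer.
        self.register_buffer('mean', torch.zeros(dim))
        # A parameter, for contrast: this one is trainable.
        self.scale = nn.Parameter(torch.ones(dim))

m = RunningMean(3)
assert 'mean' in m.state_dict()                  # buffers are state...
assert 'mean' not in dict(m.named_parameters())  # ...but not parameters
```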
The original developer wrote code like this: self.weights = nn.Buffer(...)
nn.Buffer is a recent addition to PyTorch: it only became a public, directly constructible class in PyTorch 2.4. On any earlier release the attribute does not exist at all, which is exactly the AttributeError above.
The Change: Using a Portable Method The way to create and register a buffer that works across PyTorch versions is the self.register_buffer() method, which nn.Module has provided since the earliest releases.
We fixed the code by changing lines like the one above to the following pattern:
# 1. Create the tensor you want to be a buffer
weights_tensor = trunc_normal_init_(...)
# 2. Register it as a buffer using the correct method
self.register_buffer('weights', weights_tensor)
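Put together, a minimal sketch of the fixed pattern (the SparseEmbedding class name is hypothetical, and torch.nn.init.trunc_normal_ stands in for the repo's trunc_normal_init_ helper):

```python
import torch
import torch.nn as nn

class SparseEmbedding(nn.Module):  # hypothetical module, for illustration
    def __init__(self, num_embeddings, dim):
        super().__init__()
        # 1. Create the tensor you want to be a buffer
        weights = nn.init.trunc_normal_(torch.empty(num_embeddings, dim), std=0.02)
        # 2. Register it: it becomes self.weights, is saved in
        #    state_dict, and moves with .to()/.cuda()
        self.register_buffer('weights', weights)

m = SparseEmbedding(10, 4)
assert m.weights.shape == (10, 4)
assert 'weights' in m.state_dict()
```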
We had to apply this same fix in three different files, because the original code uses the nn.Buffer pattern throughout the repository:
models/sparse_embedding.py
models/layers.py
models/hrm/hrm_act_v1.py
By making these changes, the code now runs on both older and newer PyTorch releases, which allowed the training to proceed without errors.
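A quick way to see why the fix is portable: nn.Buffer only exists as an attribute of torch.nn from PyTorch 2.4 on, while register_buffer has been a method of nn.Module all along.

```python
import torch
import torch.nn as nn

# On PyTorch < 2.4 the first check is False and any nn.Buffer(...) call
# raises AttributeError; register_buffer is available everywhere.
print(f"torch {torch.__version__}: nn.Buffer available = {hasattr(nn, 'Buffer')}")
assert hasattr(nn.Module, 'register_buffer')
```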
I've applied it here: https://github.com/deepstupid/hrem/commit/45a3b2c732317f4136711874471372f2dd53536a