[BUG] Bitnet Example Bug
Describe the bug
When running the "Example of the BitLinear layer" snippet from https://github.com/kyegomez/BitNet as of commit 171f4e5 (committed Sun Mar 24 19:48:59 2024 -0700), quoted here for reference:
import torch
from bitnet import BitLinear
# Input
x = torch.randn(10, 512)
# BitLinear layer
layer = BitLinear(512, 400)
# Output
y = layer(x)
print(y)
I get the following error
In [1]: import torch
...:
...: from bitnet import BitLinear
...:
...: # Input
...: x = torch.randn(10, 512)
...:
...: # BitLinear layer
...: layer = BitLinear(512, 400)
...:
...: # Output
...: y = layer(x)
...:
...: print(y)
2024-03-29 20:06:13.544245: I tensorflow/core/util/port.cc:113] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-03-29 20:06:13.564836: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI AVX512_BF16 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-03-29 20:06:13.939526: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2024-03-29 20:06:14,366 - numexpr.utils - INFO - Note: NumExpr detected 32 cores but "NUMEXPR_MAX_THREADS" not set, so enforcing safe limit of 8.
2024-03-29 20:06:14,366 - numexpr.utils - INFO - NumExpr defaulting to 8 threads.
/home/sneilan/.gp/scratch/.venv/lib/python3.10/site-packages/transformers/utils/generic.py:441: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
_torch_pytree._register_pytree_node(
/home/sneilan/.gp/scratch/.venv/lib/python3.10/site-packages/transformers/utils/generic.py:309: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
_torch_pytree._register_pytree_node(
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[1], line 12
9 layer = BitLinear(512, 400)
11 # Output
---> 12 y = layer(x)
14 print(y)
File ~/.gp/scratch/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py:1511, in Module._wrapped_call_impl(self, *args, **kwargs)
1509 return self._compiled_call_impl(*args, **kwargs) # type: ignore[misc]
1510 else:
-> 1511 return self._call_impl(*args, **kwargs)
File ~/.gp/scratch/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py:1520, in Module._call_impl(self, *args, **kwargs)
1515 # If we don't have any hooks, we want to skip the rest of the logic in
1516 # this function, and just call forward.
1517 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
1518 or _global_backward_pre_hooks or _global_backward_hooks
1519 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1520 return forward_call(*args, **kwargs)
1522 try:
1523 result = None
File ~/.gp/scratch/BitNet/bitnet/bitlinear.py:53, in BitLinear.forward(self, x)
42 def forward(self, x: Tensor) -> Tensor:
43 """
44 Forward pass of the BitLinear layer.
45
(...)
51
52 """
---> 53 b, s, d = x.shape
54 w = self.weight
55 x_norm = RMSNorm(d)(x)
ValueError: not enough values to unpack (expected 3, got 2)
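For anyone reading the traceback: the failing line unpacks `x.shape` into three names (`b, s, d`), but the example input `torch.randn(10, 512)` has only two dimensions. The sketch below reproduces that unpacking error using plain tuples in place of tensor shapes, so it runs without torch or bitnet. `unpack_3d` and `with_batch_dim` are hypothetical helpers written for illustration and are not part of the bitnet package.

```python
def unpack_3d(shape):
    # Mirrors the failing line in the traceback: b, s, d = x.shape
    b, s, d = shape
    return b, s, d


def with_batch_dim(shape):
    """Prepend a batch dimension of 1 to a 2-D shape (hypothetical helper)."""
    return (1, *shape) if len(shape) == 2 else tuple(shape)


shape_2d = (10, 512)  # shape of torch.randn(10, 512) from the example

try:
    unpack_3d(shape_2d)
except ValueError as err:
    print(err)  # not enough values to unpack (expected 3, got 2)

# With a leading batch dimension the unpacking succeeds.
print(unpack_3d(with_batch_dim(shape_2d)))  # (1, 10, 512)
```

With real tensors, the equivalent of `with_batch_dim` would be passing `x.unsqueeze(0)` or `torch.randn(1, 10, 512)` to the layer. That should get past line 53, although I have not verified that the rest of `forward` at this commit then runs cleanly.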
To Reproduce
mkdir scratch
cd scratch
python3 -m venv .venv
source .venv/bin/activate
pip install bitnet
pip uninstall bitnet # to be able to clone repo but leave dependencies
git clone https://github.com/kyegomez/BitNet
cd BitNet
git checkout 171f4e5
ipython
(paste in the following code)
import torch
from bitnet import BitLinear
x = torch.randn(10, 512)
layer = BitLinear(512, 400)
y = layer(x)
Expected behavior
I expect y to be printed.
Screenshots
n/a
Additional context
Running Python 3.10.12
CUDA version:
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Mon_Nov_30_19:08:53_PST_2020
Cuda compilation tools, release 11.2, V11.2.67
Build cuda_11.2.r11.2/compiler.29373293_0
@sneilan fixing it now, please excuse me. Try upgrading your bitnet:
pip3 install -U bitnet
Hi, facing the same error even after upgrade. Thank you!
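When an upgrade appears to change nothing, it can help to confirm which copy of the package Python actually resolves. In the reproduction steps above, ipython is started inside the cloned BitNet directory, so `import bitnet` may pick up the local checkout rather than the pip-installed release. A minimal sketch using only the standard library follows; `describe_package` is a hypothetical helper name, and it does not import the package itself, so it still works when the import raises.

```python
import importlib.metadata
import importlib.util


def describe_package(name):
    """Return the installed version and the file Python would load for `name`.

    Either value is None when it cannot be determined.
    """
    try:
        version = importlib.metadata.version(name)
    except importlib.metadata.PackageNotFoundError:
        version = None  # no pip-installed distribution with this name

    spec = importlib.util.find_spec(name)  # locates the module without importing it
    origin = spec.origin if spec is not None else None
    return {"version": version, "origin": origin}


print(describe_package("bitnet"))
```

If `origin` points into a cloned repository instead of `site-packages`, `pip3 install -U bitnet` will not affect the code being run; pulling the latest commit in the clone would be the thing to try instead.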
@poojadesur it's been fixed. It was happening because of RMSNorm, so I replaced it with LayerNorm, let me know if it's good.
Thanks for getting back to me, but I still get an AttributeError: 'ForwardRef' object has no attribute 'forward_module' when trying to import the package.
@poojadesur can you give me the full error?