[BUG] Bitnet Example Bug

Open sneilan opened this issue 1 year ago • 5 comments

Describe the bug

When running the BitLinear layer example from https://github.com/kyegomez/BitNet as of commit 171f4e5 (committed Sun Mar 24 19:48:59 2024 -0700), quoted here for reference:

import torch

from bitnet import BitLinear

# Input
x = torch.randn(10, 512)

# BitLinear layer
layer = BitLinear(512, 400)

# Output
y = layer(x)

print(y)

I get the following error

In [1]: import torch
   ...:
   ...: from bitnet import BitLinear
   ...:
   ...: # Input
   ...: x = torch.randn(10, 512)
   ...:
   ...: # BitLinear layer
   ...: layer = BitLinear(512, 400)
   ...:
   ...: # Output
   ...: y = layer(x)
   ...:
   ...: print(y)
2024-03-29 20:06:13.544245: I tensorflow/core/util/port.cc:113] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2024-03-29 20:06:13.564836: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI AVX512_BF16 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-03-29 20:06:13.939526: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
2024-03-29 20:06:14,366 - numexpr.utils - INFO - Note: NumExpr detected 32 cores but "NUMEXPR_MAX_THREADS" not set, so enforcing safe limit of 8.
2024-03-29 20:06:14,366 - numexpr.utils - INFO - NumExpr defaulting to 8 threads.
/home/sneilan/.gp/scratch/.venv/lib/python3.10/site-packages/transformers/utils/generic.py:441: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  _torch_pytree._register_pytree_node(
/home/sneilan/.gp/scratch/.venv/lib/python3.10/site-packages/transformers/utils/generic.py:309: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  _torch_pytree._register_pytree_node(
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[1], line 12
      9 layer = BitLinear(512, 400)
     11 # Output
---> 12 y = layer(x)
     14 print(y)

File ~/.gp/scratch/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py:1511, in Module._wrapped_call_impl(self, *args, **kwargs)
   1509     return self._compiled_call_impl(*args, **kwargs)  # type: ignore[misc]
   1510 else:
-> 1511     return self._call_impl(*args, **kwargs)

File ~/.gp/scratch/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py:1520, in Module._call_impl(self, *args, **kwargs)
   1515 # If we don't have any hooks, we want to skip the rest of the logic in
   1516 # this function, and just call forward.
   1517 if not (self._backward_hooks or self._backward_pre_hooks or self._forward_hooks or self._forward_pre_hooks
   1518         or _global_backward_pre_hooks or _global_backward_hooks
   1519         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1520     return forward_call(*args, **kwargs)
   1522 try:
   1523     result = None

File ~/.gp/scratch/BitNet/bitnet/bitlinear.py:53, in BitLinear.forward(self, x)
     42 def forward(self, x: Tensor) -> Tensor:
     43     """
     44     Forward pass of the BitLinear layer.
     45
   (...)
     51
     52     """
---> 53     b, s, d = x.shape
     54     w = self.weight
     55     x_norm = RMSNorm(d)(x)

ValueError: not enough values to unpack (expected 3, got 2)
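
The traceback points at the three-way unpack `b, s, d = x.shape` in `BitLinear.forward`: the example passes a 2-D tensor of shape (10, 512), so there are only two values to unpack. Below is a minimal sketch of that failure, using plain tuples in place of `torch.Size` (which is a tuple subclass) so it runs without PyTorch. `unpack_strict` and `unpack_flexible` are hypothetical helper names, not part of bitnet, and the flexible variant is only one possible approach, not necessarily the fix the maintainer applied.

```python
# Plain tuples stand in for torch.Size (a tuple subclass), so no PyTorch is needed.

def unpack_strict(shape):
    """Mirrors the failing line: requires exactly three dimensions."""
    b, s, d = shape
    return d


def unpack_flexible(shape):
    """Hypothetical alternative: accept any rank >= 1 and read only the last dimension."""
    *leading, d = shape
    return d


shape_2d = (10, 512)     # same shape as torch.randn(10, 512).shape
shape_3d = (1, 10, 512)  # the same input with an explicit leading batch dimension

try:
    unpack_strict(shape_2d)
    strict_error = None
except ValueError as exc:
    # Same ValueError as in the traceback above
    strict_error = str(exc)

print(strict_error)
print(unpack_strict(shape_3d))    # three dimensions unpack cleanly
print(unpack_flexible(shape_2d))  # last dimension is read regardless of rank
```

Whether a 3-D input such as `x.unsqueeze(0)` runs end to end through `BitLinear` at commit 171f4e5 is not confirmed here; the sketch only illustrates the unpacking step.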

To Reproduce

mkdir scratch
cd scratch
python3 -m venv .venv
source .venv/bin/activate
pip install bitnet
pip uninstall bitnet # to be able to clone repo but leave dependencies
git clone https://github.com/kyegomez/BitNet
cd BitNet
git checkout 171f4e5
ipython
(paste in following code)
import torch
from bitnet import BitLinear
x = torch.randn(10, 512)
layer = BitLinear(512, 400)
y = layer(x)

Expected behavior

I expect y to be printed.

Screenshots

n/a

Additional context

Running Python 3.10.12

Cuda Version

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Mon_Nov_30_19:08:53_PST_2020
Cuda compilation tools, release 11.2, V11.2.67
Build cuda_11.2.r11.2/compiler.29373293_0

sneilan Mar 30 '24 03:03

@sneilan fixing it now, please excuse me. Try upgrading your bitnet:

pip3 install -U bitnet

kyegomez Mar 30 '24 03:03

Hi, facing the same error even after upgrade. Thank you!

poojadesur Apr 01 '24 02:04

@poojadesur it's been fixed. It was happening because of RMSNorm, so I replaced it with LayerNorm, let me know if it's good.

kyegomez Apr 01 '24 03:04

> @poojadesur it's been fixed. It was happening because of RMSNorm, so I replaced it with LayerNorm, let me know if it's good.

Thanks for getting back to me, but I still get an AttributeError: 'ForwardRef' object has no attribute 'forward_module' when trying to import the package.

poojadesur Apr 01 '24 18:04

@poojadesur can you give me the full error?
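
One way to capture the full error text for a failing import, as requested above, is sketched here with the standard-library `importlib` and `traceback` modules. `import_with_traceback` is a hypothetical helper name, and the sketch assumes the failure is raised as an ordinary exception during import.

```python
import importlib
import traceback


def import_with_traceback(module_name):
    """Try to import a module; return (module, None) or (None, full traceback text)."""
    try:
        return importlib.import_module(module_name), None
    except Exception:
        # format_exc() returns the complete traceback of the exception being handled
        return None, traceback.format_exc()


module, error_text = import_with_traceback("bitnet")
if error_text is not None:
    # Paste this output into the issue so the failing file and line are visible
    print(error_text)
```

Running `python -c "import bitnet"` in the same virtual environment and copying the terminal output would give the same information.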

kyegomez Apr 01 '24 23:04

Stale issue message

github-actions[bot] Jun 01 '24 12:06