
Cannot load it with T5 - RTX 5000, Cuda 11.3

Open Oxi84 opened this issue 1 year ago • 9 comments

When I try:

from transformers import T5ForConditionalGeneration, T5Tokenizer, T5TokenizerFast

# Load the checkpoint in 8-bit and let device_map place the weights automatically
model2 = T5ForConditionalGeneration.from_pretrained("3b_m1", device_map="auto", load_in_8bit=True)

I get:

TypeError: __init__() got an unexpected keyword argument 'load_in_8bit'

EDIT: This error stopped appearing after I restarted the kernel, but now I get this error:

#######################

/opt/conda/lib/python3.7/site-packages/bitsandbytes/functional.py in get_colrow_absmax(A, row_stats, col_stats, nnz_block_ptr, threshold)
   1494     prev_device = pre_call(A.device)
   1495     is_on_gpu([A, row_stats, col_stats, nnz_block_ptr])
-> 1496     lib.cget_col_row_stats(ptrA, ptrRowStats, ptrColStats, ptrNnzrows, ct.c_float(threshold), rows, cols)
   1497     post_call(prev_device)
   1498

/opt/conda/lib/python3.7/ctypes/__init__.py in __getattr__(self, name)
    375         if name.startswith('__') and name.endswith('__'):
    376             raise AttributeError(name)
--> 377         func = self.__getitem__(name)
    378         setattr(self, name, func)
    379         return func

/opt/conda/lib/python3.7/ctypes/__init__.py in __getitem__(self, name_or_ordinal)
    380
    381     def __getitem__(self, name_or_ordinal):
--> 382         func = self._FuncPtr((name_or_ordinal, self))
    383         if not isinstance(name_or_ordinal, int):
    384             func.__name__ = name_or_ordinal

AttributeError: /opt/conda/lib/python3.7/site-packages/bitsandbytes/libbitsandbytes_cpu.so: undefined symbol: cget_col_row_stats

import transformers
print(transformers.__version__)
4.22.0.dev0

GPU: RTX 5000

!conda list | grep cudatoolkit
cudatoolkit 11.3.1

Oxi84, Aug 18 '22 22:08

I'm having the same problem (can't find cget_col_row_stats) when using an A6000.

fozziethebeat, Aug 19 '22 05:08

Does it work well on other cards?

Oxi84, Aug 19 '22 19:08

I have RTX 3060 and get the same error.

z80maniac, Aug 20 '22 12:08

I tried upgrading CUDA to 11.6 (and PyTorch to match) and I still get the same error. Having looked at the code, I'm guessing some #define didn't get included in the shared library. At some point I'll try building the package myself and installing it.

fozziethebeat, Aug 22 '22 23:08

I have got the same issue. Any solutions?

parastooAflaki, Aug 25 '22 16:08

Can you please provide the output of python -m bitsandbytes? It seems that your CUDA driver is not detected, so no GPU is visible to the bitsandbytes CUDA setup. This causes the CPU library to be loaded, which does not contain the functions that you are trying to use.

In a new version of bitsandbytes the error message is a bit more meaningful, but it would still be useful to figure out what happened in your case. I suspect it is the same error as in #17.
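A minimal sketch of how one might confirm that diagnosis, assuming only the standard-library ctypes module: it loads a shared library and checks whether a given symbol is exported. The helper name has_symbol is hypothetical, and the path is simply the one from the traceback earlier in this thread; adjust it to your own install.

```python
import ctypes


def has_symbol(lib: ctypes.CDLL, name: str) -> bool:
    """Return True if the loaded shared library exports `name`."""
    try:
        # ctypes resolves symbols lazily on attribute access and raises
        # AttributeError when the symbol is undefined.
        getattr(lib, name)
        return True
    except AttributeError:
        return False


if __name__ == "__main__":
    # Assumed path, copied from the traceback in the original report.
    path = "/opt/conda/lib/python3.7/site-packages/bitsandbytes/libbitsandbytes_cpu.so"
    try:
        lib = ctypes.CDLL(path)
    except OSError as exc:
        print(f"could not load {path}: {exc}")
    else:
        print("cget_col_row_stats exported:", has_symbol(lib, "cget_col_row_stats"))
```

If this reports that the symbol is missing from the CPU library while a CUDA build of the library sits next to it, the problem is likely in library selection rather than in the build itself.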

TimDettmers, Sep 05 '22 22:09

After some fixes, my situation is a bit confusing. I'm running Jupyter in a Docker container. When running python -m bitsandbytes in a Jupyter shell, I get the following:

CUDA SETUP: CUDA runtime path found: /opt/conda/lib/libcudart.so
CUDA SETUP: Highest compute capability among GPUs detected: 8.6
CUDA SETUP: Detected CUDA version 117
CUDA SETUP: Loading binary /opt/conda/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cuda117.so...
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++ DEBUG INFORMATION +++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

++++++++++ POTENTIALLY LIBRARY-PATH-LIKE ENV VARS ++++++++++
'CONDA_EXE': '/opt/conda/bin/conda'
'VIRTUAL_PATH': '/jupyter'
'SUDO_COMMAND': '/opt/conda/bin/jupyter lab --NotebookApp.iopub_data_rate_limit=1.0e10 --NotebookApp.base_url=/jupyter --port=80'
'JULIA_PKGDIR': '/opt/julia'
'GSETTINGS_SCHEMA_DIR': '/opt/conda/share/glib-2.0/schemas'
'CONDA_PREFIX': '/opt/conda'
'JUPYTER_SERVER_URL': 'http://51b96df1ea41:80/jupyter/'
'RSTUDIO_WHICH_R': '/opt/conda/bin/R'
'XDG_CACHE_HOME': '/home/jovyan/.cache'
'JUPYTER_SERVER_ROOT': '/home/jovyan'
'PYTHONPATH': '/usr/local/spark/python/lib/py4j-0.10.9.3-src.zip:/usr/local/spark/python:'
'CONDA_DIR': '/opt/conda'
'SPARK_HOME': '/usr/local/spark'
'JULIA_DEPOT_PATH': '/opt/julia'
'CONDA_PYTHON_EXE': '/opt/conda/bin/python'
'SPARK_CONF_DIR': '/usr/local/spark/conf'
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

WARNING: Please be sure to sanitize sensible info from any such env vars!

++++++++++++++++++++++++++ OTHER +++++++++++++++++++++++++++
COMPILED_WITH_CUDA = True
COMPUTE_CAPABILITIES_PER_GPU = ['8.6']
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++ DEBUG INFO END ++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Running a quick check that:
    + library is importable
    + CUDA function is callable

SUCCESS!
Installation was successful!

But when I run the same command in a notebook, I get the following:

WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('/jupyter')}
WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('/opt/conda/bin/jupyter lab --NotebookApp.iopub_data_rate_limit=1.0e10 --NotebookApp.base_url=/jupyter --port=80')}
WARNING: The following directories listed in your path were found to be non-existent: {PosixPath('module'), PosixPath('//matplotlib_inline.backend_inline')}
CUDA_SETUP: WARNING! libcudart.so not found in any environmental path. Searching /usr/local/cuda/lib64...
WARNING: No libcudart.so found! Install CUDA or the cudatoolkit package (anaconda)!
CUDA SETUP: Loading binary /opt/conda/lib/python3.10/site-packages/bitsandbytes/libbitsandbytes_cpu.so...
/opt/conda/lib/python3.10/site-packages/bitsandbytes/cextension.py:48: UserWarning: The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers and GPU quantization are unavailable.
  warn(
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++ DEBUG INFORMATION +++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

++++++++++ POTENTIALLY LIBRARY-PATH-LIKE ENV VARS ++++++++++
'VIRTUAL_PATH': '/jupyter'
'SUDO_COMMAND': '/opt/conda/bin/jupyter lab --NotebookApp.iopub_data_rate_limit=1.0e10 --NotebookApp.base_url=/jupyter --port=80'
'JULIA_PKGDIR': '/opt/julia'
'XDG_CACHE_HOME': '/home/jovyan/.cache'
'PYTHONPATH': '/usr/local/spark/python/lib/py4j-0.10.9.3-src.zip:/usr/local/spark/python:'
'CONDA_DIR': '/opt/conda'
'SPARK_HOME': '/usr/local/spark'
'JULIA_DEPOT_PATH': '/opt/julia'
'MPLBACKEND': 'module://matplotlib_inline.backend_inline'
'SPARK_CONF_DIR': '/usr/local/spark/conf'
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

WARNING: Please be sure to sanitize sensible info from any such env vars!

++++++++++++++++++++++++++ OTHER +++++++++++++++++++++++++++
COMPILED_WITH_CUDA = False
COMPUTE_CAPABILITIES_PER_GPU = ['8.6']
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++ DEBUG INFO END ++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Running a quick check that:
    + library is importable
    + CUDA function is callable

name 'str2optimizer32bit' is not defined

I'm still trying to debug why the notebook version isn't picking up the same libraries as the shell version.
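One visible difference between the two logs is that CONDA_PREFIX appears in the shell environment but not in the notebook environment, which may be why /opt/conda/lib is never searched there. The sketch below is one way to compare the two: run it in both places and diff the output. It is not the search logic bitsandbytes itself uses; the helper names and the choice of directories (CONDA_PREFIX/lib, LD_LIBRARY_PATH entries, /usr/local/cuda/lib64) are assumptions made for illustration.

```python
import os
from pathlib import Path


def candidate_dirs(env):
    """Directories worth checking, derived from an environment mapping."""
    dirs = []
    prefix = env.get("CONDA_PREFIX")
    if prefix:
        dirs.append(Path(prefix) / "lib")
    for entry in env.get("LD_LIBRARY_PATH", "").split(os.pathsep):
        if entry:
            dirs.append(Path(entry))
    # Fallback location mentioned in the notebook log above.
    dirs.append(Path("/usr/local/cuda/lib64"))
    return dirs


def find_libcudart(dirs):
    """Return every libcudart.so* file found in the given directories."""
    hits = []
    for d in dirs:
        if d.is_dir():
            hits.extend(sorted(d.glob("libcudart.so*")))
    return hits


if __name__ == "__main__":
    print("CONDA_PREFIX =", os.environ.get("CONDA_PREFIX"))
    for hit in find_libcudart(candidate_dirs(os.environ)):
        print(hit)
```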

fozziethebeat, Sep 06 '22 08:09

I managed to fix my personal situation. I'm pretty sure it's something weird with how I'm custom-building my GPU-enabled Jupyter image. For reasons unknown to me, I have libcudart libraries installed in two places:

/opt/conda/lib/libcudart.so -> libcudart.so.11.7.60

And

/usr/local/cuda/lib64/libcudart.so.11.0 -> libcudart.so.11.6.55

I made a symlink from

/usr/local/cuda/lib64/libcudart.so -> libcudart.so.11.6.55
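For anyone scripting the same workaround, here is a hedged sketch of creating that unversioned symlink from Python. The helper name ensure_symlink is hypothetical, the file names are the ones from this comment, and writing to /usr/local/cuda/lib64 will usually require root.

```python
from pathlib import Path


def ensure_symlink(link: Path, target_name: str) -> bool:
    """Create `link` pointing at `target_name` in the same directory, if absent.

    Returns True if a link was created, False if `link` already existed.
    """
    if link.exists() or link.is_symlink():
        return False
    target = link.parent / target_name
    if not target.exists():
        raise FileNotFoundError(target)
    # Relative link, equivalent to running `ln -s target_name link` in that directory.
    link.symlink_to(target_name)
    return True
```

Usage would look like ensure_symlink(Path("/usr/local/cuda/lib64/libcudart.so"), "libcudart.so.11.6.55"), with the version suffix replaced by whatever is actually installed.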

Then I had to make a few changes. Looking through the transformers history, I made sure to install the right version with

pip install transformers==4.21.3

Then, I changed the original jupyter notebook from Google Colab to load the pipeline with the following lines:

from transformers import pipeline

pipe = pipeline(model=name, 
                load_in_8bit=True,
                model_kwargs= {"device_map": "auto"}, 
                max_new_tokens=max_new_tokens)

Leaving load_in_8bit inside model_kwargs broke due to some change deep in transformers, so it is passed as a top-level argument instead.

fozziethebeat, Sep 06 '22 08:09

> It seems that your CUDA driver is not detected

Yes, after I installed the CUDA Toolkit the error went away (in my case). Thank you!

z80maniac, Sep 17 '22 13:09

I believe this is fixed in the latest version. It prints instructions on how to debug the situation and alternatively prints out compilation instructions which should fix the issue.

TimDettmers, Oct 27 '22 14:10