pai4451
> Thanks for the comment! Sounds really good to me 💪 > I was planning to open a PR by the beginning of next week to add the link...
> @pai4451 the code has not been released to PyPI [yet](https://pypi.org/project/transformers/#history) - you probably want to use `pip install git+https://github.com/huggingface/transformers.git` to get the `HEAD` that includes this PR. @cnbeining I...
> Hi @pai4451 ! > Thanks a lot for your message! This error is related to `accelerate`, I have run the colab demo this morning and everything seems to work...
Hi @younesbelkada, thank you again for the `bitsandbytes` integration with `transformers` models. I wonder whether a similar approach could be used with `DeepSpeed` for int8 quantization of the BLOOM...
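For context, a rough back-of-envelope sketch of why int8 matters for a model of BLOOM's size (the 176B parameter count is published; the rest is simple arithmetic, counting weights only and ignoring activations and framework overhead):

```python
# Rough weight-memory estimate for BLOOM-176B.
# Weights only -- activations, KV cache, and runtime overhead are ignored.
PARAMS = 176_000_000_000  # published BLOOM parameter count

bytes_fp16 = PARAMS * 2   # 2 bytes per parameter in fp16/bf16
bytes_int8 = PARAMS * 1   # 1 byte per parameter in int8

print(f"fp16 weights: ~{bytes_fp16 / 1e9:.0f} GB")
print(f"int8 weights: ~{bytes_int8 / 1e9:.0f} GB")
```

Halving the weight footprint is what makes fitting the model on fewer GPUs (or fewer nodes) plausible in the first place.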
> @stas00 Actually I just tested both `bigscience/bloom` and `bigscience/bloom-1b3` without CUDA_LAUNCH_BLOCKING=1 and they both work. This is probably because I pulled newer code from the `bloom-inference` branch of this...
@asaparov Thanks for the details. I can finally run inference on BLOOM with DeepSpeed across multiple nodes now. However, it only works for `batch_size=1`; when I increase the batch size, error...
> I get the same error for batch size > 1: > > ``` > gr062: RuntimeError: CUDA error: an illegal memory access was encountered > gr062: terminate called after...
> I am not sure why I am getting the same error ^^ for batch size = 1. @pai4451 > Any pointers? What are your CUDA and DeepSpeed versions? I...
@asaparov I tried the inference script with batch sizes 1, 2, 4, 8, 16, 32, 64, and 128. Only batch sizes 1 and 32 work, which is a...
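To make results like this easier to compare across setups, a small sweep harness can record which batch sizes fail. A minimal sketch, assuming a `run_inference(batch_size)` callable that wraps whatever your DeepSpeed inference script does per batch (the helper name is hypothetical, not part of any script in this thread):

```python
def sweep_batch_sizes(run_inference, sizes=(1, 2, 4, 8, 16, 32, 64, 128)):
    """Run the (hypothetical) per-batch inference callable at each batch
    size and record whether it succeeded or raised a RuntimeError, which
    is how PyTorch surfaces CUDA illegal-memory-access errors."""
    results = {}
    for bs in sizes:
        try:
            run_inference(bs)
            results[bs] = "ok"
        except RuntimeError as e:
            results[bs] = f"failed: {e}"
    return results
```

Note that after a real CUDA illegal memory access the process context is usually corrupted, so in practice each batch size may need to run in a fresh process rather than a loop like this; the sketch only shows the bookkeeping.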
I had the same issue. On my side this illegal memory access error only happens for batch sizes 2 and 4, but with batch sizes 8 to 32 I can...