r4dm
Same problem. Please add support for vocab_size over 32000; in my case I need vocab_size 60256.
`cp libbitsandbytes_cuda117.so libbitsandbytes_cpu.so` worked for me in WSL Ubuntu on Windows 11. I installed Anaconda3 and the CUDA 11.8 build specifically for WSL2. Fine-tuning Vicuna-13B with QLoRA works well on an RTX 4090.
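For reference, a minimal sketch of that workaround. The package location is resolved dynamically here; the `.so` filenames are assumptions, so check which `libbitsandbytes_cuda*.so` your install actually ships before copying:

```bash
# Locate the installed bitsandbytes package without importing it
# (importing may already fail if libbitsandbytes_cpu.so is missing).
BNB_DIR="$(python -c 'import importlib.util; print(importlib.util.find_spec("bitsandbytes").submodule_search_locations[0])')"

# Copy the CUDA build over the CPU fallback name. Adjust cuda117 to
# match the CUDA version your bitsandbytes build targets.
cp "$BNB_DIR/libbitsandbytes_cuda117.so" "$BNB_DIR/libbitsandbytes_cpu.so"
```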
https://github.com/casper-hansen/AutoAWQ/issues/558
The same problem. Version 0.2.6 installs transformers 4.43.3, which gives an error during quantization. In this case it is the quantization code that does not work, but the inference code...
The solution works only with Llama 3, not 3.1.
This also did not help in my case. I quantize the 70B model on a single A100, and with default settings this used to work fine. With the new version of AutoAWQ and...
The temporary solution works only with Llama 3, not 3.1, because support for 3.1 was added in transformers v4.43.0.
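A hedged sketch of that temporary workaround, assuming downgrading is acceptable for your model:

```bash
# Pin transformers below v4.43.0 to sidestep the quantization error.
# This keeps Llama 3 working but drops Llama 3.1, whose support was
# only added in transformers v4.43.0.
pip install "transformers<4.43.0"
```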