cieske
Python 3.10, transformers 4.38.2, bitsandbytes 0.42.0, accelerate 0.27.2, torch 2.0.1+cu117. `torch.cuda.is_available()` works, `nvidia-smi` works, and importing `accelerate` and `bitsandbytes` works, but 8-bit quantization fails with the same error.
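When debugging errors like this, the first step is usually confirming exactly which versions are installed, since bitsandbytes is sensitive to the torch/CUDA build it was compiled against. A minimal stdlib-only sketch for collecting the relevant versions (the package list here is just the one mentioned above):

```python
from importlib.metadata import version, PackageNotFoundError


def report_versions(packages=("transformers", "bitsandbytes", "accelerate", "torch")):
    """Return a dict mapping each package name to its installed version,
    or "not installed" if importlib can't find it."""
    report = {}
    for name in packages:
        try:
            report[name] = version(name)
        except PackageNotFoundError:
            report[name] = "not installed"
    return report


if __name__ == "__main__":
    for name, ver in report_versions().items():
        print(f"{name}: {ver}")
```

Pasting this output into an issue report makes version mismatches much easier to spot than describing the environment in prose.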
In my case, using the Mistral Instruct model, formatting the input with the proper template and setting `max_tokens` in `SamplingParams` helps.
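For reference, the "proper template" for Mistral Instruct wraps the user turn in `[INST] ... [/INST]` markers. A minimal sketch of building such a prompt by hand (in practice the tokenizer's chat template does this for you, and it also prepends the `<s>` BOS token, which is omitted here):

```python
def build_mistral_prompt(user_message: str) -> str:
    """Wrap a single user turn in the Mistral Instruct [INST] markers.

    Note: real usage should prefer the tokenizer's apply_chat_template;
    this is only an illustration of the expected prompt shape.
    """
    return f"[INST] {user_message.strip()} [/INST]"


print(build_mistral_prompt("Summarize this issue."))
# [INST] Summarize this issue. [/INST]
```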
+1. I also wonder whether there is any reference for choosing an appropriate calibration dataset.
I think this `partial_fit` function looks really good, and I'm looking forward to using it in scikit-learn soon. Thanks for your hard work :)
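For readers unfamiliar with the convention: in scikit-learn, `partial_fit` updates an estimator's state batch by batch instead of refitting from scratch, so it can handle data that doesn't fit in memory. A pure-Python toy sketch of that pattern (an incremental mean, not any actual scikit-learn estimator):

```python
class IncrementalMean:
    """Toy illustration of the scikit-learn partial_fit pattern:
    fitted state (trailing-underscore attributes) is updated per batch."""

    def __init__(self):
        self.n_seen_ = 0
        self.mean_ = 0.0

    def partial_fit(self, X):
        # Welford-style running mean update over one batch of scalars.
        for x in X:
            self.n_seen_ += 1
            self.mean_ += (x - self.mean_) / self.n_seen_
        return self  # returning self allows chained calls, as in sklearn


est = IncrementalMean()
est.partial_fit([1.0, 2.0, 3.0]).partial_fit([4.0, 5.0])
print(est.mean_)  # 3.0
```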
@adrinjalali Sorry for the late reply, and thanks for the reasonable suggestion!
> This OpenaiChatCompletionsLM has not been defined and needs to be rewritten?

[@bcarvalho-via](https://github.com/bcarvalho-via) AFAIK, the OpenAI API and the Azure OpenAI API are slightly different; just rewriting the part above should make it compatible...
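The main difference is in the request endpoint: OpenAI uses a fixed base URL, while Azure OpenAI routes through a per-resource host and a named deployment, and requires an `api-version` query parameter. A small sketch of the two URL shapes (the `resource`, `deployment`, and `api_version` values are placeholders):

```python
def chat_completions_url(provider: str, *, resource: str = "",
                         deployment: str = "", api_version: str = "") -> str:
    """Illustrate the endpoint difference between the two APIs.

    resource/deployment/api_version only apply to the Azure form and
    are placeholder values for illustration.
    """
    if provider == "openai":
        return "https://api.openai.com/v1/chat/completions"
    if provider == "azure":
        return (f"https://{resource}.openai.azure.com/openai/deployments/"
                f"{deployment}/chat/completions?api-version={api_version}")
    raise ValueError(f"unknown provider: {provider}")


print(chat_completions_url("azure", resource="my-resource",
                           deployment="my-deployment",
                           api_version="2024-02-01"))
```

Authentication also differs (Azure uses an `api-key` header tied to the resource), so a wrapper class generally needs both the URL and the auth scheme made configurable.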