Nondzu
I have the same error with TheBloke/galpaca-30B-GPTQ-4bit-128g. @himansharma21 on CPU I see the error `RuntimeError: "LayerNormKernelImpl" not implemented for 'Half'`
This issue happens because the base model is probably stored as float16, like facebook/galactica-30b, but loading it over GPTQ on CPU requires float32, I guess. Edit: link to galactica model data type info...
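A minimal sketch of that workaround, assuming PyTorch is installed (the layer and shapes here are illustrative, not from the actual model): keep the weights and inputs in float32 on CPU, e.g. by calling `.float()` on the model or passing `torch_dtype=torch.float32` to `from_pretrained`.

```python
import torch

# On CPU, some PyTorch builds have no Half (float16) LayerNorm kernel,
# which is what raises:
#   RuntimeError: "LayerNormKernelImpl" not implemented for 'Half'
ln = torch.nn.LayerNorm(8)                   # illustrative layer, float32 weights
x = torch.randn(2, 8, dtype=torch.float16)   # half-precision input, as from an fp16 checkpoint

# Workaround: cast to float32 before running on CPU
out = ln(x.float())
print(out.dtype)  # torch.float32
```

The same idea applies at model scale: load with `torch_dtype=torch.float32` (or call `model.float()`) when running on CPU.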
@Sekoya78 I'll check Pygmalion-6B-dev when I have free time. Can you share the run command you use for Pygmalion-6B-dev?
@Sekoya78 That's funny, because when I run the [mayaeary/pygmalion-6b_dev-4bit-128g](https://huggingface.co/mayaeary/pygmalion-6b_dev-4bit-128g) model I still see `RuntimeError: expected scalar type Float but found Half`
Awesome. Maybe we are running different versions of text-generation-webui; my webui commit is 6e19ae4b2f58853ad44adfe8a19672598c62471c
@mariokostelac What about ZeRO stage 1 and stage 2? Should I also set `load_in_8bit` and `load_in_4bit` to false?
@davidmezzetti Thank you for the help. After a small modification, this code works fine:

```python
from transformers import WhisperProcessor
from txtai.pipeline import Transcription
# from txtai.transcription import Transcription

# model = "openai/whisper-large-v2"...
```