152334H
This _should_ be solved by https://github.com/AUTOMATIC1111/stable-diffusion-webui/commit/98947d173e3f1667eba29c904f681047dea9de90, thanks!
+1, this is deepspeed related on my end, but it's weird because I've heard of other people using deepspeed with 8bitadam without problems.
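For reference, the shape of the setup I mean is roughly the sketch below. It is only illustrative: the toy model, the config values, and the launch details are stand-ins, not my actual training script.

```python
import torch
import deepspeed
import bitsandbytes as bnb

# toy module standing in for the real model
model = torch.nn.Linear(1024, 1024).cuda()

# 8-bit Adam from bitsandbytes as a client-side optimizer
optimizer = bnb.optim.Adam8bit(model.parameters(), lr=1e-4)

# minimal DeepSpeed config; values here are placeholders
ds_config = {"train_micro_batch_size_per_gpu": 1}

# DeepSpeed accepts a client optimizer instead of one defined in the config.
# This is normally run under the `deepspeed` launcher so that the
# distributed environment is already set up.
engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    optimizer=optimizer,
    config=ds_config,
)
```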
Anyone still working on this? On the error @prajdabre was mentioning, I find that the problem does not come from a dtype mismatch, but rather a size mismatch. With printf...
Noting that this issue, although stale, remains unresolved. Although optimization can run, a functional state dict cannot be saved with 8bitadam. I notice that there is a PR for...
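To make it concrete, here is a minimal sketch of the pattern that fails for me, assuming a toy module and `bnb.optim.Adam8bit`; nothing below is from my real training code.

```python
import torch
import bitsandbytes as bnb

model = torch.nn.Linear(512, 512).cuda()
opt = bnb.optim.Adam8bit(model.parameters(), lr=1e-4)

# one optimization step so the optimizer actually has state to serialize
loss = model(torch.randn(8, 512, device="cuda")).sum()
loss.backward()
opt.step()

# optimization itself runs fine; it is this round-trip that does not give
# back a working optimizer state in my testing
torch.save(opt.state_dict(), "adam8bit_state.pt")
opt.load_state_dict(torch.load("adam8bit_state.pt"))
```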
? I do not test via huggingface. I was in fact trying to use only an 8bit optimiser with 32bit weights, though, so I do not experience the int8 flatparameter...
Closing this because kavorite's work obsoletes it.
god yes please
> NOTE: In this research preview, we used a modified version of huggingface/transformers library to support multimodal models and the LLaMA tokenizer. Make sure that you are using the correct...
bitsandbytes sounds great. I haven't actually tried LLaVA inference yet, but I see no reason why it shouldn't be hackable to work on just one 3090 (which I accomplished with...
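For reference, the single-3090 route I have in mind is just the stock `transformers` int8 loading path via bitsandbytes, roughly like the sketch below. The checkpoint path is a placeholder and this is not the LLaVA loading code.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# placeholder path, not the actual LLaVA weights
model_id = "path/to/llama-13b-hf"

tokenizer = AutoTokenizer.from_pretrained(model_id)

# needs accelerate + bitsandbytes installed; a 13B model in int8 is roughly
# 13 GB of weights, which fits in a single 24 GB 3090
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    load_in_8bit=True,   # bitsandbytes int8 weights, about half of fp16 memory
    device_map="auto",   # let accelerate place layers on the single GPU
)
```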
the other important part is the synthetic gpt-4 based dataset.