152334H

Results 54 comments of 152334H

This _should_ be solved by https://github.com/AUTOMATIC1111/stable-diffusion-webui/commit/98947d173e3f1667eba29c904f681047dea9de90, thanks!

+1, this is DeepSpeed-related on my end, but it's weird because I've heard of other people using DeepSpeed with 8-bit Adam fine.

Is anyone still working on this? Regarding the error @prajdabre mentioned: I find that the problem comes not from a dtype mismatch but from a size mismatch. With printf...
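For reference, printf-debugging a size mismatch like this usually reduces to printing both buffers' element counts right before the failing op. A minimal stdlib-only sketch (the names `grad`/`state` and the lengths are illustrative, not the actual bitsandbytes or DeepSpeed tensors):

```python
def check_same_numel(name_a, a, name_b, b):
    """Print both sizes and fail loudly when they disagree."""
    print(f"{name_a}: numel={len(a)}  {name_b}: numel={len(b)}")
    if len(a) != len(b):
        raise ValueError(
            f"size mismatch: {name_a} has {len(a)} elements, "
            f"{name_b} has {len(b)}"
        )

# A stale or partitioned state buffer shows up as a wrong length here,
# even when both buffers have the same dtype.
grad = [0.1] * 8
state = [0.0] * 6
try:
    check_same_numel("grad", grad, "state", state)
except ValueError as e:
    print("caught:", e)
```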

Noting that this issue, although stale, remains unresolved. Optimization can run, but a functional state dict cannot be saved with 8-bit Adam. I notice that there is a PR for...
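For context on why the optimizer state is awkward to serialize: 8-bit Adam keeps its moment buffers as int8 codes plus per-block scale factors rather than plain float tensors. A toy sketch of dynamic int8 quantization of a state vector, assuming a single block and a symmetric absmax scale (this is illustrative, not bitsandbytes' actual blockwise implementation):

```python
def quantize_int8(values):
    """Map floats to int8 codes using the block's absolute maximum."""
    absmax = max(abs(v) for v in values) or 1.0
    codes = [round(v / absmax * 127) for v in values]
    return codes, absmax

def dequantize_int8(codes, absmax):
    """Recover approximate floats from int8 codes and the scale."""
    return [c / 127 * absmax for c in codes]

state = [0.02, -0.5, 0.013, 0.4999]
codes, absmax = quantize_int8(state)
restored = dequantize_int8(codes, absmax)
# Round-trip error per element is bounded by absmax / 254.
for s, r in zip(state, restored):
    assert abs(s - r) <= absmax / 254 + 1e-12
```

A state dict for such an optimizer has to carry both the codes and the scales (and reassemble them on load), which is why a naive float-tensor save path breaks.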

? I do not test via huggingface. I was in fact trying to use only an 8-bit optimiser with 32-bit weights, though, so I do not experience the int8 flatparameter...

Closing this because kavorite's work obsoletes it.

> NOTE: In this research preview, we used a modified version of huggingface/transformers library to support multimodal models and the LLaMA tokenizer. Make sure that you are using the correct...

bitsandbytes sounds great. I haven't actually tried LLaVA inference yet, but I see no reason why it shouldn't be hackable to work on just one 3090 (which I accomplished with...

The other important part is the synthetic GPT-4-based dataset.