DocShotgun
It looks like people have managed to get the full versions of the model running, but has anyone had any luck with the 4bit quantized version? https://huggingface.co/OccamRazor/mpt-7b-storywriter-4bit-128g It looks like...
This fork seems to work well for me on 4bit quantized llama models such as the 4bit pyg7b and wizardlm7b, and it is significantly faster than the other up-to-date cuda...
@iwalton3 Commenting out those two lines and rebuilding unfortunately did not fix it. For reference this is the error that gets spit out during inference: ``` Traceback (most recent call...
Cool, resolved by the newest version, thanks!
I haven't been able to get t2i adapters to work in any capacity lol. Trying both the image and sketch adapter yaml files with all of the different t2i adapter...
I also haven't been able to get T2I adapters to work at all, except I'm on Windows. It gets to the same point where it loads the model, preprocesses, and...
I had this happen to me when I tried to use my lora on a Q8 GGUF flux. Fixed it by changing `Diffusion in low bits` to `Automatic (fp16 LoRA)`...
If you're trying to run this model in full (16 bit) precision, you're probably running out of memory. You need a lot more memory than just the amount needed to...
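As a rough back-of-the-envelope sketch of that point (my own illustration, not part of the original comment; the 7B size is just an assumed example), the weights of a 7B-parameter model at 16-bit precision already take roughly 13 GiB before any activations or KV cache are accounted for:

```python
def weight_memory_gib(num_params: float, bytes_per_param: int = 2) -> float:
    """Memory needed just to hold the weights (2 bytes/param for fp16/bf16)."""
    return num_params * bytes_per_param / (1024 ** 3)

print(f"{weight_memory_gib(7e9):.1f} GiB")  # ~13.0 GiB for a 7B model, weights only
```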
I got this same error a few weeks ago trying to train on MI300x with axolotl (on torch 2.4.0+rocm6.1). There was one time I got the training run to start...
Following the logic in the issue linked here (https://github.com/linkedin/Liger-Kernel/issues/231#issuecomment-2336569380), noting that the warp size of AMD Instinct processors is 64 compared to 32 for NVIDIA GPUs, I halved `num_warps` across...
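A minimal sketch of that idea (my own illustration, not the actual Liger-Kernel change; `pick_num_warps` is a hypothetical helper, and it assumes PyTorch exposes `warp_size` in the CUDA/ROCm device properties):

```python
import torch

def pick_num_warps(num_warps_nvidia: int) -> int:
    """Scale a Triton num_warps value tuned for NVIDIA (32-wide warps)
    to the current device's warp/wavefront width."""
    warp_size = 32
    if torch.cuda.is_available():
        props = torch.cuda.get_device_properties(0)
        # AMD Instinct (CDNA) wavefronts are 64 threads wide; NVIDIA warps are 32.
        warp_size = getattr(props, "warp_size", 32)
    # Halve num_warps on 64-wide hardware so total threads per block stay the same.
    return max(1, num_warps_nvidia * 32 // warp_size)

print(pick_num_warps(32))  # 32 on NVIDIA, 16 on an AMD Instinct GPU
```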