Forkoz comments


Results: 474 comments of Forkoz

Can't load GGUF model

GGUF models are only faster for me with the NVLink patch and when fully offloaded. Q5 is going to need some CPU unless you have 3 cards, and that's going to be slow.

Can't load GGUF model

A 70b doesn't fit on one 4090, so half (or more) of it is on CPU.
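
A rough back-of-the-envelope sketch of that arithmetic, in Python. The numbers are assumptions, not measurements: about 5.5 bits per weight for a Q5-style quant, 24 GiB of VRAM on a 4090, and the function names are made up for illustration. It only counts the weights, so context cache and runtime overhead would push the CPU share higher.

```python
def estimate_weights_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def fraction_on_cpu(weights_gib: float, vram_gib: float) -> float:
    """Share of the weights that does not fit in VRAM (0.0 if everything fits)."""
    if weights_gib <= vram_gib:
        return 0.0
    return (weights_gib - vram_gib) / weights_gib


if __name__ == "__main__":
    # Assumed values: 70b parameters, ~5.5 bits per weight, 24 GiB per card.
    weights = estimate_weights_gib(70, 5.5)
    for cards in (1, 2, 3):
        share = fraction_on_cpu(weights, 24 * cards)
        print(f"{cards} card(s): {weights:.1f} GiB of weights, {share:.0%} left for CPU")
```

With one card, a large share of the weights ends up on the CPU under these assumptions, which lines up with the "half (or more)" estimate once cache and overhead are counted.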

Document Prerequisites: Hardware and Software

The readme was pretty good. In terms of hardware, you're on your own. You can use small models on 1 GPU or big models. I'm testing it with 103-120b @ 16k...

flux1-dev-Q8_0.gguf

Stuff like the new Forge and ComfyUI do. Also, they only use the GGUF format.
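
If you want to confirm that a file such as flux1-dev-Q8_0.gguf really is GGUF before pointing a loader at it, a minimal sketch is to read the header: GGUF files start with the 4-byte magic `GGUF` followed by a 32-bit version number. The helper name below is hypothetical, and the sketch assumes the common little-endian layout.

```python
import struct


def read_gguf_version(path):
    """Return the GGUF version number, or None if the file lacks the GGUF magic."""
    with open(path, "rb") as f:
        header = f.read(8)
    # A valid header needs the 4-byte magic plus a 4-byte version field.
    if len(header) < 8 or header[:4] != b"GGUF":
        return None
    # Assumption: little-endian unsigned 32-bit version.
    (version,) = struct.unpack("<I", header[4:8])
    return version
```

Calling `read_gguf_version("flux1-dev-Q8_0.gguf")` returns an integer for a GGUF file and `None` for anything else, such as a safetensors checkpoint.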