Carnyzzle

4 comments by Carnyzzle

Flash Attention 1.x supports Turing; Flash Attention 2.x doesn't support Turing as of right now.
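For anyone unsure which category their card falls into, a minimal sketch of how you might check at runtime, assuming PyTorch is installed (the helper name and version mapping are illustrative, not part of either library's API; Turing is SM 7.5, and Flash Attention 2.x targets Ampere, SM 8.0, or newer):

```python
import torch

def usable_flash_attention_version() -> int | None:
    """Guess which Flash Attention major version the current GPU can run.

    Illustrative helper only: maps CUDA compute capability to the
    support matrix described above (FA 1.x: Turing+, FA 2.x: Ampere+).
    """
    if not torch.cuda.is_available():
        return None
    major, minor = torch.cuda.get_device_capability()
    cc = major + minor / 10
    if cc >= 8.0:
        return 2   # Ampere or newer: Flash Attention 2.x works
    if cc >= 7.5:
        return 1   # Turing: stuck on Flash Attention 1.x for now
    return None    # Older than Turing: neither version is supported

print(usable_flash_attention_version())
```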

> Try this: https://github.com/cinderblocks/radegast/releases/tag/v2.46
>
> Note, COF won't show worn on first load if appearance has not loaded.

When I try to access it, the page just 404s.

I wish for the same. I don't get why there's a Flash Attention 3 for Hopper GPUs that no ordinary consumer can get (yes, I know it's technically still in beta)...

And of course, right after I posted this, they updated the model to GLM 4.6 🤣