KoboldAI-Client
Can we have 8bit support?
See https://gist.github.com/whjms/2505ef082a656e7a80a3f663c16f4277
If this were added and worked on Colab, you could load models up to ~13B on a standard GPU. That way you wouldn't need TPUs to run the bigger models, which currently don't work anyway. (oobabooga's Colab won't work on standard GPUs because it loads the model shards into RAM and runs out of memory, but KoboldAI shouldn't have that problem.)
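For reference, here is a minimal sketch of what 8-bit loading looks like with Hugging Face Transformers and bitsandbytes, which is the mechanism the linked gist builds on. The model name is just an example, and this is not the actual KoboldAI integration (that's what the gist covers), just the underlying technique:

```python
# Minimal sketch: loading a causal LM in 8-bit via bitsandbytes.
# Requires: pip install transformers accelerate bitsandbytes
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "KoboldAI/OPT-13B-Erebus"  # example only; any causal LM works

tokenizer = AutoTokenizer.from_pretrained(model_name)

# load_in_8bit quantizes weights to int8 at load time, roughly halving
# VRAM use versus fp16. device_map="auto" lets accelerate place layers
# on the available GPU(s) instead of materializing everything at once.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    load_in_8bit=True,
    device_map="auto",
)

prompt = "The old castle stood"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

With int8 weights, a 13B model needs roughly 13-14 GB of VRAM instead of ~26 GB in fp16, which is why it would fit on the GPUs Colab hands out.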