text-generation-webui-colab
New to *free* Google Colab
I'm on an NVIDIA T4. Is it possible to use xformers, with exllamav2 as the loader, for a 4-bit GPTQ model with group size 32 (Mistral flavor of your choice)? I have a feeling it would run blazingly fast, with minimal quality degradation and plenty of context, but you've spent more time on this than I have.
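
For what it's worth, here is a rough back-of-the-envelope check of whether such a model should fit in T4 memory. Everything in it is an assumption rather than a measurement: roughly 7.24B parameters for Mistral 7B, an fp16 scale plus a 4-bit zero point per group of 32 weights, an fp16 KV cache with 32 layers, 8 KV heads and head dimension 128, and about 15 GiB usable on a Colab T4. It treats every weight as quantized and ignores activations and framework overhead, so real usage will be somewhat higher.

```python
# Rough VRAM estimate for a 4-bit GPTQ (group size 32) Mistral 7B on a T4.
# All constants below are assumptions for illustration, not measured values.

PARAMS = 7.24e9                        # approximate parameter count (assumed)
GROUP_SIZE = 32                        # GPTQ group size
BITS_PER_WEIGHT = 4 + (16 + 4) / GROUP_SIZE  # 4-bit weight + fp16 scale and 4-bit zero per group
N_LAYERS, N_KV_HEADS, HEAD_DIM = 32, 8, 128  # assumed Mistral 7B attention layout
KV_BYTES = 2                           # fp16 KV cache entries
T4_USABLE_GIB = 15.0                   # assumed usable memory on a Colab T4
GIB = 2**30


def weights_gib() -> float:
    """Memory for the quantized weights, in GiB."""
    return PARAMS * BITS_PER_WEIGHT / 8 / GIB


def kv_cache_gib(context_len: int) -> float:
    """Memory for keys and values across all layers at a given context length."""
    bytes_per_token = 2 * N_LAYERS * N_KV_HEADS * HEAD_DIM * KV_BYTES
    return bytes_per_token * context_len / GIB


def fits(context_len: int) -> bool:
    """True if weights plus KV cache stay under the assumed usable memory."""
    return weights_gib() + kv_cache_gib(context_len) < T4_USABLE_GIB


if __name__ == "__main__":
    for ctx in (4096, 8192, 32768):
        total = weights_gib() + kv_cache_gib(ctx)
        print(f"context {ctx}: about {total:.1f} GiB, fits: {fits(ctx)}")
```

If those assumptions are anywhere near right, the weights take a bit under 4 GiB and the cache grows by about 1 GiB per 8192 tokens, so memory alone shouldn't be the blocker. Whether xformers and exllamav2 actually work together on the T4 is a separate question.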