NOFRAUD Latam
So, is it a bug in the VRAM allocation, or is it the expected behavior?
Thank you @orlyandico, I was able to increase the number of layers loaded into GPU VRAM by 2 using Mixtral 3BitQ (from 29/33 to 31/33). I gained some performance...
This would be awesome, particularly when you need a specific output like "yes or no".
I made a small modification to this script so that the message area scrolls down automatically right after the dot animation (the original script fails to do that...