YUTA
In my case, `--disable-cuda-malloc` combined with `--lowvram` solves the issue at fp8, but not at fp16. LoRAs made with ai-toolkit still load very slowly and with extra...
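For reference, a sketch of the workaround described above, assuming a standard ComfyUI checkout launched via `main.py` (the two flags are taken from the comment; the entry-point name is an assumption):

```shell
# Hypothetical launch command for ComfyUI illustrating the workaround:
# --disable-cuda-malloc turns off the cudaMallocAsync allocator,
# --lowvram forces more aggressive model offloading to system RAM.
python main.py --disable-cuda-malloc --lowvram
```

Per the comment, this combination only helps when the model weights are in fp8; at fp16 the slow LoRA loading persists.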
Even if it gets rate limited quickly, I'd love to be able to save some API money by switching to Copilot now and then. Hope this gets added