LLaDA
Any optimisation tips?
Using DirectML, so this might be a bit of an edge case.
Trying to load this model into my ~10 GB of VRAM is not going well.
Naturally, getting transformers to work with non-CUDA devices is abysmal.
Windows 11, Radeon RX 6750 XT.
I'd rather not run this at CPU speed. Should I give up?
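For reference, this is roughly what I'm attempting; a minimal sketch, assuming torch-directml is installed and the GSAI-ML/LLaDA-8B-Instruct checkpoint (swap in whichever LLaDA variant you're loading; fp16 support on DirectML is also an assumption on my part):

```python
# Sketch: load LLaDA onto a DirectML device in fp16.
# Assumes `pip install torch-directml transformers`; the model id and
# DirectML fp16 support are assumptions, not tested claims.
import torch
import torch_directml
from transformers import AutoModel, AutoTokenizer

dml = torch_directml.device()  # wraps the Radeon RX 6750 XT

tokenizer = AutoTokenizer.from_pretrained(
    "GSAI-ML/LLaDA-8B-Instruct", trust_remote_code=True
)

# fp16 halves memory vs fp32, but 8B parameters at 2 bytes each is
# still ~16 GB of weights, so this alone will not fit in 10 GB of VRAM.
model = AutoModel.from_pretrained(
    "GSAI-ML/LLaDA-8B-Instruct",
    trust_remote_code=True,
    torch_dtype=torch.float16,
)
model = model.to(dml).eval()
```

Even if DirectML cooperated, the back-of-the-envelope math (8B params × 2 bytes ≈ 16 GB) says the weights alone overflow 10 GB, so some form of quantization or offloading would be needed regardless.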
If you don’t have access to a GPU, I highly recommend using Google Colab, which provides some free GPU resources for users. Running the model directly on a CPU might not be feasible. Thank you very much for your interest in LLaDA.