LLaDA
Any optimisation tips?
Using DirectML, so this might be a bit of an edge case.
Trying to load this model into my ~10 GB of VRAM is not going well.
Naturally, getting transformers to work with non-CUDA devices is abysmal.
Windows 11, Radeon RX 6750 XT.
I'd rather not run this at CPU speed. Should I give up?
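For reference, this is roughly what I'm attempting; a minimal sketch, assuming torch-directml is installed and the GSAI-ML/LLaDA-8B-Instruct checkpoint (swap in whichever LLaDA variant you're loading; fp16 support on DirectML is also an assumption on my part):

```python
# Sketch: load LLaDA onto a DirectML device in fp16.
# Assumes `pip install torch-directml transformers`; the model id and
# DirectML fp16 support are assumptions, not tested claims.
import torch
import torch_directml
from transformers import AutoModel, AutoTokenizer

dml = torch_directml.device()  # wraps the Radeon RX 6750 XT

tokenizer = AutoTokenizer.from_pretrained(
    "GSAI-ML/LLaDA-8B-Instruct", trust_remote_code=True
)

# fp16 halves memory vs fp32, but 8B parameters at 2 bytes each is
# still ~16 GB of weights, so this alone will not fit in 10 GB of VRAM.
model = AutoModel.from_pretrained(
    "GSAI-ML/LLaDA-8B-Instruct",
    trust_remote_code=True,
    torch_dtype=torch.float16,
)
model = model.to(dml).eval()
```

Even if DirectML cooperated, the back-of-the-envelope math (8B params × 2 bytes ≈ 16 GB) says the weights alone overflow 10 GB, so some form of quantization or offloading would be needed regardless.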
If you don’t have access to a GPU, I highly recommend using Google Colab, which provides some free GPU resources for users. Running the model directly on a CPU might not be feasible. Thank you very much for your interest in LLaDA.