Added the ability to use LLAMA_HIP_UMA
With an AMD APU (like my Ryzen 7940HX) it is possible to use "UMA" to extend VRAM with system RAM. In my case I can't allocate more than 4 GB of VRAM (a BIOS limit).
With the change discussed in https://github.com/ggerganov/llama.cpp/issues/7399, UMA may be as fast as dedicated VRAM (I can't do a full comparison because of the 4 GB VRAM limit on my config).
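For context, here is a minimal sketch of what a UMA allocation path with the coarse-grain advice from the linked issue could look like. This is not the actual llama.cpp source; the function name `alloc_device_memory` and the exact placement of the `LLAMA_HIP_UMA` guard are my assumptions:

```cpp
// Illustrative sketch only: function name and macro wiring are assumptions,
// not the real llama.cpp code.
#include <hip/hip_runtime.h>

static hipError_t alloc_device_memory(void ** ptr, size_t size, int device) {
#ifdef LLAMA_HIP_UMA
    // Allocate managed (unified) memory so the APU can back "VRAM" with system RAM.
    hipError_t err = hipMallocManaged(ptr, size);
    if (err == hipSuccess) {
        // Per the linked issue, marking the buffer coarse-grain is what brings
        // UMA throughput close to a dedicated-VRAM allocation.
        err = hipMemAdvise(*ptr, size, hipMemAdviseSetCoarseGrain, device);
    }
    return err;
#else
    // Regular dedicated-VRAM allocation path.
    return hipMalloc(ptr, size);
#endif
}
```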
I can (:crossed_fingers: ) make a PR here, but I need to know what the best way to make it available would be:
- a runtime option
- a fallback alloc (see the sketch after this list)
- enabled by default on some hardware...
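For the fallback option, a minimal sketch could look like this: try dedicated VRAM first, and only fall back to unified memory when the carve-out is exhausted (e.g. my 4 GB BIOS limit). The function name `alloc_with_uma_fallback` is hypothetical, not an existing API:

```cpp
// Hypothetical fallback allocator, not an existing llama.cpp function.
#include <hip/hip_runtime.h>

static hipError_t alloc_with_uma_fallback(void ** ptr, size_t size, int device) {
    hipError_t err = hipMalloc(ptr, size);  // fast path: dedicated VRAM
    if (err != hipErrorOutOfMemory) {
        return err;
    }
    err = hipMallocManaged(ptr, size);      // fallback: UMA / system RAM
    if (err == hipSuccess) {
        // Coarse-grain advice keeps the managed buffer fast on APUs.
        err = hipMemAdvise(*ptr, size, hipMemAdviseSetCoarseGrain, device);
    }
    return err;
}
```

The appeal of the fallback is that it needs no new flag and only kicks in when `hipMalloc` fails, so discrete GPUs keep their current behaviour; a runtime option would make the choice explicit instead.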