localGPT
ggml quant cpu mps support
To use MPS, you have to install llama-cpp-python with these environment variables set:
export CMAKE_ARGS="-DLLAMA_METAL=on"
export FORCE_CMAKE=1
If it is already installed, run this instead to upgrade your current installation:
CMAKE_ARGS="-DLLAMA_METAL=on" FORCE_CMAKE=1 pip install -U llama-cpp-python --no-cache-dir
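The reinstall step above can also be scripted, which is handy in setup scripts. This is a minimal sketch (not part of the PR) that sets the same two build variables, `CMAKE_ARGS` and `FORCE_CMAKE`, and then invokes pip; the helper names are hypothetical:

```python
import os
import subprocess
import sys

def metal_build_env():
    """Return a copy of the environment with the Metal build flags set.
    These are the same variables the README asks you to export."""
    env = dict(os.environ)
    env["CMAKE_ARGS"] = "-DLLAMA_METAL=on"  # build llama.cpp with Metal support
    env["FORCE_CMAKE"] = "1"                # force a source build instead of a wheel
    return env

def reinstall_llama_cpp():
    """Upgrade llama-cpp-python from source with Metal enabled
    (equivalent to the one-liner above)."""
    subprocess.check_call(
        [sys.executable, "-m", "pip", "install", "-U",
         "llama-cpp-python", "--no-cache-dir"],
        env=metal_build_env(),
    )

if __name__ == "__main__":
    reinstall_llama_cpp()
```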
To validate that MPS is being used, check whether the following shows up in the logs on run:
@imjwang Can you please check #180? It is going to be the foundation of the new codebase. Do you think it will be possible to combine this PR with that?
@PromtEngineer Yes, certainly, let me know what you need. I'm happy to work on it after #180 merges, or let me know how else I can help.
@imjwang I tested the PR and I think we will go ahead and merge it. Can you please update the Readme as well? We will have to add this functionality to #180 when it's ready. Thanks!
@PromtEngineer Hey, I just updated the Readme. Let me know if it's clear.