localGPT
ggml quant cpu mps support
To use MPS, you have to install llama-cpp-python with these environment variables set:
export CMAKE_ARGS="-DLLAMA_METAL=on"
export FORCE_CMAKE=1
If it is already installed, run this instead to upgrade your current installation:
CMAKE_ARGS="-DLLAMA_METAL=on" FORCE_CMAKE=1 pip install -U llama-cpp-python --no-cache-dir
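The reinstall step above can also be scripted, which is handy in setup scripts. This is a minimal sketch (not part of the PR) that sets the same two build variables, `CMAKE_ARGS` and `FORCE_CMAKE`, and then invokes pip; the helper names are hypothetical:

```python
import os
import subprocess
import sys

def metal_build_env():
    """Return a copy of the environment with the Metal build flags set.
    These are the same variables the README asks you to export."""
    env = dict(os.environ)
    env["CMAKE_ARGS"] = "-DLLAMA_METAL=on"  # build llama.cpp with Metal support
    env["FORCE_CMAKE"] = "1"                # force a source build instead of a wheel
    return env

def reinstall_llama_cpp():
    """Upgrade llama-cpp-python from source with Metal enabled
    (equivalent to the one-liner above)."""
    subprocess.check_call(
        [sys.executable, "-m", "pip", "install", "-U",
         "llama-cpp-python", "--no-cache-dir"],
        env=metal_build_env(),
    )

if __name__ == "__main__":
    reinstall_llama_cpp()
```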
To validate that MPS is being used, check whether the following shows up in the logs on run:
@imjwang Can you please check #180? It is going to be the foundation of the new codebase. Do you think it will be possible to combine this PR with that?
@PromtEngineer Yes, certainly, let me know what you need. I'm happy to work on it after #180 merges, or let me know how else I can help.
@imjwang I tested the PR and I think we will go ahead and merge it. Can you please update the Readme as well? We will have to add this functionality to #180 when it's ready. Thanks!
@PromtEngineer Hey, I just updated the Readme. Let me know if it's clear.