
Add support for llama-cpp GPU acceleration

Open moejay opened this issue 1 year ago • 3 comments

Add n_gpu_layers param to llama-cpp that allows some processing on the GPU
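A minimal sketch of how such a wrapper might forward the new parameter to the llama-cpp-python `Llama` constructor. The helper name and defaults here are assumptions for illustration, not the actual LangChain implementation:

```python
# Hypothetical helper: collect constructor kwargs for llama_cpp.Llama,
# only passing n_gpu_layers when GPU offloading is requested.
def build_llama_kwargs(model_path, n_gpu_layers=0, n_ctx=512):
    """Return kwargs for the Llama constructor (names assumed)."""
    params = {"model_path": model_path, "n_ctx": n_ctx}
    if n_gpu_layers:
        # Offload this many transformer layers to the GPU.
        params["n_gpu_layers"] = n_gpu_layers
    return params
```

With `n_gpu_layers=0` the kwarg is simply omitted, so CPU-only behavior is unchanged for existing users.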

This is in preparation for the GPU support being added to llama-cpp in https://github.com/abetlen/llama-cpp-python/pull/203/, for when it is approved and merged.

This is the llama-cpp commit that this PR adds support for

Before submitting

  • No new integration

Who can review?

  • @hwchase17
  • @agola11

Thanks all for the wonderful project!

moejay avatar May 13 '23 20:05 moejay

Is there an easy way to test all of the functionality in one go, or is there quite a lot of chained merging involved?

m0sh1x2 avatar May 14 '23 14:05 m0sh1x2

Is there an easy way to test all of the functionality in one go, or is there quite a lot of chained merging involved?

Not that I can think of. This is dependent on only one PR, though, so maybe it's not too bad to test.

You need the llama-cpp-python binding to be updated; for that, you could pip install from https://github.com/moejay/llama-cpp-python (that's my branch with this update).

I can add some more detailed testing instructions in the PR description when I come back later this evening if that helps.

I did notice that a bunch of the examples were broken. (There are probably existing issues for those; maybe I'll be able to contribute some fixes later on.)

moejay avatar May 14 '23 17:05 moejay

Looks like the same change was merged a bit later here, so this should now work with the latest llama-cpp-python bindings.

moejay avatar May 15 '23 03:05 moejay

Duplicate of #4739.

dev2049 avatar May 17 '23 00:05 dev2049