Andrei
@slavakurilyak You can currently run Vicuna models using LlamaCpp if you're okay with CPU inference (I've tested both 7b and 13b models and they work great). Currently there is no...
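As a rough illustration of CPU inference with the `llama-cpp-python` bindings (the model path below is a hypothetical local Vicuna GGML file, not something shipped with the library):

```python
MODEL_PATH = "./models/vicuna-7b/ggml-model-q4_0.bin"  # hypothetical local path

def run_vicuna(prompt: str, model_path: str = MODEL_PATH) -> str:
    # Imported lazily so the sketch reads without llama-cpp-python installed
    from llama_cpp import Llama

    llm = Llama(model_path=model_path)  # loads the GGML model on CPU
    out = llm(prompt, max_tokens=64)    # OpenAI-style completion dict
    return out["choices"][0]["text"]
```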
@Qualzz I haven't had time to put together a PR, but if you want to give me a hand, I think all you have to do is something similar to https://github.com/hwchase17/langchain/blob/master/langchain/llms/anthropic.py#L205...
> the llama.cpp Python bindings don't return until the response has finished generating. short of modifying the underlying llama.cpp and python bindings, you could pass the prompt to...
Hi @Zetaphor are you referring to this [Llama demo](https://python.langchain.com/en/latest/modules/models/llms/integrations/llamacpp.html)? I'm the author of the `llama-cpp-python` library, I'd be happy to help. Can you give me an idea of what kind...
Thanks for the reply, based on that I think it's [related to this issue](https://github.com/abetlen/llama-cpp-python/issues/19). I've opened a PR (#2411) to give you control over the batch size from LangChain (this...
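A hedged sketch of what controlling the batch size from LangChain could look like (the `n_batch` parameter name is an assumption based on the underlying llama.cpp setting, and the model path is hypothetical):

```python
MODEL_PATH = "./models/ggml-model-q4_0.bin"  # hypothetical path

def build_llm(n_batch: int = 8, model_path: str = MODEL_PATH):
    # Imported lazily so the sketch reads without LangChain installed
    from langchain.llms import LlamaCpp

    # A smaller n_batch trades throughput for lower memory use per eval
    return LlamaCpp(model_path=model_path, n_batch=n_batch)
```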
@hershkoy I've not seen that floating point exception before, but this is using a different library; I suspect it might be a bug caused by `n_ctx` being set too large. Just...
@hershkoy absolutely, all you have to do is change the following two lines. First, update the llm import near the top of your file:

```python
from langchain.llms import LlamaCpp
```
...
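The comment is truncated, but the second change presumably swaps the LLM construction itself. A sketch under that assumption (the model path is illustrative):

```python
def make_local_llm(model_path: str = "./models/ggml-model-q4_0.bin"):
    # Second change (assumed): construct LlamaCpp where the previous
    # llm object was built, pointing it at a local GGML model file
    from langchain.llms import LlamaCpp  # lazy import for readability
    return LlamaCpp(model_path=model_path)
```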
Hi @ShoufaChen, unfortunately this has to do with a recent change to the model format in `llama.cpp`. To fix this you'll just need to migrate the model file as follows...
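For reference, a hedged sketch of that migration step (the script name `migrate-ggml-2023-03-30-pr613.py` is an assumption about the helper llama.cpp shipped at the time; check your checkout for the exact name):

```shell
# Sketch only: the script name and model paths below are assumptions
MIGRATE=migrate-ggml-2023-03-30-pr613.py
OLD=./models/7B/ggml-model-q4_0.bin
NEW=./models/7B/ggml-model-q4_0-new.bin

# Guarded so this is a no-op unless the script is actually present
if [ -f "$MIGRATE" ]; then
  python3 "$MIGRATE" "$OLD" "$NEW"
fi
```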
Hey @kerbi, yes, that's correct: you essentially need to build it on the system you're going to run it on, so the compiler can detect processor...
Thanks for the contribution I'll try to address this in a more general way with https://github.com/abetlen/llama-cpp-python/issues/17 by allowing you to load multiple models and set defaults based on the specific...