Are you interested in adding support for llama-cpp-python?
Hi, I am working on this in my fork because I need to run models on the CPU and I had issues using the llama-cpp-python server.
We'd love to support that, but I don't have experience with it. Do you want to open a PR? How can I help you with that?
Yes, I will do a PR. I tried to follow the same structure as the GPT3 class and tested it with the intro notebook; the performance is not great, at least on my laptop, but it's the only way to run 7B models locally, I think. I am also testing AutoAWQ support for running the models in Colab. One question: for instruct models that use a prompt template like <s>[INST] {prompt} [/INST]
for Mistral, will including that in the prompts affect the DSPy compiler?
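To illustrate what I mean, something like this rough sketch, where the template is applied only at request time so the prompt DSPy builds stays free of special tokens (the URL, system text, and sampling kwargs here are just placeholders, not anything DSPy provides):

import requests

def complete_with_mistral_template(prompt, url="http://localhost:8009/v1/completions", **kwargs):
    # Wrap the raw DSPy prompt in the Mistral instruct template right before
    # sending it to a llama-cpp-python server's OpenAI-compatible endpoint.
    wrapped = "<s>[INST] {prompt} [/INST]".format(prompt=prompt)
    r = requests.post(url, json={"prompt": wrapped, **kwargs})
    r.raise_for_status()
    return [c["text"].strip() for c in r.json()["choices"]]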
I'd also be interested in this for my project. I see #191 is out for this; is it going in any time soon?
+1, also interested - I've implemented my own workaround for now, but it seems #191 is almost ready, so I'll use that as soon as it's in!
+1
+1
+1
I'm just getting started with the library, so I don't know all the intricacies, but since this hasn't had any updates in the last few months (and tbh, I'm surprised that llama.cpp/llama-cpp-python lack support in a relatively big LLM-oriented tool), I hacked together a quick solution for my tests:
import requests
from dsp.modules import LM
import dspy

# Llama 3 instruct prompt template; swap this for your model's format.
s = """<|begin_of_text|><|start_header_id|>system<|end_header_id|>
{system_prompt}<|eot_id|><|start_header_id|>user<|end_header_id|>
{prompt}<|eot_id|><|start_header_id|>assistant<|end_header_id|>"""


class LlamaCppPython(LM):
    def __init__(self, model, url="http://localhost:8009/"):
        super().__init__(model)
        self.url = url if url.endswith("/") else url + "/"

    def basic_request(self, prompt, **kwargs):
        # Wrap the raw prompt in the chat template, then call the
        # llama-cpp-python server's OpenAI-compatible completions endpoint.
        prompt = s.format(system_prompt="Follow the user instructions", prompt=prompt)
        r = requests.post(self.url + "v1/completions", json={"prompt": prompt, **kwargs})
        assert r.status_code == 200
        return [choice["text"].strip() for choice in r.json()["choices"]]

    def __call__(self, prompt, only_completed=True, return_sorted=False, **kwargs):
        return self.basic_request(prompt, **kwargs)


dspy.settings.configure(lm=LlamaCppPython("aaa", url="http://localhost:8009/"))
The template the prompt is being formatted with matches Llama 3; I guess you'll need to tweak it for other models!
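For a quick sanity check after the snippet above (the question is just a placeholder), something like this works for me:

# Assumes dspy.settings.configure(...) has already been called as above.
qa = dspy.Predict("question -> answer")
print(qa(question="What is the capital of France?").answer)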
Hey, I don't see how you can use llama.cpp to load LMs for DSPy. I was trying to use the function-calling feature of llama-cpp-python with DSPy.
Closed by #1347