
Are you interested in adding support for llama-cpp-python?

Open DanielUH2019 opened this issue 1 year ago • 9 comments

Hi, I am working on this in my fork because I need to run models on the CPU and I had issues using the llama-cpp-python server.
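For context, running the model in-process (no server) with llama-cpp-python looks roughly like this; the model path and parameters below are placeholders, not something DSPy-specific:

from llama_cpp import Llama

# Load a quantized GGUF model directly in the Python process (CPU only).
# model_path, n_ctx and n_threads are placeholders; adjust for your setup.
llm = Llama(
    model_path="./models/mistral-7b-instruct.Q4_K_M.gguf",
    n_ctx=2048,      # context window
    n_threads=8,     # CPU threads
    n_gpu_layers=0,  # keep everything on the CPU
)

out = llm("Write one sentence about DSPy.", max_tokens=64)
print(out["choices"][0]["text"])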

DanielUH2019 avatar Oct 23 '23 19:10 DanielUH2019

We'd love to support that, but I don't have experience with it. Do you want to open a PR? How can I help you with that?

okhat avatar Oct 25 '23 15:10 okhat

Yes, I will do a PR. I tried to follow the same structure as the GPT3 class and tested it with the intro notebook; the performance is not great, at least on my laptop, but it's the only way to run 7B models locally, I think. I am also testing AutoAWQ support for running the models in Colab. One question: for instruct models that use a prompt template like <s>[INST] {prompt} [/INST] for Mistral, will including that in the prompts affect the DSPy compiler?
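For illustration only (this is not DSPy's API), one option is to wrap the complete prompt that DSPy builds, so the instructions and few-shot demos stay inside a single [INST] block and the compiler's formatting is left untouched:

# Illustrative sketch: wrap the full DSPy-generated prompt
# (instructions + demos + current input) in Mistral's instruct template.
MISTRAL_TEMPLATE = "<s>[INST] {prompt} [/INST]"

def wrap_for_mistral(dspy_prompt: str) -> str:
    # The markers go around the whole prompt, not around each demonstration.
    return MISTRAL_TEMPLATE.format(prompt=dspy_prompt)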

DanielUH2019 avatar Oct 25 '23 22:10 DanielUH2019

I'd also be interested in this for my project. I see #191 is out for it; is that going in any time soon?

FullMetalMeowchemist avatar Nov 09 '23 21:11 FullMetalMeowchemist

+1, also interested. I've implemented my own workaround for now, but it seems #191 is almost ready, so I'll use that as soon as it's in!

b-akshay avatar Dec 22 '23 18:12 b-akshay

+1

Wintoplay avatar Apr 04 '24 10:04 Wintoplay

+1

RaminStrider avatar Apr 10 '24 15:04 RaminStrider

+1

egeres avatar May 06 '24 21:05 egeres

I'm just getting started with the library, so I don't know all the intricacies, but since this hasn't had any updates in the last few months (and tbh, I'm surprised that llama.cpp/llama-cpp-python lack support in a relatively big LLM-oriented tool), I hacked together a quick solution for my tests:

import requests
from dsp.modules import LM
import dspy

# Llama 3 chat template; the system/user/assistant headers are model-specific.
s = """<|begin_of_text|><|start_header_id|>system<|end_header_id|>

{system_prompt}<|eot_id|><|start_header_id|>user<|end_header_id|>

{prompt}<|eot_id|><|start_header_id|>assistant<|end_header_id|>"""

class LlamaCppPython(LM):
    """Minimal LM wrapper that talks to a local llama-cpp-python server."""

    def __init__(self, model, url="http://localhost:8009/"):
        super().__init__(model)
        self.url = url if url.endswith("/") else url + "/"

    def basic_request(self, prompt, **kwargs):
        # Wrap the DSPy-generated prompt in the chat template, then hit the
        # server's OpenAI-compatible completions endpoint.
        prompt = s.format(system_prompt="Follow the user instructions", prompt=prompt)
        r = requests.post(self.url + "v1/completions", json={"prompt": prompt} | kwargs)
        assert r.status_code == 200
        return [i["text"].strip() for i in r.json()["choices"]]

    def __call__(self, prompt, only_completed=True, return_sorted=False, **kwargs):
        # DSPy expects a list of completion strings.
        return self.basic_request(prompt, **kwargs)

dspy.settings.configure(lm=LlamaCppPython("aaa", url="http://localhost:8009/"))

The template the prompt is being formatted with matches Llama 3's chat format; you'd need to tweak it for other models!
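In case it helps, a rough end-to-end usage sketch with the class above (assuming the llama-cpp-python server is running at that URL; the signature and question are just examples):

lm = LlamaCppPython("local-gguf-model", url="http://localhost:8009/")
dspy.settings.configure(lm=lm)

# A simple signature just to check the wiring; other DSPy modules work the same way.
qa = dspy.Predict("question -> answer")
print(qa(question="What does llama-cpp-python let you do?").answer)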

egeres avatar May 06 '24 22:05 egeres

Hey, I don't see how you can use llama.cpp to load LMs for DSPy. I was trying to use the function-calling feature of llama.cpp with DSPy.

RaminStrider avatar May 15 '24 18:05 RaminStrider

Closed by #1347

arnavsinghvi11 avatar Aug 05 '24 17:08 arnavsinghvi11