DSPy as a Step

Open burtenshaw opened this issue 10 months ago • 0 comments

This draft PR proposes a way of using a DSPy prediction module as a text generation step. The advantage is that text generation could use an optimised and evaluated prompt to generate responses.

This does not take full advantage of DSPy, and a deeper integration could be possible. For example, distilabel could perform the optimisation and/or evaluation within DSPy.
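To make the "evaluation" idea above concrete, here is a minimal, dependency-free sketch of scoring a program against a small dev set with an exact-match metric. It does not use DSPy's own evaluation utilities; `exact_match`, `evaluate`, `toy_program`, and the dev set are all hypothetical names and data made up for illustration.

```python
from typing import Callable, Dict, List


def exact_match(expected: str, predicted: str) -> bool:
    """Hypothetical metric: case-insensitive, whitespace-trimmed equality."""
    return expected.strip().lower() == predicted.strip().lower()


def evaluate(program: Callable[[str], str], devset: List[Dict[str, str]]) -> float:
    """Return the fraction of dev examples the program answers correctly."""
    if not devset:
        return 0.0
    hits = sum(exact_match(ex["answer"], program(ex["question"])) for ex in devset)
    return hits / len(devset)


def toy_program(question: str) -> str:
    """Stand-in for an optimised program; a real one would call an LLM."""
    answers = {"How many fish are there in a dozen fish?": "12"}
    return answers.get(question, "")


# Illustrative dev set, not taken from any real benchmark.
devset = [
    {"question": "How many fish are there in a dozen fish?", "answer": "12"},
    {"question": "How many days are there in a week?", "answer": "7"},
]

score = evaluate(toy_program, devset)
print(f"exact-match accuracy: {score:.2f}")
```

In a deeper integration, a score like this could decide whether an optimised program is good enough to be used as a generation step.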

Example usage

Make an optimised DSPy program

import dspy

class CoT(dspy.Module):
    def __init__(self):
        super().__init__()
        self.prog = dspy.ChainOfThought("question -> answer")

    def forward(self, question):
        return self.prog(question=question)

cot_bs = CoT()

# Example of saved state for an optimised program (presumably what would be loaded into cot_bs):
# https://github.com/stanfordnlp/dspy/blob/7227e7081d8edc3d0ffc2729f7c97871aa61338b/examples/math/gsm8k/turbo_8_8_10_gsm8k_200_300.json

Use a DSPy module in a distilabel pipeline


from distilabel.pipeline.local import Pipeline
from distilabel.steps.task.text_generation import TextGeneration
from distilabel.llm.openai import OpenAILLM

pipeline = Pipeline()
llm = OpenAILLM(model="gpt-3.5-turbo")
# DSPyProgram is the step proposed in this draft PR (import path omitted).
task = DSPyProgram(
    name="program",
    llm=llm,
    pipeline=pipeline,
    program=cot_bs.prog,
)

result = next(
    task.process([{"instruction": "How many fish are there in a dozen fish?"}])
)
print(result)

# Output
# [{'instruction': 'How many fish are there in a dozen fish?', 'model_name': 'gpt-3.5-turbo', 'generation': '12'}]
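The output row above (`instruction`, `model_name`, `generation`) suggests how such a step could map inputs to outputs. The following is a rough, dependency-free sketch of that mapping, not the PR's actual `DSPyProgram` implementation: `ProgramStep`, `FakePrediction`, and `fake_program` are hypothetical stand-ins, and the assumption that the prediction exposes an `answer` attribute comes from the `"question -> answer"` signature used earlier.

```python
from typing import Any, Callable, Dict, Iterator, List


class FakePrediction:
    """Stand-in for a DSPy prediction: exposes the output field as an attribute."""

    def __init__(self, answer: str) -> None:
        self.answer = answer


def fake_program(question: str) -> FakePrediction:
    """Stand-in for an optimised program such as cot_bs.prog (no LLM call)."""
    known = {"How many fish are there in a dozen fish?": "12"}
    return FakePrediction(answer=known.get(question, "unknown"))


class ProgramStep:
    """Hypothetical step: maps each row's "instruction" to a "generation"."""

    def __init__(self, program: Callable[..., Any], model_name: str) -> None:
        self.program = program
        self.model_name = model_name

    def process(self, inputs: List[Dict[str, Any]]) -> Iterator[List[Dict[str, Any]]]:
        outputs = []
        for row in inputs:
            # Assumes the program takes `question` and returns an object with `.answer`.
            prediction = self.program(question=row["instruction"])
            outputs.append(
                {**row, "model_name": self.model_name, "generation": prediction.answer}
            )
        # Yield one batch of rows, so next(step.process(...)) returns a list.
        yield outputs


step = ProgramStep(program=fake_program, model_name="gpt-3.5-turbo")
result = next(step.process([{"instruction": "How many fish are there in a dozen fish?"}]))
print(result)
```

Keeping the program behind a plain callable means the step itself does not need to know how the prompt was optimised, only which input and output fields to map.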

burtenshaw, Apr 23 '24 08:04