DSPy as a Step
This draft PR proposes a way of using a DSPy prediction module as a text generation step. The advantage is that text generation can use a prompt that has already been optimised and evaluated in DSPy.
This does not take full advantage of DSPy, and a deeper integration could be possible. For example, distilabel could perform the optimisation and/or evaluation within DSPy.
Example usage
Make an optimised DSPy program
import dspy
class CoT(dspy.Module):
    def __init__(self):
        super().__init__()
        self.prog = dspy.ChainOfThought("question -> answer")

    def forward(self, question):
        return self.prog(question=question)
cot_bs = CoT()
# https://github.com/stanfordnlp/dspy/blob/7227e7081d8edc3d0ffc2729f7c97871aa61338b/examples/math/gsm8k/turbo_8_8_10_gsm8k_200_300.json
Use a DSPy module in a Distilabel pipeline
from distilabel.pipeline.local import Pipeline
from distilabel.steps.task.text_generation import TextGeneration
from distilabel.llm.openai import OpenAILLM
pipeline = Pipeline()
llm = OpenAILLM(model="gpt-3.5-turbo")
# DSPyProgram: the step proposed in this PR
task = DSPyProgram(
    name="program",
    llm=llm,
    pipeline=pipeline,
    program=cot_bs.prog,
)
result = next(
    task.process([{"instruction": "How many fish are there in a dozen fish?"}])
)
print(result)
# Output
# [{'instruction': 'How many fish are there in a dozen fish?', 'model_name': 'gpt-3.5-turbo', 'generation': '12'}]
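To make the shape of the step easier to review without installing DSPy or distilabel, here is a minimal, self-contained sketch of the adapter idea. It is not the PR's implementation: `ProgramStep`, `FakePrediction`, `fake_program` and the `"stub-model"` name are all hypothetical stand-ins, and the sketch only assumes that the wrapped program accepts a `question` keyword and returns an object with an `answer` attribute, as the `"question -> answer"` signature above suggests.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Iterator, List


@dataclass
class FakePrediction:
    """Hypothetical stand-in for the object a DSPy program returns."""
    answer: str


def fake_program(question: str) -> FakePrediction:
    """Hypothetical stand-in for an optimised program; a real one would call an LLM."""
    return FakePrediction(answer="12" if "dozen" in question else "unknown")


class ProgramStep:
    """Hypothetical adapter: reads 'instruction', runs the program, writes 'generation'."""

    def __init__(self, program: Callable[..., Any], model_name: str) -> None:
        self.program = program
        self.model_name = model_name

    def process(self, inputs: List[Dict[str, Any]]) -> Iterator[List[Dict[str, Any]]]:
        outputs = []
        for row in inputs:
            # Map the pipeline's 'instruction' column onto the program's 'question' input.
            prediction = self.program(question=row["instruction"])
            # Keep the input columns and add the model name and the generated answer.
            outputs.append(
                {**row, "model_name": self.model_name, "generation": prediction.answer}
            )
        # Yield one batch of rows, mirroring the next(task.process(...)) usage above.
        yield outputs


step = ProgramStep(program=fake_program, model_name="stub-model")
result = next(step.process([{"instruction": "How many fish are there in a dozen fish?"}]))
print(result)
```

The sketch reproduces the row layout of the example output (`instruction`, `model_name`, `generation`) with a stubbed program, so the input/output mapping can be discussed separately from the LLM call.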