dspy
Sensitivity of Signature attribute naming
I have been getting an odd bug. With the Signature below, I get an error (specifically, TemplateV2.query throws an AttributeError because it is trying to split a list?). But if I change examples to context, then it seems to work fine. Are certain attribute names protected?
import dspy

gpt35 = dspy.OpenAI(model="gpt-3.5-turbo")
dspy.settings.configure(lm=gpt35)

class GenerateSummary(dspy.Signature):
    """Generate a summary of the provided examples"""
    examples = dspy.InputField(desc="A balanced set of examples")
    summary = dspy.OutputField(desc="A straightforward summary")

generate_summary = dspy.ChainOfThought(GenerateSummary)
generate_summary(context=[
    "Humpty Dumpty is a character in an English nursery rhyme, probably originally a riddle and one of the best known in the English-speaking world.",
    "Ring Around the Rosie, is a nursery rhyme. Descriptions first emerge in the mid-19th century, but are reported as dating from decades before, and similar rhymes are known from across Europe"
])
Ah, that’s a known issue that we’ll be fixing. For now, pass a format keyword argument to dspy.InputField that takes a list and returns a formatted string. dsp.passages2text (available after import dsp) is one function that achieves that.
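As a minimal sketch of what such a format function could look like (format_examples is a hypothetical name, not part of DSPy, and the numbered layout is only one possible choice), a plain Python function that turns a list of strings into a single string should be enough:

```python
def format_examples(examples):
    """Hypothetical formatter: turn a list of strings into one numbered block of text."""
    # Leave an already-formatted string untouched.
    if isinstance(examples, str):
        return examples
    # Number each entry so the individual examples stay distinguishable in the prompt.
    return "\n".join(f"[{i}] {text}" for i, text in enumerate(examples, start=1))


formatted = format_examples(["first example", "second example"])
print(formatted)
```

Following the suggestion above, it would then be passed as dspy.InputField(desc="A balanced set of examples", format=format_examples).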
Wow, what a quick response! Thanks!
My actual use case was slightly more complicated: I wanted to pass [ex.with_inputs('text', 'label') for ex in dataset]. Should that still work? I just tried it with my above workaround (using context instead), and I think they were ignored.
There are no fields in your code called text and label. How are these supposed to be used?
Oh sorry, I wasn't clear: my real data is a list of Examples which have those fields. I guess the question is whether I should define a format function specific to my Examples?
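One possible shape for an Example-specific format function, as a sketch only: it assumes each item exposes text and label attributes, and it uses SimpleNamespace as a stand-in for dspy.Example so the snippet runs without DSPy installed. The name format_labeled_examples and the "text / label" layout are hypothetical choices.

```python
from types import SimpleNamespace


def format_labeled_examples(examples):
    """Hypothetical formatter: render objects with .text and .label as numbered lines."""
    lines = []
    for i, ex in enumerate(examples, start=1):
        # Attribute access mirrors how the fields of an example are read (ex.text, ex.label).
        lines.append(f"[{i}] text: {ex.text} | label: {ex.label}")
    return "\n".join(lines)


# Stand-in data; in the thread's setting these would be dspy.Example objects.
dataset = [
    SimpleNamespace(text="my long string #0", label="shorter string #0"),
    SimpleNamespace(text="my long string #1", label="shorter string #1"),
]
rendered = format_labeled_examples(dataset)
print(rendered)
```

A function like this would be supplied through the format keyword argument of dspy.InputField, in the same way as dsp.passages2text.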
You may want to define a dspy.Module class and in the forward function take any argument names you like but pass examples into the chain of thought method. Let me know if I should share an example of that.
Yeah, I suspect I'm doing something wrong, since I don't need to pass anything to forward (awaiting your paper so I can understand the abstractions better!). Right now my Module looks something like the following (in practice there's more going on, but I think this gets the main idea across):
import random

import dspy

class Summarizer(dspy.Module):
    def __init__(
        self,
        trainset: list[dspy.Example],
        num_iters: int = 4,
        items_per_sample: int = 10,
    ):
        super().__init__()
        self.trainset = trainset
        self.num_iters = num_iters
        self.items_per_sample = items_per_sample
        self.generate_summary = dspy.ChainOfThought(GenerateSummary)

    def forward(self):
        outputs = []
        # num_iters is an int, so iterate over range(num_iters)
        for _ in range(self.num_iters):
            train_ex_sample = random.sample(self.trainset, k=self.items_per_sample)
            result = self.generate_summary(examples=train_ex_sample)  # throws the error here
            outputs.append(result)
        return outputs
In the long run, I could see using simple retrieval to target diversity in the sampling.
I'm not 100% clear on the reason for passing the training set into forward. I don't think you're trying to train the program (if you are, I'd use one of the teleprompters instead).
But this works fwiw:
import dspy
import random
from dsp import passages2text

trainset = [dspy.Example(text=f"my long string #{idx}", label=f"shorter string #{idx}") for idx in range(3)]
trainset = [x.with_inputs('text') for x in trainset]
devset = [dspy.Example(text=f"my long string #{idx}", label=f"shorter string #{idx}") for idx in range(3)]
devset = [x.with_inputs('text') for x in devset]

class GenerateSummary(dspy.Signature):
    """Generate a summary of the provided examples"""
    examples = dspy.InputField(desc="A balanced set of examples", format=passages2text)
    summary = dspy.OutputField(desc="A straightforward summary")

class Summarizer(dspy.Module):
    def __init__(
        self,
        trainset,
        num_iters: int = 2,
        items_per_sample: int = 2,
    ):
        super().__init__()
        self.trainset = trainset
        self.devset = devset  # read from the module-level devset defined above
        self.num_iters = num_iters
        self.items_per_sample = items_per_sample
        # submodules
        self.generate_summary = dspy.ChainOfThought(GenerateSummary)

    def forward(self):
        outputs = []
        for _ in range(self.num_iters):
            train_ex_sample = random.sample(self.trainset, k=self.items_per_sample)
            # pass plain strings so passages2text receives a list of text
            result = self.generate_summary(examples=[x.text for x in train_ex_sample])
            outputs.append(result)
        return outputs

summarizer = Summarizer(trainset=trainset)
summarizer()
Excellent, thank you so much! Yeah, I'm not training here. Passing the data directly to forward would probably be fine too.
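For reference, a rough sketch of the "pass the data directly to forward" variant. To keep it runnable without DSPy or an LM, the class below is a plain Python class (in a real program it would subclass dspy.Module), SimpleNamespace stands in for dspy.Example, and fake_generate_summary is a hypothetical placeholder for dspy.ChainOfThought(GenerateSummary).

```python
import random
from types import SimpleNamespace


class Summarizer:  # assumption: would subclass dspy.Module in a real program
    def __init__(self, generate_summary, num_iters=2, items_per_sample=2):
        self.generate_summary = generate_summary
        self.num_iters = num_iters
        self.items_per_sample = items_per_sample

    def forward(self, dataset):
        """Sample from the dataset given at call time and summarize each sample."""
        outputs = []
        for _ in range(self.num_iters):
            sample = random.sample(dataset, k=self.items_per_sample)
            # Pass plain strings, as in the working example above.
            outputs.append(self.generate_summary(examples=[ex.text for ex in sample]))
        return outputs


def fake_generate_summary(examples):
    """Hypothetical placeholder predictor: reports how many examples it received."""
    return f"summary of {len(examples)} examples"


data = [SimpleNamespace(text=f"my long string #{idx}") for idx in range(3)]
results = Summarizer(fake_generate_summary).forward(data)
print(results)
```

With this layout the module keeps only its configuration and submodules, and the same instance can be reused on different datasets.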