Template issues when fields are missing from demos

Open thomasahle opened this issue 11 months ago • 4 comments

Consider this DSPy program:

    import dspy
    from dspy.teleprompt import LabeledFewShot
    from dspy.utils import DummyLM

    demos = [dspy.Example(input="What is the speed of light?", output="3e8")]
    program = LabeledFewShot(k=len(demos)).compile(
        student=dspy.TypedPredictor("input -> thoughts, output"),
        trainset=[ex.with_inputs("input") for ex in demos],
    )
    dspy.settings.configure(lm=DummyLM(["My thoughts", "Paris"]))
    assert program(input="What is the capital of France?").output == "Paris"

You would expect the output of inspect_history(n=1) to look like:

    Given the fields `input`, produce the fields `thoughts`, `output`.

    ---

    Follow the following format.

    Input: ${input}
    Thoughts: ${thoughts}
    Output: ${output}

    ---

    Input: What is the speed of light?
    Output: 3e8

    ---

    Input: What is the capital of France?
    Thoughts: My thoughts
    Output: Paris

Or in some other reasonable way handle the lack of "thoughts" in the labeled data.

However, what we get instead is:

    Given the fields `input`, produce the fields `thoughts`, `output`.

    ---

    Follow the following format.

    Input: ${input}
    Thoughts: ${thoughts}
    Output: ${output}

    ---

    Input: What is the speed of light?
    Output: 3e8

    Input: What is the capital of France?
    Thoughts: My thoughts
    Output:Paris

This has a big problem: the --- line between the demo and the final question is missing, so nothing but a blank line separates them. Whatever solution we adopt for fields missing from examples, this shouldn't be it.
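A minimal sketch of the separator behaviour one would expect, written in plain Python rather than taken from DSPy's template code. The names `render_example` and `render_prompt` are hypothetical, and the sketch covers only the demo section of the prompt: a demo that lacks a field omits that line, and every block is still separated by ---.

```python
from typing import Mapping, Sequence

SEPARATOR = "---"


def render_example(fields: Sequence[str], values: Mapping[str, str]) -> str:
    """Render one example, skipping any field that has no value."""
    lines = [f"{name.capitalize()}: {values[name]}" for name in fields if name in values]
    return "\n".join(lines)


def render_prompt(
    fields: Sequence[str],
    demos: Sequence[Mapping[str, str]],
    query: Mapping[str, str],
) -> str:
    """Render all demos followed by the query, one separator between blocks."""
    blocks = [render_example(fields, demo) for demo in demos]
    blocks.append(render_example(fields, query))
    return f"\n\n{SEPARATOR}\n\n".join(blocks)


# Hypothetical usage mirroring the program above: the demo has no "thoughts".
prompt = render_prompt(
    ["input", "thoughts", "output"],
    demos=[{"input": "What is the speed of light?", "output": "3e8"}],
    query={"input": "What is the capital of France?"},
)
print(prompt)
```

Joining the blocks with the separator, instead of emitting it as part of each demo, means a demo with missing fields cannot take the separator down with it.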

Things get even worse if the last field is missing, rather than a field in the middle.

Consider this DSPy program:

    demos = [dspy.Example(input="What is the speed of light?", output="3e8")]
    program = LabeledFewShot(k=len(demos)).compile(
        student=dspy.TypedPredictor("input -> output, thoughts"),
        trainset=[ex.with_inputs("input") for ex in demos],
    )
    dspy.settings.configure(lm=DummyLM(["My thoughts", "Paris"]))
    assert program(input="What is the capital of France?").output == "Paris"

Here I moved "thoughts" to come after "output". Now I get this trace:

    Given the fields `input`, produce the fields `output`, `thoughts`.

    ---

    Follow the following format.

    Input: ${input}
    Output: ${output}
    Thoughts: ${thoughts}

    ---

    Input: What is the capital of France?
    Output: My thoughts
    Thoughts:Paris

The demo has disappeared from the prompt completely. This confused me for quite a while.
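Until the template handles this case, a pre-flight check can at least make the situation visible before compiling. The sketch below is plain Python over dictionaries and is not part of DSPy; `missing_fields` is a hypothetical helper that reports which signature fields each demo lacks.

```python
from typing import Dict, List, Mapping, Sequence


def missing_fields(
    fields: Sequence[str], demos: Sequence[Mapping[str, str]]
) -> Dict[int, List[str]]:
    """Map each demo index to the signature fields that demo does not provide."""
    report: Dict[int, List[str]] = {}
    for index, demo in enumerate(demos):
        absent = [name for name in fields if name not in demo]
        if absent:
            report[index] = absent
    return report


# Hypothetical usage with the demo from this issue, expressed as a plain dict.
demos = [{"input": "What is the speed of light?", "output": "3e8"}]
report = missing_fields(["input", "output", "thoughts"], demos)
for index, absent in report.items():
    print(f"demo {index} is missing: {', '.join(absent)}")
```

A check like this only surfaces the mismatch; it does not change how the prompt is rendered.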

thomasahle, Mar 14 '24 20:03