Python Planner not _quite_ getting things right
Describe the bug
This isn't exactly a bug, but more a question about the meta-meta-prompting techniques required for semantic functions and for the asks given to the planner.
This all relates to #1064, where I'm trying to add a GroundingSkill to the samples.
The grounding skill has two semantic functions:
- ExtractEntities
- ReferenceCheckEntities
The basic idea is that the user would supply a questionable text and a grounding text, with the goal of finding out whether the former is supported by the latter. In the long term this would be done by examining claims, but looking for named entities is a good starting point. Hence the two functions: my goal is that the planner should use the ExtractEntities function on the 'questionable' text, and then pass the list of extracted entities, together with the grounding text, to the ReferenceCheckEntities function.
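Written out by hand, the flow I want the planner to reproduce would look roughly like this. This is only a sketch: the service-registration call, the skills directory path, the example_entities value and the usage at the bottom are my own illustration, and the exact SDK calls may differ between semantic-kernel versions.

```python
import asyncio

import semantic_kernel as sk
from semantic_kernel.connectors.ai.open_ai import OpenAITextCompletion

# Kernel setup; the registration call may differ slightly between SK versions.
api_key, org_id = sk.openai_settings_from_dot_env()
kernel = sk.Kernel()
kernel.add_text_completion_service(
    "davinci", OpenAITextCompletion("text-davinci-003", api_key, org_id)
)

# The skills directory path is illustrative; the GroundingSkill folder holds the
# ExtractEntities and ReferenceCheckEntities semantic functions.
grounding = kernel.import_semantic_skill_from_directory("./skills", "GroundingSkill")


async def check_grounding(summary_text: str, original_text: str) -> str:
    # Step 1: extract entities from the questionable (summary) text.
    ctx = kernel.create_new_context()
    ctx["input"] = summary_text
    ctx["topic"] = "transportation"
    ctx["example_entities"] = "car, truck, bicycle"  # illustrative value
    entities = await grounding["ExtractEntities"].invoke_async(context=ctx)

    # Step 2: check those entities against the grounding (original) text.
    ctx2 = kernel.create_new_context()
    ctx2["input"] = str(entities)
    ctx2["reference_context"] = original_text
    result = await grounding["ReferenceCheckEntities"].invoke_async(context=ctx2)
    return str(result)


print(asyncio.run(check_grounding(
    "I like cars and trucks",
    "My sister and I both like aeroplanes and trucks",
)))
```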
To Reproduce
I'm setting up the 'ask' as:
I have been given the following summary text:
[SUMMARY_TEXT]
I like cars and trucks
[/SUMMARY_TEXT]
This was based on the following original text:
[ORIGINAL_TEXT]
My sister and I both like aeroplanes and trucks
[/ORIGINAL_TEXT]
Make a list of things related to transportation which are in the summary, but which are not grounded in the original.
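The ask is then handed to the planner roughly as below. Again a sketch: it assumes the kernel and GroundingSkill set up as above and uses the SDK's BasicPlanner; the module path and the generated_plan attribute should be checked against the installed version.

```python
import asyncio

from semantic_kernel.planning.basic_planner import BasicPlanner

ask = """I have been given the following summary text:
[SUMMARY_TEXT]
I like cars and trucks
[/SUMMARY_TEXT]
This was based on the following original text:
[ORIGINAL_TEXT]
My sister and I both like aeroplanes and trucks
[/ORIGINAL_TEXT]
Make a list of things related to transportation which are in the summary, but which are not grounded in the original."""

planner = BasicPlanner()
plan = asyncio.run(planner.create_plan_async(ask, kernel))

# The generated plan is a JSON string; what I see is shown below.
print(plan.generated_plan)
```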
When I request the planner to show me its results, I get:
```json
{
    "input": "[ORIGINAL_TEXT]My sister and I both like aeroplanes and trucks[/ORIGINAL_TEXT]",
    "subtasks": [
        {"function": "GroundingSkill.ExtractEntities",
         "args": {"topic": "transportation"}},
        {"function": "GroundingSkill.ReferenceCheckEntities",
         "args": {"reference_context": "[SUMMARY_TEXT]I like cars and trucks[/SUMMARY_TEXT]"}}
    ]
}
```
Expected behavior
I should say that the result is close. The functions are being called in the correct order, and the topic argument to ExtractEntities is what I would expect (although the example_entities argument has been omitted). However:
- When taking the texts, the planner has included the [TAG] markers
- The texts have been switched, with the input being the original and the grounding context being the summary

This switching of texts is what one would want when checking for catastrophic omissions (and is the reason the functions are separate), but it is the opposite of what the ask requests.
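For comparison, the plan I was hoping for would look roughly like this (same shape as the output above; the example_entities value is purely an illustration):

```json
{
    "input": "I like cars and trucks",
    "subtasks": [
        {"function": "GroundingSkill.ExtractEntities",
         "args": {"topic": "transportation", "example_entities": "car, truck, bicycle"}},
        {"function": "GroundingSkill.ReferenceCheckEntities",
         "args": {"reference_context": "My sister and I both like aeroplanes and trucks"}}
    ]
}
```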
Desktop (please complete the following information):
- OS: Win11 with Python 3.11
- Endpoint: text-davinci-003 (for real this time :-/ )
Additional context
This may be related to #1063 since the semantic functions both have more than one argument.
@lemillermicrosoft - can you take a look at this?
Yeah, this is tricky. Our general advice would be to use a more capable model like GPT-4 to do planning; we've found that it performs a lot better. Other than that, one thing you can try is to include examples of the functions you want within the planner prompt itself, following the few-shot learning paradigm.
It also might be better to implement a custom planner that is more tuned to what you want it to do.
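For the few-shot suggestion above, a custom planner prompt containing a worked example could be passed in roughly like this. A sketch only: the prompt text and example are made up, and the prompt parameter of create_plan_async and the {{$available_functions}} / {{$goal}} placeholders (which mirror the stock BasicPlanner prompt) should be checked against the installed SDK version.

```python
import asyncio

from semantic_kernel.planning.basic_planner import BasicPlanner

# Illustrative few-shot planner prompt with one worked GroundingSkill example.
CUSTOM_PLANNER_PROMPT = """You are a planner. Given a goal and a list of available
functions, produce a JSON plan with an "input" field and a "subtasks" list.

Example:
Goal: find which entities in the summary "I saw a dog" are not grounded in "I saw a cat".
Plan:
{"input": "I saw a dog",
 "subtasks": [
   {"function": "GroundingSkill.ExtractEntities",
    "args": {"topic": "animals", "example_entities": "dog, cat"}},
   {"function": "GroundingSkill.ReferenceCheckEntities",
    "args": {"reference_context": "I saw a cat"}}]}

[AVAILABLE FUNCTIONS]
{{$available_functions}}
[END AVAILABLE FUNCTIONS]

[GOAL]
{{$goal}}
[END GOAL]
Plan:
"""

# 'ask' and 'kernel' as set up earlier in the thread.
planner = BasicPlanner()
plan = asyncio.run(planner.create_plan_async(ask, kernel, prompt=CUSTOM_PLANNER_PROMPT))
print(plan.generated_plan)
```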
Shortly after I first posted this, I did discover I could inject my own prompt, and that did help (I don't have a handy GPT-4 endpoint to use, unfortunately). However, making the prompt too close to what I'm trying to do does sort of reduce the benefit of the planner.
@riedgar-ms Yeah, adding additional context to the prompt is always helpful. Making use of things like embeddings and memory can supercharge the planner. I'd definitely encourage you to check that out.
Closing this issue for now, but please raise a new one with your new discoveries!