Fine-tuning for entity extraction?
Hi Team, fine-tuning for entity extraction seems to be straightforward in a zero-shot setting; however, when fine-tuning davinci or ada for entity extraction, the model seems to hallucinate. Are there any set prompts / recommended ways to fine-tune the model?
Hard to say without seeing your data, but here are a few suggestions:
- Use a stop sequence (e.g., " END") in your fine-tuning completions so the model can signal a stop instead of continuing to generate hallucinated text
- Include diverse examples in your training data (e.g., if you're seeing hallucination on examples with no extracted entities, make sure you have training examples with no extracted entities)
- Include a large quantity of training data (hard to give exact rules of thumb, but I'd guess 1,000+ examples; if you have a quality metric you can see how that quality improves with training set size to estimate the returns on increasing the training set size)
- Use a function to check that the extracted entities are string matches against the original text, and filter out those that are not (see the sketch after this list)
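For the last point, here is a minimal sketch of such a check, assuming completions use the `(TAG) ... (\TAG)` markup shown later in this thread; the function name and regex are illustrative, not something prescribed by the cookbook:

```python
import re

def filter_hallucinated_entities(source_text, completion):
    """Keep only entity spans whose text appears verbatim in the source.

    Assumes spans are marked up as "(TAG) span text (\TAG)"; adjust the
    pattern to match your own completion format.
    """
    kept, dropped = [], []
    # Find spans like "(PERSON) Barack Obama (\PERSON)"
    for tag, span in re.findall(r"\((\w+)\)\s*(.*?)\s*\(\\\1\)", completion):
        if span in source_text:
            kept.append((tag, span))
        else:
            dropped.append((tag, span))
    return kept, dropped

# Example usage
text = "Barack Obama was a former president of the United States"
completion = (
    "(PERSON) Barack Obama (\\PERSON) was a former president of the "
    "(COUNTRY) United States (\\COUNTRY) END"
)
print(filter_hallucinated_entities(text, completion))
```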
Thank you for the reply! Yes, I did include 500+ diverse training samples.
Are there any suggested templates to follow?
For instance, I tried the following template:
Input: "Barack Obama was a former president of the United States" END
Output: "(PERSON) Barack Obama (\PERSON) was a former president of the (COUNTRY) United States (\COUNTRY)"
This fine-tuning setup did not produce good results; instead, I got sentences that read like continuations of the input sentence.
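For reference, a minimal sketch of how examples in that template might be serialized to the prompt/completion JSONL layout used by the legacy fine-tuning endpoint; the `\n\n###\n\n` separator, leading space, and " END" stop sequence here are assumptions about configuration, not something prescribed in this thread:

```python
import json

examples = [
    {
        "text": "Barack Obama was a former president of the United States",
        "tagged": "(PERSON) Barack Obama (\\PERSON) was a former president "
                  "of the (COUNTRY) United States (\\COUNTRY)",
    },
]

with open("entity_extraction.jsonl", "w") as f:
    for ex in examples:
        record = {
            # Fixed separator so the model learns where the input ends
            "prompt": ex["text"] + "\n\n###\n\n",
            # Leading space plus an explicit stop sequence at the end,
            # matching the stop parameter used at inference time
            "completion": " " + ex["tagged"] + " END",
        }
        f.write(json.dumps(record) + "\n")
```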
Can't help, unfortunately. Good luck in your experimentation.