fabricator
Idea on how to structure generation / annotation
Instead of the single unified generation function we have now, we might want to restructure the repo so that users can choose between different approaches, such as:
For Generation:
ZEROGEN: efficient zero-shot learning via dataset generation (paper)
PROGEN: progressive dataset generation via in-context feedback (paper)
For Annotation:
CALIBRATION: prompt-based zero-shot learning with calibration (paper)
...
Finally, we should keep the option for users to generate datasets their own way, defining their own sampling strategy, sample information criterion, etc.
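
One way to picture this structure is a small strategy registry: named approaches are looked up by string, while a user-supplied callable bypasses the registry entirely. The sketch below is only an illustration under assumptions. `register_strategy`, `generate`, and the `(prompt, num_samples)` signature are hypothetical names, not part of the current repo, and the registered function is a placeholder that does not implement ZEROGEN.

```python
from typing import Callable, Dict, List, Union

# Hypothetical signature for a generation approach: (prompt, num_samples) -> samples.
GenerationStrategy = Callable[[str, int], List[str]]

# Hypothetical registry mapping an approach name to its implementation.
_STRATEGIES: Dict[str, GenerationStrategy] = {}


def register_strategy(name: str) -> Callable[[GenerationStrategy], GenerationStrategy]:
    """Decorator that stores a generation approach under the given name."""
    def decorator(fn: GenerationStrategy) -> GenerationStrategy:
        _STRATEGIES[name] = fn
        return fn
    return decorator


@register_strategy("zerogen")
def zerogen_placeholder(prompt: str, num_samples: int) -> List[str]:
    # Placeholder only: this does NOT implement the ZEROGEN method.
    # A real version would call a language model here.
    return [f"{prompt} [zerogen sample {i}]" for i in range(num_samples)]


def generate(
    prompt: str,
    num_samples: int,
    strategy: Union[str, GenerationStrategy] = "zerogen",
) -> List[str]:
    """Dispatch to a registered approach, or to a user-defined callable."""
    if callable(strategy):
        # Users keep full control by passing their own function.
        return strategy(prompt, num_samples)
    if strategy not in _STRATEGIES:
        raise ValueError(f"Unknown generation strategy: {strategy!r}")
    return _STRATEGIES[strategy](prompt, num_samples)


# Picking a built-in approach by name:
samples = generate("Write a movie review.", 2, strategy="zerogen")

# Supplying a custom strategy with its own sampling logic:
custom = generate("Write a movie review.", 1, strategy=lambda p, n: [p.upper()] * n)
```

The same pattern could be mirrored for annotation approaches, with a separate registry so that generation and annotation strategies do not share a namespace.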