fabricator
[EMNLP 2023 Demo] fabricator - annotating and generating datasets with large language models.
Initial draft/concept for symbolic text generation with [Outlines](https://github.com/outlines-dev/outlines).
* The draft currently only includes the generation of JSON text (Outlines also supports integers, choices, and regexes)
* I build a...
Uniform sampling always draws from 1 class out of all classes, whereas stratified sampling draws 1 example per class. This might be confusing and should be improved.
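The difference between the two strategies can be sketched as follows. This is a minimal illustration under one reading of the issue text, not the repository's actual implementation: the function names `sample_uniform` and `sample_stratified`, the `label_key` parameter, and the dict-based example format are all hypothetical.

```python
import random
from collections import defaultdict


def group_by_label(examples, label_key="label"):
    """Group examples (plain dicts) by their label value."""
    by_label = defaultdict(list)
    for example in examples:
        by_label[example[label_key]].append(example)
    return by_label


def sample_uniform(examples, k=1, label_key="label", rng=None):
    """Hypothetical 'uniform' strategy: pick ONE class at random,
    then draw up to k examples from that class only."""
    rng = rng or random.Random()
    by_label = group_by_label(examples, label_key)
    chosen_label = rng.choice(sorted(by_label))
    pool = by_label[chosen_label]
    return rng.sample(pool, min(k, len(pool)))


def sample_stratified(examples, label_key="label", rng=None):
    """Hypothetical 'stratified' strategy: draw exactly one example
    from EVERY class, so each label is represented once."""
    rng = rng or random.Random()
    by_label = group_by_label(examples, label_key)
    return [rng.choice(by_label[label]) for label in sorted(by_label)]


few_shot_pool = [
    {"text": "great movie", "label": "pos"},
    {"text": "loved it", "label": "pos"},
    {"text": "terrible plot", "label": "neg"},
    {"text": "fell asleep", "label": "neg"},
    {"text": "it was okay", "label": "neutral"},
]

rng = random.Random(0)
uniform_batch = sample_uniform(few_shot_pool, k=2, rng=rng)
stratified_batch = sample_stratified(few_shot_pool, rng=rng)
```

With this reading, `uniform_batch` only ever contains examples of a single label, while `stratified_batch` contains one example for each of the three labels, which is the asymmetry the issue describes as potentially confusing.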
Paper: https://aclanthology.org/2022.findings-emnlp.269.pdf They use IF to determine the informativeness of examples which they in turn use as in-context examples to steer the generation process. Works well for classification tasks and...
As discussed with @aynetdia
Tools like [Guidance](https://github.com/guidance-ai/guidance) help during text generation without necessarily improving the prompt. A new [paper](https://arxiv.org/pdf/2307.09702.pdf) called "**Efficient Guided Generation for Large Language Models**" does the same but with...
Instead of having a unified generation function as we have now, we might want to adjust our repo in the future in a direction such that users can pick different...
This issue tracks the progress of adding labels predicted by gpt-3.5 to a subset of the English CoNLL-03 NER data as part of our label noise benchmark.
GitHub Markdown can't take a fixed width; if we use regular HTML tags, we need some logic to switch between light and dark mode. Improve the logo at some point, I can ask...
Goal: understand the information gain from having k few-shot examples compared to producing further examples using an LLM and these k few-shot examples.
For example: 0 (pos), 1 (neg) might become 3 (pos), 0 (neg).