fairseq-detect-hallucination

The amount of synthetic data

Open Yuran-Zhao opened this issue 1 year ago • 0 comments

Hi, I personally find your work very interesting!

I have one small question, though: how much synthetic data did you generate to train the final predictor? Is the synthetic data based on $D_{train}$, which contains around 4.77M sentences in total?

Yuran-Zhao • Jun 03 '23 12:06