Diffusion-BERT

GPT mentioned in Figure 3

Open jzhang38 opened this issue 3 years ago • 1 comment

Dear authors,

Thanks for open-sourcing your wonderful work.

You mention GPT in Figure 3 when comparing the Pareto front across different models ("AR models of the same size"). May I ask if this is a pre-trained GPT (e.g. GPT2-small) fine-tuned on the LM1B dataset, or a model with the GPT architecture trained from scratch on the LM1B training set?

jzhang38 · Dec 01 '22 07:12

Hi,

Thank you for your question! We include both models in Figure 3. The red curve, which is rather close to our DiffusionBERT, represents the AR model trained from scratch, and the green one the fine-tuned GPT2. In general, DiffusionBERT still falls behind pre-trained AR models in terms of generation quality.

Best, Zhengfu

Hzfinfdu · Dec 01 '22 07:12
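
For readers unfamiliar with the distinction discussed above, here is a minimal sketch (not the authors' training code) of the two AR baselines using the Hugging Face transformers library: one loads pre-trained GPT-2 small weights for fine-tuning, the other instantiates the same architecture with random weights to train from scratch.

```python
from transformers import GPT2Config, GPT2LMHeadModel

# Baseline 1: pre-trained GPT-2 (small), whose weights would then be
# fine-tuned on the target corpus (e.g. LM1B).
finetuned_gpt2 = GPT2LMHeadModel.from_pretrained("gpt2")

# Baseline 2: the same GPT-2 architecture with randomly initialized
# weights, to be trained from scratch on the LM1B training set only.
scratch_gpt = GPT2LMHeadModel(GPT2Config())
```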