Zhengfu He

39 comments by Zhengfu He

Sure! (Continued) pre-training and fine-tuning with the diffusion objective are both supported.

Hi, do you mean `modeling_elasticbert.py`? I received an email reporting this but forgot to update :(

```
Traceback (most recent call last):
  File "predict.py", line 5, in <module>
    from models.modeling_elasticbert import ElasticBertForPreTraining...
```

Hi, I removed the broken import in the last commit https://github.com/Hzfinfdu/Diffusion-BERT/commit/404d222f1535fa971908c2099b227d9b94d55fdc. Did you clone before that? Pulling again may help.

I suspect that `train_data` should be a `datasets.Dataset` object, but somehow it is a dict now. Did you modify the code around there?
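For what it's worth, a quick way to check is shown below (an illustrative snippet, not the repo's code; the dataset name is a placeholder):

```python
from datasets import Dataset, load_dataset

# The training split should be a datasets.Dataset, not a plain dict of columns.
train_data = load_dataset("lm1b", split="train")  # dataset name is illustrative
assert isinstance(train_data, Dataset), f"got {type(train_data)} instead"

# If something upstream turned it into a dict of columns, it can be converted back:
# train_data = Dataset.from_dict({"text": ["..."]})
```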

Hi, yes, we generate all tokens in one diffusion step. We use DDIM sampling to predict $x_0$ and obtain $x_{t-1}$ from the forward process. The demonstration in Table 1 shows...

Yes, that's right. DDIM sampling helps trade off speed against generation quality, and predicting $x_0$ directly is closer to the MLM training objective.
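For concreteness, here is a minimal sketch of one reverse step under an absorbing-state (mask-based) forward process: the model predicts $x_0$ from $x_t$ the way an MLM would, and $x_{t-1}$ is obtained by re-applying the forward noising at step $t-1$. The function name, the `mask_probs` schedule, and the model call signature are illustrative assumptions, not the repo's actual API.

```python
import torch

@torch.no_grad()
def ddim_style_step(model, x_t, t, mask_id, mask_probs):
    """One reverse step: predict x_0 from x_t, then re-noise to x_{t-1}.

    mask_probs[s] is the marginal probability that a token is still
    masked at step s under the forward process (illustrative schedule;
    mask_probs[0] should be ~0 so that the final step returns x_0).
    """
    # Predict x_0 directly, as in MLM: logits over the vocabulary.
    logits = model(x_t)                      # (batch, seq_len, vocab)
    x0_hat = logits.argmax(dim=-1)           # greedy decode of x_0

    # Re-apply the forward process at step t-1: independently re-mask
    # each token of x0_hat with probability mask_probs[t - 1].
    remask = torch.rand_like(x0_hat, dtype=torch.float) < mask_probs[t - 1]
    return torch.where(remask, torch.full_like(x0_hat, mask_id), x0_hat)

# Usage sketch: start from an all-[MASK] sequence and iterate t = T..1:
#   x = torch.full((1, seq_len), mask_id)
#   for t in range(T, 0, -1):
#       x = ddim_style_step(model, x, t, mask_id, mask_probs)
```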

Hi, in fact $H(x_0^i)$ can be calculated in many ways. We calculate the entropy of each token as the negative logarithm of its frequency in the tokenized training corpus. Since...
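As a rough sketch of that calculation (illustrative code, not the repo's implementation; any normalization or smoothing may differ):

```python
from collections import Counter
import math

def token_information(corpus_token_ids):
    """-log relative frequency of each token id in the tokenized corpus.

    Rare tokens get a high value, frequent tokens a low one; this is the
    per-token quantity H(x_0^i) computed from corpus statistics.
    """
    counts = Counter(tid for seq in corpus_token_ids for tid in seq)
    total = sum(counts.values())
    return {tid: -math.log(c / total) for tid, c in counts.items()}
```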

This was initially used to evaluate the perplexity of the generated sentences; it has since been removed from predict.py in the final codebase.
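For reference, generated-sentence perplexity is typically scored with an off-the-shelf causal LM. A minimal sketch with Hugging Face `transformers` follows; the choice of GPT-2 here is an assumption, not necessarily what was used in the original script.

```python
import math
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

# gpt2 is an illustrative scoring model, not necessarily the authors' choice.
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

@torch.no_grad()
def perplexity(sentence: str) -> float:
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    loss = model(ids, labels=ids).loss  # mean per-token negative log-likelihood
    return math.exp(loss.item())
```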

Hi, thank you for your question! I have to admit that we made a mistake in that statement, and we will remove it in later versions. Nevertheless, we think the...

Hi, we computed the BLEU score using all test data as references and reported the average BLEU score over the generated sentences. We sampled 1K sentences in each case for evaluating BLEU...
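A minimal sketch of that protocol with NLTK (illustrative; the exact smoothing method and tokenization are assumptions):

```python
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

def avg_bleu(generated, references):
    """Average sentence-level BLEU of generated samples against the
    whole test set used as references.

    generated:  list of token lists, e.g. 1K sampled sentences
    references: list of token lists, the full test set
    """
    smooth = SmoothingFunction().method1  # avoids zero scores on short hypotheses
    scores = [
        sentence_bleu(references, hyp, smoothing_function=smooth)
        for hyp in generated
    ]
    return sum(scores) / len(scores)
```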