Zhengfu He

39 comments by Zhengfu He

Sure! (Continued) pre-training and fine-tuning with the diffusion objective are both supported.

Hi, do you mean `modeling_elasticbert.py`? I received an email reporting this but forgot to update :(

```
Traceback (most recent call last):
  File "predict.py", line 5, in <module>
    from models.modeling_elasticbert import ElasticBertForPreTraining...
```

Hi, I removed the broken import in the last commit https://github.com/Hzfinfdu/Diffusion-BERT/commit/404d222f1535fa971908c2099b227d9b94d55fdc. Did you clone before that? Pulling again may help.

I suspect that `train_data` should be a `datasets.Dataset` object, but somehow it is a dict now. Did you modify the code around there?
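For what it's worth, a quick way to check is shown below (an illustrative snippet, not the repo's code; the dataset name is a placeholder):

```python
from datasets import Dataset, load_dataset

# The training split should be a datasets.Dataset, not a plain dict of columns.
train_data = load_dataset("lm1b", split="train")  # dataset name is illustrative
assert isinstance(train_data, Dataset), f"got {type(train_data)} instead"

# If something upstream turned it into a dict of columns, it can be converted back:
# train_data = Dataset.from_dict({"text": ["..."]})
```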

Hi, yes, we generate all tokens in one diffusion step. We use DDIM sampling to predict $x_0$ and obtain $x_{t-1}$ from the forward process. The demonstration in Table 1 shows...

Yes, that's right. DDIM sampling helps trade off speed against generation quality, and predicting $x_0$ directly is closer to the MLM training objective.
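For concreteness, here is a minimal sketch of one reverse step under an absorbing-state (mask-based) forward process: the model predicts $x_0$ from $x_t$ the way an MLM would, and $x_{t-1}$ is obtained by re-applying the forward noising at step $t-1$. The function name, the `mask_probs` schedule, and the model call signature are illustrative assumptions, not the repo's actual API.

```python
import torch

@torch.no_grad()
def ddim_style_step(model, x_t, t, mask_id, mask_probs):
    """One reverse step: predict x_0 from x_t, then re-noise to x_{t-1}.

    mask_probs[s] is the marginal probability that a token is still
    masked at step s under the forward process (illustrative schedule;
    mask_probs[0] should be ~0 so that the final step returns x_0).
    """
    # Predict x_0 directly, as in MLM: logits over the vocabulary.
    logits = model(x_t)                      # (batch, seq_len, vocab)
    x0_hat = logits.argmax(dim=-1)           # greedy decode of x_0

    # Re-apply the forward process at step t-1: independently re-mask
    # each token of x0_hat with probability mask_probs[t - 1].
    remask = torch.rand_like(x0_hat, dtype=torch.float) < mask_probs[t - 1]
    return torch.where(remask, torch.full_like(x0_hat, mask_id), x0_hat)

# Usage sketch: start from an all-[MASK] sequence and iterate t = T..1:
#   x = torch.full((1, seq_len), mask_id)
#   for t in range(T, 0, -1):
#       x = ddim_style_step(model, x, t, mask_id, mask_probs)
```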

Hi, in fact $H(x_0^i)$ can be calculated in many ways. We calculate the entropy of each token as the negative logarithm of its frequency in the tokenized training corpus. Since...
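As a rough sketch of that calculation (illustrative code, not the repo's implementation; any normalization or smoothing may differ):

```python
from collections import Counter
import math

def token_information(corpus_token_ids):
    """-log relative frequency of each token id in the tokenized corpus.

    Rare tokens get a high value, frequent tokens a low one; this is the
    per-token quantity H(x_0^i) computed from corpus statistics.
    """
    counts = Counter(tid for seq in corpus_token_ids for tid in seq)
    total = sum(counts.values())
    return {tid: -math.log(c / total) for tid, c in counts.items()}
```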

This was initially used to evaluate the perplexity of the generated sentences; it has since been removed from predict.py in the final codebase.
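For reference, generated-sentence perplexity is typically scored with an off-the-shelf causal LM. A minimal sketch with Hugging Face `transformers` follows; the choice of GPT-2 here is an assumption, not necessarily what was used in the original script.

```python
import math
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

# gpt2 is an illustrative scoring model, not necessarily the authors' choice.
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

@torch.no_grad()
def perplexity(sentence: str) -> float:
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    loss = model(ids, labels=ids).loss  # mean per-token negative log-likelihood
    return math.exp(loss.item())
```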

Hi, thank you for your question! I have to admit that we made a mistake in that statement, and we will remove it in later versions. Nevertheless, we think the...

Hi, we computed the BLEU score using all test data as references and reported the average BLEU score over the generated sentences. We sampled 1K sentences in each case for evaluating BLEU...
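A minimal sketch of that protocol with NLTK (illustrative; the exact smoothing method and tokenization are assumptions):

```python
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

def avg_bleu(generated, references):
    """Average sentence-level BLEU of generated samples against the
    whole test set used as references.

    generated:  list of token lists, e.g. 1K sampled sentences
    references: list of token lists, the full test set
    """
    smooth = SmoothingFunction().method1  # avoids zero scores on short hypotheses
    scores = [
        sentence_bleu(references, hyp, smoothing_function=smooth)
        for hyp in generated
    ]
    return sum(scores) / len(scores)
```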