
ACL'2023: DiffusionBERT: Improving Generative Masked Language Models with Diffusion Models

23 Diffusion-BERT issues

Greetings, I am currently working on diffusion for text generation as well. In your paper you included the PPL of DiffusionLM in your results for comparison. I would...

Running

```
python predict_downstream_condition.py --ckpt_path model_name_roberta-base_taskname_qqp_lr_3e-05_seed_42_numsteps_2000_sample_Categorical_schedule_mutual_hybridlambda_0.0003_wordfreqlambda_0.0_fromscratch_False_timestep_none_ckpts/best\(38899\).th
```

with the standard schedule (num_steps: 2000) fails with:

```
Traceback (most recent call last):
  File "predict_downstream_condition.py", line 101
    model.load_state_dict(ckpt['model'])
  File "/opt/conda/envs/diff/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1672, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: ...
```

In the function `discrete_diffusion_predict_fn()`, `self.device()` is called, but `self` is not defined in this function. Code snippet where `self.device()` raises the error: ``` if predict_x0: init_state =...
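One way to work around this, assuming the caller has the model in scope, is to derive the device from the model's parameters instead of referencing an undefined `self`. The signature and tensor initialization below are an illustrative sketch, not the repository's actual code:

```python
import torch

def discrete_diffusion_predict_fn(model, input_ids, predict_x0=True, device=None):
    # Resolve the device from the model's own parameters rather than
    # an undefined `self`; allow an explicit override via `device=`.
    if device is None:
        device = next(model.parameters()).device
    if predict_x0:
        # Illustrative: initialize the sampling state on that device.
        init_state = torch.zeros_like(input_ids, device=device)
        return init_state
    return input_ids.to(device)
```

Passing `device=` explicitly also works, e.g. `discrete_diffusion_predict_fn(model, ids, device=torch.device("cpu"))`.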

Thanks for the code release! Heads up for other users who want to resume training from a checkpoint: you will want to 1. de-indent DDP_main.py:80 so that all devices can...

When I run word_freq.py, the following error occurs:

```
Traceback (most recent call last):
  File "C:\GithubProjects\Diffusion-BERT-main\word_freq.py", line 18
    for iid in data['input_ids']:
TypeError: string indices must be integers
```
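A common cause of this error is iterating over a raw string where a parsed record was expected, e.g. when each line of the corpus file is read as text instead of being decoded first. A minimal sketch assuming a JSON-lines input (the actual format word_freq.py expects may differ):

```python
import json

# Each line of a JSON-lines corpus file, read as a raw string.
line = '{"input_ids": [101, 2009, 102]}'

# Indexing the string directly reproduces the reported error:
# TypeError: string indices must be integers.
try:
    line['input_ids']
except TypeError as e:
    print(f"reproduced: {e}")

# Decoding the line first yields a dict, and iteration works as intended.
data = json.loads(line)
for iid in data['input_ids']:
    assert isinstance(iid, int)
```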

Hi, when I try to load the checkpoint, it gives the following error:

```
Missing key(s) in state_dict: "bert.embeddings.position_ids", "bert.embeddings.word_embeddings.weight", "bert.embeddings.position_embeddings.weight", "bert.embeddings.token_type_embeddings.weight", "bert.embeddings.LayerNorm.weight", "bert.embeddings.LayerNorm.bias", "bert.encoder.layer.0.attention.self.query.weight", "bert.encoder.layer.0.attention.self.query.bias", ...
```
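When every key in a state_dict is reported missing, the checkpoint's keys usually differ from the model's by a prefix (here, everything lives under `bert.`). A hedged sketch of diagnosing and remapping this, not the repository's actual loading code:

```python
import torch

def load_with_report(model, state_dict):
    # strict=False surfaces the mismatch instead of raising, so a
    # prefix problem shows up in the missing/unexpected key lists.
    result = model.load_state_dict(state_dict, strict=False)
    return result.missing_keys, result.unexpected_keys

def add_prefix(state_dict, prefix="bert."):
    # Hypothetical remap: re-add a prefix that was stripped at save time.
    return {prefix + k: v for k, v in state_dict.items()}
```

If the unexpected keys mirror the missing keys minus a `bert.` prefix, remapping with `add_prefix` and reloading usually resolves the error.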

As said in the second paragraph of Section 4.3, "We attribute the superior performance of DiffusionBERT to its onetime sampling of all tokens". I wonder about the meaning of "onetime sampling...

Hi, I am trying to train a model on a different dataset but the loss doesn't change much. I wonder if you could release the checkpoints so I could first load...