lokking

2 issues by lokking

Hi, I'm very interested in this model. I want to know why the LM parameters are frozen after the first few epochs. As I understand it, the LM parameters are fine-tuned in the early epochs (1~3) and then frozen...
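
For context, the schedule the question describes usually looks something like the sketch below. This is a minimal toy reconstruction, not the repo's actual training loop; the module names, sizes, and the epoch cutoff are all assumptions for illustration.

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for the real model and data.
lm = nn.Linear(8, 8)             # stand-in for the LM backbone
head = nn.Linear(8, 2)           # stand-in for the task-specific head
model = nn.Sequential(lm, head)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

NUM_EPOCHS = 6
FREEZE_AFTER_EPOCH = 3           # fine-tune the LM for epochs 1~3, then freeze

for epoch in range(1, NUM_EPOCHS + 1):
    if epoch == FREEZE_AFTER_EPOCH + 1:
        for p in lm.parameters():
            p.requires_grad = False   # LM weights stop receiving gradients
    x = torch.randn(4, 8)
    loss = model(x).pow(2).mean()     # dummy loss, just to drive backward()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()                  # frozen params (grad=None) are skipped
```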

First, thank you for making your code public. I have a question, though: in the `DataCollatorForDenoisingTasks` code, I see that `batch["decoder_input_ids"] = self.shift_tokens_right(batch["input_ids"])` uses the data in its original order, but `batch["labels"] = self.add_whole_word_mask(batch["input_ids"], do_permutate)`...
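
To make the question concrete, here is a toy reconstruction of the two lines being quoted. The token ids, the simplified `shift_tokens_right`, and the stand-in `add_whole_word_mask` are all assumptions for illustration; the repo's actual routines do considerably more (whole-word span selection, optional permutation, etc.).

```python
import numpy as np

PAD, BOS, EOS, MASK = 0, 1, 2, 3   # hypothetical special-token ids

def shift_tokens_right(input_ids: np.ndarray) -> np.ndarray:
    # Simplified stand-in: prepend a decoder-start token and drop the last
    # position, so the decoder input keeps the ORIGINAL token order.
    shifted = np.zeros_like(input_ids)
    shifted[:, 1:] = input_ids[:, :-1]
    shifted[:, 0] = EOS            # BART uses EOS as the decoder start token
    return shifted

def add_whole_word_mask(input_ids: np.ndarray, do_permutate: bool) -> np.ndarray:
    # Trivial stand-in for the repo's masking routine: mask a single token.
    masked = input_ids.copy()
    masked[:, 2] = MASK
    return masked

batch = {"input_ids": np.array([[BOS, 11, 12, 13, EOS]])}

# The two collator lines the question quotes, side by side:
batch["decoder_input_ids"] = shift_tokens_right(batch["input_ids"])            # original order
batch["labels"] = add_whole_word_mask(batch["input_ids"], do_permutate=False)  # masked version

print(batch["decoder_input_ids"])  # [[ 2  1 11 12 13]]
print(batch["labels"])             # [[ 1 11  3 13  2]]
```

As the printout shows, under this reading the decoder would be fed the clean sequence while the masked sequence lands in `labels`, which is the apparent reversal the question is asking about.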