Ratish Puduppully

14 comments by Ratish Puduppully

Hi @ydshieh, thanks for looking into the issue. At a previous checkpoint (1500), the model produced a good output for the above news article: `The Eiffel Tower is the tallest...

What is surprising is that the eval ROUGE fluctuates a lot until checkpoint 1500, after which it remains close to 0. I have attached below a TensorBoard plot of eval_rouge1...

Even more surprisingly, the LED-Base model seems to be doing quite well! ![image](https://user-images.githubusercontent.com/3006607/179777619-16b51619-eb76-4067-ab1c-0b6d9f6287e1.png) Model output (checkpoint 1600): `The Eiffel Tower in Paris is the tallest structure in the world.`

Hi @ydshieh, I missed mentioning this in the original issue description. I had already experimented with setting the global attention mask during training, but it didn't change the outcome.

I added the line `model_inputs["global_attention_mask"] = [[1 if y == tokenizer.cls_token_id else 0 for y in x] for x in model_inputs["input_ids"]]` to the code right after https://github.com/huggingface/transformers/blob/0d0aada56444ad554021947addaa035feb55948f/examples/pytorch/summarization/run_summarization.py#L536
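For reference, here is a minimal, self-contained sketch of that preprocessing step; the checkpoint name and the sample text are placeholders I chose for illustration, not values taken from the issue:

```python
from transformers import AutoTokenizer

# Placeholder checkpoint for illustration; the issue uses a fine-tuned LED model.
tokenizer = AutoTokenizer.from_pretrained("allenai/led-base-16384")

model_inputs = tokenizer(
    ["The Eiffel Tower is the tallest structure in Paris."],  # sample text only
    max_length=1024,
    padding="max_length",
    truncation=True,
)

# The added line: global attention (1) on the token equal to cls_token_id
# (the leading <s> token for LED), local attention (0) everywhere else.
model_inputs["global_attention_mask"] = [
    [1 if y == tokenizer.cls_token_id else 0 for y in x]
    for x in model_inputs["input_ids"]
]
```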

The error will occur if the number of content plans does not match the number of training examples.

Hi @ghtaro, it is nice to know that you found the code useful. I am not sure about the root cause of the issue you are facing, but as mentioned in https://github.com/ratishsp/data2text-plan-py/issues/26#issuecomment-769032836,...

Hi @happycjksh, I think I now understand the root cause of the issue. It is indeed related to https://github.com/ratishsp/data2text-plan-py/issues/34. The lengths of `train.json` and `train_content_plan.txt` are different because...
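A quick way to confirm the mismatch is to compare the two lengths directly. This is just a sketch, assuming the file names mentioned above and that `train.json` holds a JSON list of records; adjust the paths to wherever your preprocessed data lives:

```python
import json

# Paths are assumptions; point them at your preprocessed data.
with open("train.json", encoding="utf-8") as f:
    train_records = json.load(f)  # expected to be a list of training examples

with open("train_content_plan.txt", encoding="utf-8") as f:
    content_plans = [line for line in f if line.strip()]

print(f"train.json records: {len(train_records)}")
print(f"content plan lines: {len(content_plans)}")
assert len(train_records) == len(content_plans), \
    "Mismatch: regenerate the content plan so there is one line per training record."
```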

Oh, I am not sure why the file is empty. You can use the `train-roto-ptrs.txt` file from https://drive.google.com/drive/folders/1R_82ifGiybHKuXnVnC8JhBTW8BAkdwek
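If you prefer to fetch the shared folder from a script rather than through the browser, something like the following should work, assuming the `gdown` package is installed (this is my suggestion for convenience, not part of the original repository):

```python
import gdown

# Download the shared Google Drive folder that contains train-roto-ptrs.txt.
url = "https://drive.google.com/drive/folders/1R_82ifGiybHKuXnVnC8JhBTW8BAkdwek"
gdown.download_folder(url, output="rotowire_data", quiet=False)
```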