Yixin Liu comments

Results: 21 comments by Yixin Liu

About training time

Hi, 100 epochs is just the default setting. On CNN/DM the model actually reaches its best performance within one epoch. On XSum, it reaches its best performance within 5...

About the scoring mode

Hi, thank you for your interest in our project! > Does 'the score of one summary' or 'the probability to generate one summary' mean the similarity between the output and...

A small trick for memory efficiency

Thanks a lot for the suggestion! I was wondering if you have tried this modification and observed the difference in memory usage? It would be really great if you could...

Is it possible to train Huggingface model using BRIO class?

Hi, thank you for your interest in our work! You may finetune the BRIO Huggingface model using MLE. But to train the model using our method you will need to...

Specified model filename pattern was: #ID#.ref

Hi, were you able to solve this problem? One thing you could check is whether the two files you provided (--ref and --hyp) contain the same number of lines. In...
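
The line-count check suggested above can be sketched in a few lines of Python. This is only a minimal illustration, not part of the BRIO code base: the helper names `count_lines` and `same_line_count` and the file names `test.ref` and `test.hyp` are hypothetical, and the sketch assumes both files are UTF-8 text with one summary per line.

```python
def count_lines(path):
    """Count lines in a text file (a final line without a trailing newline still counts)."""
    with open(path, encoding="utf-8") as f:
        return sum(1 for _ in f)


def same_line_count(ref_path, hyp_path):
    """Return (match, n_ref, n_hyp) for the reference and hypothesis files."""
    n_ref = count_lines(ref_path)
    n_hyp = count_lines(hyp_path)
    return n_ref == n_hyp, n_ref, n_hyp


# Hypothetical usage; replace the paths with the files passed to --ref and --hyp:
# ok, n_ref, n_hyp = same_line_count("test.ref", "test.hyp")
# if not ok:
#     print(f"Line count mismatch: {n_ref} references vs {n_hyp} hypotheses")
```

If the counts differ, a stray blank line or a summary containing a newline character is a plausible cause worth checking first.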

Apply BRIO to other generation tasks

Hi, thank you for your interest in our work. I'd recommend several things: 1. Following the CNNDM setting may not always be suitable, depending on the dataset you are working on....

Apply BRIO to other generation tasks

Thanks @Hannibal046 for the comment. Hi @HillZhang1999, I'm not very familiar with GEC but I think your observation makes sense. It's very critical to have **diverse** candidates to make sure...

GPU usage increasing as training progresses

Hi, > What is the GPU size that is required to train this model? We used the RTX 3090 with 24 GB of GPU memory. > I am currently using eight 32 GB...

GPU usage increasing as training progresses

> Can I train the model with two 24 GB 3090 Ti GPUs? Yes, you should be able to train the model using 2 GPUs. But you will...

Hi, thank you for your interest in our work. I wanted to note that this loss function is adapted from [MatchSum](https://github.com/maszhongming/MatchSum/blob/master/metrics.py#L30). For `TotalLoss`, they have an explanation _here is to...