LoRA Not able to reproduce the scores using provided checkpoint on NLG tasks

Open ylli0218 opened this issue 8 months ago • 5 comments

Hi, I was able to reproduce the GLUE benchmark results but not the NLG task.

For the NLG tasks, I downloaded the GPT2-M checkpoint and followed steps 2, 3, and 4 of the instructions:

https://github.com/microsoft/LoRA/tree/main/examples/NLG

However, the scores were either extremely low or the evaluation failed with errors.

For the E2E task I got:

SCORES: BLEU: 0.0000 NIST: 0.0196 METEOR: 0.0034 ROUGE_L: 0.0072 CIDEr: 0.0000

For WebNLG and DART, I see these errors:

Error: test and reference not same length
ERROR ON COMPUTING METEOR. MAKE SURE YOU HAVE JAVA INSTALLED GLOBALLY ON YOUR MACHINE.

I do have Java installed. The other scores were also low:

BLEU: 0
BLEU NLTK: 0
METEOR: -1
chrF++: 0.11
TER: 1.96
BERT-SCORE P: 0
BERT-SCORE R: 0
BERT-SCORE F1: 0
BLEURT: -1
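The "test and reference not same length" message suggests that the number of generated outputs may not match the number of references. Below is a minimal sketch for checking that, assuming each file holds one entry per non-empty line. The function names are hypothetical and not part of the LoRA evaluation scripts, and the actual file layout used by those scripts may differ.

```python
def count_entries(path):
    """Count non-empty lines in a text file (assumes one entry per line)."""
    with open(path, encoding="utf-8") as f:
        return sum(1 for line in f if line.strip())


def same_length(hyp_path, ref_path):
    """Return (match, hyp_count, ref_count) for a hypothesis/reference pair."""
    n_hyp = count_entries(hyp_path)
    n_ref = count_entries(ref_path)
    return n_hyp == n_ref, n_hyp, n_ref


# Example call (paths are placeholders for the files passed to the evaluator):
# match, n_hyp, n_ref = same_length("hypothesis.txt", "reference.txt")
```

If the two counts differ, the mismatch most likely comes from the decoding or post-processing step rather than from the metric scripts themselves.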

Any suggestions? Thank you.

Oct 12 '23 03:10 ylli0218
