LoRA
Unable to reproduce the NLG task scores using the provided checkpoint
Hi, I was able to reproduce the GLUE benchmark results, but not the results for the NLG tasks.
For the NLG tasks, I downloaded the GPT2-M checkpoint and followed steps 2, 3, and 4 in the instructions:
https://github.com/microsoft/LoRA/tree/main/examples/NLG
However, the scores were either extremely low or the evaluation failed with errors.
For the E2E task I got:
SCORES: BLEU: 0.0000 NIST: 0.0196 METEOR: 0.0034 ROUGE_L: 0.0072 CIDEr: 0.0000
For WebNLG and DART, I see these errors:
Error: test and reference not same length
ERROR ON COMPUTING METEOR. MAKE SURE YOU HAVE JAVA INSTALLED GLOBALLY ON YOUR MACHINE.
I do have Java installed. The other scores were also low:
BLEU | BLEU NLTK | METEOR | chrF++ | TER | BERT-SCORE P | BERT-SCORE R | BERT-SCORE F1 | BLEURT
0 | 0 | -1 | 0.11 | 1.96 | 0 | 0 | 0 | -1
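Since the first error complains that the test and reference are not the same length, one sanity check is to compare the number of lines in the generated hypothesis file against each reference file. Below is a minimal sketch of such a check. It assumes the evaluator expects one output per line with line-aligned reference files, which I have not confirmed from the evaluation scripts. The function names and the file paths in the usage comment are hypothetical, not part of the LoRA repository.

```python
def count_lines(path):
    """Return the total number of lines in a text file, blank lines included."""
    with open(path, encoding="utf-8") as f:
        return sum(1 for _ in f)


def find_length_mismatches(hyp_path, ref_paths):
    """Compare the hypothesis file's line count with each reference file.

    Returns a list of (ref_path, ref_lines, hyp_lines) tuples for every
    reference file whose line count differs from the hypothesis file.
    An empty list means all files are line-aligned.
    """
    hyp_lines = count_lines(hyp_path)
    mismatches = []
    for ref_path in ref_paths:
        ref_lines = count_lines(ref_path)
        if ref_lines != hyp_lines:
            mismatches.append((ref_path, ref_lines, hyp_lines))
    return mismatches


# Hypothetical usage (paths are placeholders, substitute your own files):
# for ref, n_ref, n_hyp in find_length_mismatches("hypothesis.txt",
#                                                 ["reference0.txt", "reference1.txt"]):
#     print(f"{ref}: {n_ref} lines vs {n_hyp} hypothesis lines")
```

If this reports a mismatch, the low scores may come from the decoding or post-processing step writing a different number of outputs than the references contain, rather than from the checkpoint itself.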
Any suggestions? Thank you.