FasterTransformer
FTBart not producing the same results as HuggingFace facebook/bart-large
Branch/Tag/Commit
main
Docker Image Version
nvcr.io/nvidia/pytorch:22.09-py3
GPU name
T4
CUDA Driver
Build cuda_11.8.r11.8/compiler.31833905_0
Reproduced Steps
https://github.com/NVIDIA/FasterTransformer/blob/main/docs/bart_guide.md
Follow the steps to set up the VM.
python ../examples/pytorch/bart/translate_example.py -time '01' -model 'facebook/bart-base'
^^ This yields identical results for FT vs HF.
However, when re-running with bart-large:
python ../examples/pytorch/bart/translate_example.py -time '01' -model 'facebook/bart-large'
The BLEU score drops significantly for FT:
2023-03-23 01:58:22,043 __main__ [INFO] hf-beamsearch translates 30 batches taking 6.54 sec to translate 689 tokens, BLEU score: 6.49, 105 tokens/sec. (593 words, 91 words/sec)
2023-03-23 01:58:22,043 __main__ [INFO] ft-beamsearch translates 30 batches taking 3.33 sec to translate 747 tokens, BLEU score: 3.32, 224 tokens/sec. (463 words, 139 words/sec)
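To help localize the issue, one quick check is to compare the token IDs generated by the two backends sentence by sentence and find the first position where they diverge (e.g. right away, or only deep into decoding). A minimal stdlib sketch; the sample ID sequences below are hypothetical placeholders, not actual bart-large output:

```python
def first_divergence(hf_ids, ft_ids):
    """Return the index of the first position where the two token-ID
    sequences differ; if one is a proper prefix of the other, return
    the shorter length; return None if they are identical."""
    for i, (a, b) in enumerate(zip(hf_ids, ft_ids)):
        if a != b:
            return i
    if len(hf_ids) != len(ft_ids):
        return min(len(hf_ids), len(ft_ids))
    return None

# Hypothetical token IDs for one translated sentence (not real model output)
hf = [0, 8241, 16, 41, 1246, 2]
ft = [0, 8241, 16, 99, 99, 2]
print(first_divergence(hf, ft))  # -> 3
```

If FT diverges at the very first generated token, the problem is more likely in weight conversion or encoder output; divergence late in the sequence points more toward accumulated numerical differences in decoding.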
Thank you for the report. We will take a look.
Seeing the same issue. @byshiue were you able to check?