FasterTransformer
FTBart not producing the same results as HuggingFace facebook/bart-large
Branch/Tag/Commit
main
Docker Image Version
nvcr.io/nvidia/pytorch:22.09-py3
GPU name
T4
CUDA Driver
Build cuda_11.8.r11.8/compiler.31833905_0
Reproduced Steps
https://github.com/NVIDIA/FasterTransformer/blob/main/docs/bart_guide.md
Follow the steps to set up the VM.
python ../examples/pytorch/bart/translate_example.py -time '01' -model 'facebook/bart-base'
^^ This yields identical results for FT vs HF.
However, when re-running with bart-large:
python ../examples/pytorch/bart/translate_example.py -time '01' -model 'facebook/bart-large'
The BLEU score drops significantly for FT:
2023-03-23 01:58:22,043 __main__ [INFO] hf-beamsearch translates 30 batches taking 6.54 sec to translate 689 tokens, BLEU score: 6.49, 105 tokens/sec. (593 words, 91 words/sec)
2023-03-23 01:58:22,043 __main__ [INFO] ft-beamsearch translates 30 batches taking 3.33 sec to translate 747 tokens, BLEU score: 3.32, 224 tokens/sec. (463 words, 139 words/sec)
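To help localize the issue, one quick check is to compare the token IDs generated by the two backends sentence by sentence and find the first position where they diverge (e.g. right away, or only deep into decoding). A minimal stdlib sketch; the sample ID sequences below are hypothetical placeholders, not actual bart-large output:

```python
def first_divergence(hf_ids, ft_ids):
    """Return the index of the first position where the two token-ID
    sequences differ; if one is a proper prefix of the other, return
    the shorter length; return None if they are identical."""
    for i, (a, b) in enumerate(zip(hf_ids, ft_ids)):
        if a != b:
            return i
    if len(hf_ids) != len(ft_ids):
        return min(len(hf_ids), len(ft_ids))
    return None

# Hypothetical token IDs for one translated sentence (not real model output)
hf = [0, 8241, 16, 41, 1246, 2]
ft = [0, 8241, 16, 99, 99, 2]
print(first_divergence(hf, ft))  # -> 3
```

If FT diverges at the very first generated token, the problem is more likely in weight conversion or encoder output; divergence late in the sequence points more toward accumulated numerical differences in decoding.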
Thank you for the report. We will take a look.
Seeing the same issue. @byshiue were you able to check?