Translation example with ctranslate2's Translator.
Now that HF model translation is supported via CrossFit, we are working on improving performance with ctranslate2. This work depends on adding ctranslate2 support to CrossFit; after that, we will need to create a pipeline for it in NDC. (Draft PR)
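For context, here is a minimal sketch of the kind of call this issue is about: translating a batch of sentences with a ctranslate2 `Translator` and a Hugging Face style tokenizer. The helper names `translate_sentences` and `load_translator` are hypothetical, and the sketch assumes the model has already been converted to the CTranslate2 format and does not need a target-language prefix (multilingual models usually do). It is not the code from the draft PR or the CrossFit workaround.

```python
from typing import Any, List, Sequence


def translate_sentences(
    translator: Any,
    tokenizer: Any,
    sentences: Sequence[str],
    max_batch_size: int = 32,
) -> List[str]:
    """Translate sentences with a ctranslate2.Translator-like object.

    `translator` must expose translate_batch(), as ctranslate2.Translator does.
    `tokenizer` must behave like a Hugging Face tokenizer.
    """
    # CTranslate2 consumes token strings rather than token ids.
    source_tokens = [
        tokenizer.convert_ids_to_tokens(tokenizer.encode(sentence))
        for sentence in sentences
    ]
    results = translator.translate_batch(
        source_tokens, max_batch_size=max_batch_size
    )
    translations = []
    for result in results:
        # hypotheses[0] is the best-scoring hypothesis, as a list of tokens.
        best_tokens = result.hypotheses[0]
        ids = tokenizer.convert_tokens_to_ids(best_tokens)
        translations.append(tokenizer.decode(ids, skip_special_tokens=True))
    return translations


def load_translator(model_dir: str, device: str = "cuda") -> Any:
    """Build a real translator; model_dir is a placeholder for a converted model."""
    import ctranslate2  # imported lazily so the helper above has no hard dependency

    return ctranslate2.Translator(model_dir, device=device)
```

With a converted model on disk, usage would look like `translate_sentences(load_translator(model_dir), tokenizer, sentences)`, where `tokenizer` comes from `transformers.AutoTokenizer.from_pretrained(...)`.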
With a workaround for ctranslate2 in CrossFit, we saw a large performance improvement. Results on a single GPU:
| Experiment | Standalone PyTorch inference | Standalone + ctranslate2 | CrossFit + ctranslate2 |
|---|---|---|---|
| Inference time | ~1 hr 50 min | 23 min 54 sec | 6 min 29 sec (including 3 sec of extra processing for the workaround) |
| BLEU score | - | 0.9585 | 0.9586 |
The BLEU score was calculated on 74058 sentences, using the standalone PyTorch inference output as the reference.
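The comparison above can be sketched as a corpus-level BLEU on a 0 to 1 scale, with the PyTorch translations as references and the ctranslate2 translations as hypotheses. The thread does not say which BLEU implementation or tokenization was used, so the whitespace tokenization, the 4-gram limit, and the function name `corpus_bleu` are assumptions made for illustration.

```python
import math
from collections import Counter
from typing import List, Sequence


def _ngrams(tokens: Sequence[str], n: int) -> Counter:
    """Count the n-grams of a token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses: List[str], references: List[str], max_n: int = 4) -> float:
    """Corpus-level BLEU in [0, 1] with one reference per hypothesis."""
    if len(hypotheses) != len(references):
        raise ValueError("hypotheses and references must have the same length")

    matches = [0] * max_n  # clipped n-gram matches, per order
    totals = [0] * max_n   # hypothesis n-gram counts, per order
    hyp_len = ref_len = 0

    for hyp, ref in zip(hypotheses, references):
        hyp_tokens, ref_tokens = hyp.split(), ref.split()  # assumed tokenization
        hyp_len += len(hyp_tokens)
        ref_len += len(ref_tokens)
        for n in range(1, max_n + 1):
            hyp_counts = _ngrams(hyp_tokens, n)
            ref_counts = _ngrams(ref_tokens, n)
            # Clip each hypothesis n-gram count by its count in the reference.
            matches[n - 1] += sum(
                min(count, ref_counts[gram]) for gram, count in hyp_counts.items()
            )
            totals[n - 1] += sum(hyp_counts.values())

    # No smoothing: any order with zero matches gives a score of zero.
    if hyp_len == 0 or min(totals) == 0 or min(matches) == 0:
        return 0.0

    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    brevity_penalty = 1.0 if hyp_len >= ref_len else math.exp(1 - ref_len / hyp_len)
    return brevity_penalty * math.exp(log_precision)
```

Under this definition, a score close to 1 means the ctranslate2 output is nearly token-for-token identical to the PyTorch output. It measures agreement between the two inference paths, not translation quality against human references.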
CC: @arhamm1 for awareness of this work.
Added an example notebook here
Moving to next sprint per Arham's approval.
Blocked by HF issue?
No longer blocked - Vibhu is making Crossfit changes, and once these are complete this can proceed. Pushing to December sprint.
@uahmed93 What is the latest on this?
Closed by https://github.com/NVIDIA-NeMo/Curator/pull/336.