feat: Add canary recipe support canary-1b and canary-1b flash with new prompt format

Open anand-nv opened this issue 8 months ago • 5 comments

Add support for NeMo's Conformer-encoder, Transformer-decoder models (canary-1b and canary-1b flash).

anand-nv avatar Mar 19 '25 03:03 anand-nv

The encoder is a FastConformer encoder. Initial attempts at implementing it with TRT-LLM layers performed worse than the ONNX -> TRT path.

anand-nv avatar Mar 19 '25 18:03 anand-nv

I'm working on optimizing TRT versions of Conformer models myself. If you can share your initial attempts, I can help and contribute my results back. I have a few accepted PRs here and in the ModelOPT repo.

MahmoudAshraf97 avatar Mar 19 '25 21:03 MahmoudAshraf97

> I'm working on optimizing TRT versions of Conformer models myself. If you can share your initial attempts, I can help and contribute my results back. I have a few accepted PRs here and in the ModelOPT repo.

Sure, but that isn't a blocker for merging this for now.

anand-nv avatar Mar 20 '25 08:03 anand-nv

@anand-nv Hi, TRT-LLM has now moved its development to GitHub. Can you rebase your MR onto the latest main branch and prepare a fresh MR?

Thanks, June

juney-nvidia avatar Mar 24 '25 05:03 juney-nvidia

@juney-nvidia The rebase is done.

anand-nv avatar Mar 24 '25 05:03 anand-nv

@juney-nvidia @Shixiaowei02 Can this be reviewed and merged?

anand-nv avatar Apr 03 '25 06:04 anand-nv

Hi @anand-nv, you've done a great job! Do you think this MR will be merged any time soon?

YaKalmar0 avatar Jul 08 '25 20:07 YaKalmar0