TensorRT-LLM feat: Add canary recipe support canary-1b and canary-1b flash with new prompt format

Open anand-nv opened this issue 8 months ago • 5 comments

Add support for NeMo's conformer encoder-transformer decoder models (canary-1b and canary-1b flash)

Mar 19 '25 03:03 anand-nv

The encoder is a FastConformer encoder. Initial attempts at implementing it with TRT-LLM layers gave poorer performance than the ONNX -> TensorRT path.

Mar 19 '25 18:03 anand-nv

I'm working on my own attempts at optimizing TRT versions of Conformer models. If you can share your initial attempts, I can help and contribute back my results. I have a few accepted PRs here and in the ModelOPT repo.

Mar 19 '25 21:03 MahmoudAshraf97

> I'm working on my own attempts at optimizing TRT versions of Conformer models. If you can share your initial attempts, I can help and contribute back my results. I have a few accepted PRs here and in the ModelOPT repo.

Sure, but this isn't a blocker for merging this for now.

Mar 20 '25 08:03 anand-nv

@anand-nv Hi, TRT-LLM has now moved its development to GitHub. Can you rebase your MR onto the latest main branch and prepare a fresh MR?

Thanks,
June

Mar 24 '25 05:03 juney-nvidia

@juney-nvidia - rebase done

Mar 24 '25 05:03 anand-nv

@juney-nvidia @Shixiaowei02 Can this be reviewed and merged?

Apr 03 '25 06:04 anand-nv

Hi @anand-nv, you've done a great job! Do you think this MR will be merged any time soon?

Jul 08 '25 20:07 YaKalmar0
