TensorRT-LLM

[DRAFT] Introducing multi-vocab token sampling for audio generation

Open vklimkov-nvidia opened this issue 7 months ago • 6 comments

Multi-token support

Introduce multi-token sampling with autoregressive transformers to support audio generation. This is a draft PR to trigger the pipelines for code-quality checks. Once the issues are fixed, the change is meant to go into https://github.com/rmittal-github/TensorRT-LLM/tree/release/0.19

The change was originally based on https://gitlab-master.nvidia.com/ftp/tekit/-/merge_requests/8319, but has been rebased to work with v0.19.
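
For readers unfamiliar with the idea, here is a rough sketch of what multi-vocab sampling could look like in one decoding step. It is not taken from this PR or from the TensorRT-LLM API: the function name `sample_multi_vocab`, the list-of-lists logits layout, and the assumption that each step yields one independent logit vector per vocabulary (for example, one per audio codebook) are all hypothetical and chosen only for illustration.

```python
import math
import random


def sample_multi_vocab(logits_per_vocab, temperature=1.0, rng=None):
    """Sample one token id per vocabulary for a single decoding step.

    Hypothetical helper, not part of TensorRT-LLM. `logits_per_vocab` is
    assumed to be a list with one logit vector per vocabulary; the
    vocabularies may have different sizes.
    """
    rng = rng or random.Random()
    tokens = []
    for logits in logits_per_vocab:
        # Temperature-scale the logits, then apply a numerically stable softmax.
        scaled = [x / temperature for x in logits]
        peak = max(scaled)
        weights = [math.exp(x - peak) for x in scaled]
        # Draw one token id from this vocabulary's distribution.
        token = rng.choices(range(len(weights)), weights=weights, k=1)[0]
        tokens.append(token)
    return tokens


# One step with two vocabularies of different sizes (made-up logits).
step_tokens = sample_multi_vocab([[0.1, 2.0, -1.0], [1.5, 0.3]], rng=random.Random(0))
```

In this sketch each decoding step produces a list of token ids, one per vocabulary, rather than a single id. How the actual change batches this work or feeds the sampled tokens back into the model is not described in this thread.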

May 02 '25 11:05 vklimkov-nvidia

/bot run

May 04 '25 03:05 juney-nvidia

PR_Github #4016 [ run ] triggered by Bot

May 04 '25 03:05 tensorrt-cicd

PR_Github #4016 [ run ] completed with state FAILURE /LLM/release-0.19/L0_MergeRequest_PR pipeline #119 completed with status: 'FAILURE'

May 04 '25 03:05 tensorrt-cicd

/bot run --disable-fail-fast

May 17 '25 00:05 JyChang012

PR_Github #5547 [ run ] triggered by Bot

May 17 '25 01:05 tensorrt-cicd

PR_Github #5547 [ run ] completed with state FAILURE /LLM/release-0.19/L0_MergeRequest_PR pipeline #126 completed with status: 'FAILURE'

May 17 '25 05:05 tensorrt-cicd

We do not accept any changes in the release branch. Please target main.

May 19 '25 07:05 MartinMarciniszyn

Closing since there have been no updates from the requester since https://github.com/NVIDIA/TensorRT-LLM/pull/4030#issuecomment-2889886525. Feel free to reopen!

Jun 05 '25 20:06 poweiw