
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently.

Results: 937 TensorRT-LLM issues, sorted by recently updated

…rch_streaming to cover multi-beam streaming cases required by the NIM team.

## Summary by CodeRabbit
* **Tests**
  * Added comprehensive test coverage for beam search functionality with streaming support in…

## Summary by CodeRabbit

## Release Notes
* **Documentation**
  * Updated draft model naming labels and references in documentation for consistency.

## Description

## Test Coverage

## PR Checklist
Please…

Community want to contribute

Fixes #9154
Fixes #8948

## Summary by CodeRabbit
* **New Features**
  * Added manual tensor parallelism sharding configuration option for auto-deployment workflows. Users now have granular control over how individual…

AutoDeploy

## Summary by CodeRabbit
* **Improvements**
  * Enhanced timeout messaging for KV cache transfer operations.
* **Tests**
  * Updated KV cache transfer backend configuration in test cases.
  * Re-enabled previously…

Based on PR https://github.com/NVIDIA/TensorRT-LLM/pull/9376 from @chang-l, with minor changes to support KV cache reuse.

@coderabbitai summary

## Description

## Test Coverage

## PR Checklist
Please review the following before submitting…

## Summary by CodeRabbit

## Release Notes
* **New Features**
  * Added CI-specific image tagging functionality for improved version management in continuous integration builds.
  * Introduced dedicated CI build pipeline…

### 🚀 The feature, motivation and pitch

Now that PT LlmArgs have mostly stabilized, let's see if we can more closely align AD LlmArgs with PT LlmArgs:
1. Deprecate `AutoDeployConfig`…

feature request
AutoDeploy