Lucas Liebenwein

Results 14 issues of Lucas Liebenwein

Scaling up the AutoDeploy dashboard to better track model coverage

AutoDeploy

### 🚀 The feature, motivation and pitch

Test 2-model (and later 1-model) spec dec with TP > 1. Maybe this test here can be extended: https://github.com/NVIDIA/TensorRT-LLM/pull/9275/files#r2557977057

### Alternatives

_No response_...

feature request
Speculative Decoding

### 🚀 The feature, motivation and pitch

Now that PT LlmArgs have mostly stabilized, let's see if we can more closely align AD LlmArgs with PT LlmArgs:

1. Deprecate `AutoDeployConfig`...

feature request
AutoDeploy