Fridah-nv
/bot run --disable-fail-fast --stage-list "DGX_H100-4_GPUs-PyTorch-[Post-Merge]"
> TODO: DeepseekV3 weights are in FP8. Need to handle this case to run e2e example with weights. I think we currently don't have example support for quantized model not...
I wonder if this change enables `deepseek-ai/DeepSeek-R1` to run as well?
> when we have a pattern matched node from a previous pattern matcher and then have another pattern matcher that uses that pattern matched node as input, there is an...
> def _interleaved_rope_pattern2(q, k, cos, sin, unsqueeze_dim=1):
>     b, h, s, d = q.shape
>     q = q.view(b, h, s, d // 2, 2).transpose(4, 3).reshape(b, h, s, d)
>     b, h, s, d...
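For anyone reading the quoted pattern: the `view(..., d // 2, 2).transpose(4, 3).reshape(...)` step on `q` appears to de-interleave the last dimension, moving even-indexed elements to the front half and odd-indexed elements to the back half. Below is a minimal pure-Python sketch of that reordering on a single row. The helper name `deinterleave` is hypothetical and not part of the PR, and the sketch covers only the quoted `q` line, since the rest of the snippet is truncated.

```python
def deinterleave(row):
    """Reorder one row the way view(d // 2, 2) -> transpose -> reshape(d) would.

    Hypothetical helper for illustration only; it is not code from the PR.
    """
    d = len(row)
    if d % 2 != 0:
        raise ValueError("last dimension must be even")
    # view(d // 2, 2): group neighbouring elements into pairs
    pairs = [row[i:i + 2] for i in range(0, d, 2)]
    # transpose: shape (d // 2, 2) becomes (2, d // 2)
    evens = [pair[0] for pair in pairs]
    odds = [pair[1] for pair in pairs]
    # reshape back to length d: first all evens, then all odds
    return evens + odds


print(deinterleave([0, 1, 2, 3, 4, 5, 6, 7]))  # [0, 2, 4, 6, 1, 3, 5, 7]
```

If that reading is right, the pattern converts an interleaved RoPE layout (pairs stored next to each other) into the half-split layout, where each pair is separated by `d // 2`.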
merged in https://github.com/nv-auto-deploy/TensorRT-LLM/pull/7
/bot run
/bot skip
/bot skip
/bot skip --comment "minor document update"