Apurba Bose
DLFW NGC container shows this error.
output mismatch in DLFW NGC container
Tensor parallel Llama3 tutorial illustrating use of torch.distributed and nccl ops
Hi, I am interested in the Waymo challenge, but I get this error when accessing the dataset: `does not have storage.objects.list access to the Google Cloud Storage bucket. Permission 'storage.objects.list'...
This PR: (1) adds rank-based logging for the distributed examples; (2) corrects the fallback-to-PyTorch case for NCCL converters; (3) together with #3830, provides utilities for running distributed...
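As an illustration of what rank-based logging can look like, here is a minimal sketch using only the Python standard library. It is not the code from this PR: the names `get_rank`, `RankFilter`, and `make_rank_logger` are hypothetical, and it assumes the rank is available in the `RANK` environment variable, as it is when a script is launched with `torchrun`.

```python
import logging
import os
import sys


def get_rank() -> int:
    """Read the process rank from RANK (assumed set by the launcher); default to 0."""
    return int(os.environ.get("RANK", "0"))


class RankFilter(logging.Filter):
    """Attach the rank to every log record so the formatter can print it."""

    def __init__(self, rank: int) -> None:
        super().__init__()
        self.rank = rank

    def filter(self, record: logging.LogRecord) -> bool:
        record.rank = self.rank
        return True  # never drop the record, only annotate it


def make_rank_logger(name: str = "distributed_example", stream=None) -> logging.Logger:
    """Build a logger whose messages are prefixed with the rank of this process."""
    rank = get_rank()
    logger = logging.getLogger(f"{name}.rank{rank}")
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid duplicate lines through the root logger
    if not logger.handlers:
        handler = logging.StreamHandler(stream if stream is not None else sys.stderr)
        handler.setFormatter(
            logging.Formatter("%(asctime)s [rank %(rank)s] %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
        logger.addFilter(RankFilter(rank))
    return logger
```

Each process calls `make_rank_logger()` once and then logs as usual, for example `make_rank_logger().info("engine built")`, so interleaved output from several ranks can be told apart.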
This PR handles empty tensors in Torch-TensorRT, based on https://docs.nvidia.com/deeplearning/tensorrt/latest/inference-library/advanced.html#empty-tensors, and also covers the concat operation edge case involving an empty tensor. TODO: Might have to separate the...
This PR covers: (1) a Llama3 end-to-end example with complex graph lowering; (2) removal of the hardcoded components of the rotary embedding example.
TRT-LLM installation tool for distributed runs: (1) only one GPU performs the download, to avoid unnecessary downloads; (2) use of lock files in the tool for the...
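The "one process downloads, the rest wait" idea can be sketched with a lock file as below. This is an assumption-laden illustration, not the tool's actual implementation: `download_once` and its parameters are hypothetical names, and the `download` callback stands in for whatever fetches the file.

```python
import os
import time
from pathlib import Path
from typing import Callable


def download_once(
    target: Path,
    download: Callable[[Path], None],
    timeout_s: float = 600.0,
    poll_s: float = 0.5,
) -> bool:
    """Run download() in one process only; other processes wait for `target`.

    Returns True if this process performed the download, False otherwise.
    """
    if target.exists():
        return False
    lock_path = target.with_name(target.name + ".lock")
    try:
        # O_CREAT | O_EXCL fails if the file already exists, so only one
        # process can create the lock file.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        # Another process holds the lock: poll until the file is published.
        deadline = time.monotonic() + timeout_s
        while not target.exists():
            if time.monotonic() > deadline:
                raise TimeoutError(f"timed out waiting for {target}")
            time.sleep(poll_s)
        return False
    try:
        os.close(fd)
        if target.exists():  # published between the first check and the lock
            return False
        tmp = target.with_name(target.name + ".tmp")
        download(tmp)
        os.replace(tmp, target)  # publish the finished file atomically
        return True
    finally:
        lock_path.unlink(missing_ok=True)
```

Writing to a temporary path and renaming it means waiting processes never see a half-written file. In this sketch a failed download removes the lock but leaves waiters polling until the timeout; a fuller tool would likely retry or signal the failure.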
Bug: FAILED conversion/test_scalar_tensor_aten.py::TestScalarTensorConverter::test_scalar_tensor_float_1 FAILED conversion/test_index_aten.py::TestIndexConverter::test_index_zero_two_dim_ITensor_mask TRT 10.13.3.9 PyTorch 2.10.0a0+b558c986e8 Error: ``` 2025-10-11T19:58:31.844970Z 01O ------------------------------ Captured log call ------------------------------- 2025-10-11T19:58:31.844990Z 01O WARNING torch_tensorrt [TensorRT Conversion Context]:logging.py:24 Environment variable NVIDIA_TF32_OVERRIDE=0 but BuilderFlag::kTF32...
Bug: FAILED models/test_models.py::test_resnet18_torch_exec_ops - AssertionError: ... TRT 10.13.3.9 PyTorch 2.10.0a0+b558c986e8 Error: ``` 2025-10-11T05:44:58.858100Z 01O =================================== FAILURES =================================== 2025-10-11T05:44:58.858110Z 01O _________________________ test_resnet18_torch_exec_ops _________________________ 2025-10-11T05:44:58.858130Z 01O 2025-10-11T05:44:58.858140Z 01O ir = 'dynamo' 2025-10-11T05:44:58.858150Z...