Dheeraj Peri issues

Results: 69 issues of Dheeraj Peri

🐛 [Bug] Flux perf scripts issue

## Bug Description ```py 1) Add a README.md with clear instructions on how to run this. Also add `pip install gradio` or any other deps required to the README. The...

bug

🐛 [Bug] test_scatter fails in CI

## Bug Description https://github.com/pytorch/TensorRT/actions/runs/13168805591/job/36755789477?pr=3382 ## To Reproduce Steps to reproduce the behavior: 1. 2. 3. ## Expected behavior ## Environment > Build information about Torch-TensorRT can be found by turning...

bug

↔ [Converter] Add support for torch.ops.aten.native_dropout.default in Torch-TensorRT

## [DEBUG | torch_tensorrt.dynamo.partitioning._global_partitioner]: Unsupported or Excluded Nodes: - torch.ops.aten.native_dropout.default + Operator Count: 2 - **Function Schema**: - **Original PyTorch API**: - **Relevant TensorRT Documentation**: ## Alternatives ## Additional context

feature request

component: converters

🐛 [Bug] Remove prepare_inputs and passing inputs during compilation.

## Bug Description Currently, we pass `trt_arg_inputs` and `trt_kwarg_inputs` to `compile_module` (https://github.com/pytorch/TensorRT/blob/main/py/torch_tensorrt/dynamo/_compiler.py#L682). These are not actually used. The prepare inputs call also fails sometimes during graph parsing for dry...

bug

feat: Add support for TRT IAttention API

# Description Implement SDPA converter using TRT MHA API Fixes # (issue) ## Type of change Please delete options that are not relevant and/or add your own. - Bug fix...

cla signed

chore: Add Groot example

# Description Add Groot N1.5-3B compilation example ## Type of change Please delete options that are not relevant and/or add your own. - Bug fix (non-breaking change which fixes an...

documentation

cla signed

🐛 [Bug] run_llm.py fails with offload_module_to_cpu=True

## Bug Description After KV caching, `exported_program.module()` fails with an "input not found" error. Likely something changed in the `exported_program.module()` API. The workaround is to set `offload_module_to_cpu=False`. ## To Reproduce Steps to reproduce...

bug

feat: Support storing CUDAGraphs for different input profiles

# Description Currently, CUDAGraphs get reset when different inputs are observed. Instead, store a CUDAGraph per input shape key. This is especially important in LLM inference (where prefill and...

component: api [Python]

component: runtime

cla signed

component: dynamo
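
The idea in the description above (keep one recorded graph per input shape key rather than resetting on every new shape) can be sketched in plain Python. This is only an illustration under stated assumptions, not the Torch-TensorRT implementation: `shape_key`, `PerShapeGraphCache`, and the `record` callback are hypothetical names, and the callback stands in for whatever actually captures a CUDA graph.

```python
from typing import Any, Callable, Dict, Sequence, Tuple

ShapeKey = Tuple[Tuple[int, ...], ...]


def shape_key(shapes: Sequence[Sequence[int]]) -> ShapeKey:
    """Build a hashable key from the shapes of all inputs (hypothetical helper)."""
    return tuple(tuple(int(d) for d in s) for s in shapes)


class PerShapeGraphCache:
    """Keeps one recorded graph per input shape key instead of resetting."""

    def __init__(self, record: Callable[[ShapeKey], Any]) -> None:
        self._record = record  # hypothetical stand-in for the real graph capture
        self._graphs: Dict[ShapeKey, Any] = {}
        self.captures = 0  # how many times a graph had to be recorded

    def get(self, shapes: Sequence[Sequence[int]]) -> Any:
        key = shape_key(shapes)
        if key not in self._graphs:
            # First time this shape combination is seen: record and store it.
            self._graphs[key] = self._record(key)
            self.captures += 1
        # Any later call with the same shapes reuses the stored graph.
        return self._graphs[key]


# Usage: two distinct shape profiles, the first one requested twice.
cache = PerShapeGraphCache(record=lambda key: {"recorded_for": key})
first = cache.get([(1, 128)])
second = cache.get([(1, 1)])
first_again = cache.get([(1, 128)])
```

With a cache like this, switching back to a previously seen shape does not trigger a new recording; only the first occurrence of each shape key does.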

chore: move external dep installation into a separate script

# Description Please include a summary of the change and which issue is fixed. Please also include relevant motivation and context. List any dependencies that are required for this change....

component: tests

cla signed