Tianlei Wu

Microsoft

Results: 214 comments of Tianlei Wu

Allow Memory Efficient Attention Kernel to run when local window size is set

This produces wrong results, and we should avoid that. How about changing memory efficient attention to support the local window [here](https://github.com/NVIDIA/cutlass/blob/56b46e2d13875b46b8f6a03f9f5ac91e2bfdc01a/examples/41_fused_multi_head_attention/fmha_grouped.h#L625-L642) by setting non-local elements to -inf? If change is...

Allow Memory Efficient Attention Kernel to run when local window size is set

PyTorch has implemented sliding window support in efficient attention. Please take a look: https://github.com/pytorch/pytorch/blob/20b62fed21f86374b01f7d5a557a83e4d3f2d130/aten/src/ATen/native/transformers/cuda/mem_eff_attention/kernel_forward.h#L152
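
The idea of setting non-local elements to -inf before the softmax can be illustrated with a small sketch. This is plain Python for readability, not the CUTLASS or PyTorch kernel code. The function names and the window convention (query `i` sees key `j` only when `0 <= i - j < window`, i.e. a causal local window) are assumptions made for illustration and may differ from what the actual kernels use.

```python
import math

NEG_INF = float("-inf")


def apply_local_window_mask(scores, window):
    """Return a copy of `scores` with out-of-window logits set to -inf.

    scores[i][j] is the raw attention logit of query i against key j.
    Assumed convention: key j is visible to query i when 0 <= i - j < window.
    """
    return [
        [s if 0 <= i - j < window else NEG_INF for j, s in enumerate(row)]
        for i, row in enumerate(scores)
    ]


def softmax(row):
    """Numerically stable softmax; -inf logits get exactly zero weight."""
    m = max(x for x in row if x != NEG_INF)
    exps = [0.0 if x == NEG_INF else math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


# Uniform logits make the effect of the mask easy to see.
scores = [[0.0] * 4 for _ in range(4)]
masked = apply_local_window_mask(scores, window=2)
probs = [softmax(row) for row in masked]
```

With `window=2`, each query spreads its weight over at most two keys (itself and the previous one), and every masked position contributes nothing to the output. The sketch assumes `window >= 1`, so each row keeps at least one finite logit.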

[Nuget] Add netstandard* to buildTransitive folders (microsoft#17010)

/azp run Windows ARM64 QNN CI Pipeline,Windows x64 QNN CI Pipeline,Windows CPU CI Pipeline,Windows GPU CI Pipeline,Windows GPU TensorRT CI Pipeline,ONNX Runtime Web CI Pipeline,Linux CPU CI Pipeline,Linux CPU Minimal...

[Nuget] Add netstandard* to buildTransitive folders (microsoft#17010)

/azp run Linux OpenVINO CI Pipeline,Linux QNN CI Pipeline,MacOS CI Pipeline,orttraining-amd-gpu-ci-pipeline,orttraining-linux-ci-pipeline,orttraining-linux-gpu-ci-pipeline,orttraining-ortmodule-distributed,onnxruntime-binary-size-checks-ci-pipeline,Big Models

[Nuget] Add netstandard* to buildTransitive folders (microsoft#17010)

/azp run Windows ARM64 QNN CI Pipeline,Windows x64 QNN CI Pipeline,Windows CPU CI Pipeline,Windows GPU CI Pipeline,Windows GPU TensorRT CI Pipeline,ONNX Runtime Web CI Pipeline,Linux CPU CI Pipeline,Linux CPU Minimal...

[Nuget] Add netstandard* to buildTransitive folders (microsoft#17010)

/azp run orttraining-amd-gpu-ci-pipeline

[Nuget] Add netstandard* to buildTransitive folders (microsoft#17010)

"python format" is not triggered. I will close and reopen to trigger it.

[Nuget] Add netstandard* to buildTransitive folders (microsoft#17010)

The Python format pipeline failed. Please run `lintrunner -a` to fix the formatting. To set up lintrunner locally, see https://github.com/microsoft/onnxruntime/blob/main/docs/Coding_Conventions_and_Standards.md#linting

[ROCm] fix: obtain AMD GPU memory info through rocm_smi library

/azp run Windows ARM64 QNN CI Pipeline,Windows x64 QNN CI Pipeline,Windows CPU CI Pipeline,Windows GPU CI Pipeline,Windows GPU TensorRT CI Pipeline,ONNX Runtime Web CI Pipeline,Linux CPU CI Pipeline,Linux CPU Minimal...

[ROCm] fix: obtain AMD GPU memory info through rocm_smi library

/azp run Linux OpenVINO CI Pipeline,Linux QNN CI Pipeline,MacOS CI Pipeline,orttraining-amd-gpu-ci-pipeline,orttraining-linux-ci-pipeline,orttraining-linux-gpu-ci-pipeline,orttraining-ortmodule-distributed,onnxruntime-binary-size-checks-ci-pipeline,Big Models,Linux Android Emulator QNN CI Pipeline
