onnxruntime
ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
## Problem Statement The VitisAI execution provider hangs indefinitely when attempting to compile transformer models (BERT) on AMD Ryzen AI hardware, requiring manual process termination. This affects developer experience and makes...
### Description Fix a critical typo. ### Motivation and Context This misalignment would cause the OGA PR to fail.
### Describe the issue A tiny ONNX model with a single ConvTranspose node produces different results between runs on CPU (WSL). The input is all zeros, weights/bias are constants, yet...
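Since the input is all zeros and the weights and bias are constants, every ConvTranspose output element reduces to just the channel bias, so any run-to-run difference has to come from the kernel, not the data. A minimal NumPy sanity check of that expectation (assuming NCHW layout; `conv_transpose_zero_input` is an illustrative helper, not an ONNX Runtime API):

```python
import numpy as np

# Reference check (NumPy only, not the ONNX Runtime kernel): each ConvTranspose
# output element is a weighted sum of input elements plus the channel bias.
# With an all-zero input every weighted sum vanishes, so the output must be the
# bias broadcast over the spatial dimensions -- identical on every run.
def conv_transpose_zero_input(out_shape, bias):
    """out_shape: (N, C_out, H, W) in NCHW layout; bias: shape (C_out,)."""
    n, c, h, w = out_shape
    return np.broadcast_to(bias.reshape(1, c, 1, 1), (n, c, h, w)).astype(np.float32)

bias = np.array([0.5, -1.0], dtype=np.float32)
expected = conv_transpose_zero_input((1, 2, 4, 4), bias)
again = conv_transpose_zero_input((1, 2, 4, 4), bias)
assert np.array_equal(expected, again)  # the reference is bit-identical across runs
```

Comparing the runtime's two outputs against this constant reference isolates the nondeterminism to the ConvTranspose kernel itself.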
### Description Add a start-profiling API in ORT. With this, we can profile over a chosen time span. Based on this, we have [another genAI PR](https://github.com/microsoft/onnxruntime-genai/pull/1898) to support start/end profiling in...
### Describe the issue ONNX Runtime 1.19.2 build fails during CMake configure when FetchContent attempts to download the Eigen library. The SHA1 hash of the downloaded archive from GitLab no...
I have an EC2 instance of type g5g.xlarge. I have installed the following:

```
CUDA-Toolkit: Cuda compilation tools, release 12.4, V12.4.131
CUDNN Version: 9.6.0
Python: 3.12
Pytorch: Compiled from source...
```
### Describe the issue Trying to set up AI libraries for training, running into an issue with pip install onnxruntime-gpu==1.21. Could not find a version that satisfies the requirement. Likely because its...
### Describe the issue I’m seeing **significantly higher CPU usage** when running model inference in the browser via ONNX Runtime compiled for WebAssembly. While adding threads *does* reduce latency (great!),...
Fixes #26741 This change updates the TypeScript definitions to allow constructing `float16` tensors using `Float16Array` in environments where it is available. Runtime behavior remains unchanged (`float16` is still represented internally...
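Either way the `float16` data is stored as raw `uint16` bits, since `Float16Array` and `Uint16Array` share the IEEE-754 binary16 layout over the same buffer. A small NumPy sketch of that bit mapping (illustrative only; `half_bits` is not onnxruntime code):

```python
import numpy as np

# IEEE-754 binary16 layout: 1 sign bit, 5 exponent bits (bias 15), 10 mantissa
# bits. Viewing float16 storage as uint16 exposes exactly the bits that a
# Float16Array and a Uint16Array would share over one ArrayBuffer.
def half_bits(x):
    return int(np.array([x], dtype=np.float16).view(np.uint16)[0])

assert half_bits(1.0) == 0x3C00   # sign 0, biased exponent 15, mantissa 0
assert half_bits(-2.0) == 0xC000  # sign 1, biased exponent 16, mantissa 0
```

This is why the typing change can be purely declarative: accepting `Float16Array` does not alter the bytes the runtime actually consumes.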
### Describe the issue When running a batch (batch=6, size=[6\*3\*1280\*1280]) FP32 inference on GPU (EP: CUDA Provider) with ONNX Runtime, the majority of the time is spent in...