Baiju Meswani comments

Results: 15 comments by Baiju Meswani

Introduce Training C++ Apis

Hi @mlupetti. Sorry for the late reply. The trainer requires four files to perform training: 1. The checkpoint file. 2. The training ONNX model. 3. The eval ONNX model (optional...

Introduce Training C++ Apis

The goal of this two-step process is to make the training solution easy to deploy in on-device training scenarios, where the expectation is that users can generate the files in...

Introduce Training C++ Apis

> FYI: #13215 perhaps, you could hold off on this?

OK, will wait for #13215 to merge and update this PR accordingly.

The speedup looks good. I didn't expect this much of a speedup for elementwise ops. Would something like auto-fusing elementwise ops handle this scenario as well, @pengwa? Also, does `QuickGelu` consistently...

Fix include order build failure training build

/azp run On-Device Training Tests

Fix include order build failure training build

/azp run Linux CPU CI Pipeline, Linux CPU Minimal Build E2E CI Pipeline, Linux GPU CI Pipeline, Linux GPU TensorRT CI Pipeline, Linux Nuphar CI Pipeline, Linux OpenVINO CI Pipeline,...

Fix include order build failure training build

/azp run Windows GPU TensorRT CI Pipeline, onnxruntime-binary-size-checks-ci-pipeline, onnxruntime-python-checks-ci-pipeline, orttraining-linux-ci-pipeline, orttraining-linux-gpu-ci-pipeline, orttraining-ortmodule-distributed

torch-ort cannot be installed on windows: onnxruntime-training not found

Yes, as of now, `onnxruntime-training` (`ORTModule`) is not available on Windows. We do plan to change this in the near future. But I don't have a timeline to share with...

[On-device Training] Yolov4 custom loss

Hi there. I provided some suggestions here: https://github.com/microsoft/onnxruntime/issues/19464. The idea is that, if the loss is difficult to express in onnxblock, you could try to create an ONNX model from PyTorch...

[On-device Training] Yolov4 custom loss

After running shape inference on your model, I see the Concat node like so: ![image](https://github.com/microsoft/onnxruntime/assets/12852605/30dbb45d-0986-4d4d-b568-64c181c7c2dc) The Concat node is trying to concatenate tensors of different types (int64...