Baiju Meswani
^ Seems like a bug. I will add a pull-request to address this issue.
I addressed the issue you highlighted in this pull request: https://github.com/microsoft/onnxruntime/pull/20016. However, another problem remains: the model has a ReduceMax node. ORT training does not have a gradient...
@hbwx24 Thanks for trying out QAT. QAT with ONNX Runtime is in an experimental stage at this time. Looking through my own [TODOs in the repository](https://github.com/microsoft/onnxruntime/blob/61610ff9862ad834f153ed3e70ba526dac86ae7c/orttraining/orttraining/training_ops/cpu/quantization/fake_quant.cc#L82), it seems like per channel...
@titaiwangms Is this problem resolved? I don't think the mismatch is expected when enable_training is turned on. Are you still seeing it?
Could you share your model code with us, if possible?