Bugfix for SimplifiedLayerNormalization

Open centwang opened this issue 2 years ago • 0 comments

This PR fixes https://github.com/microsoft/onnxruntime/issues/12930 and https://github.com/microsoft/onnxruntime/issues/12579.

In detail:

  • For the CPU EP, the current implementation of SimplifiedLayerNormalization doesn't support the input and scale having different data types. If the sub-graph contains a Cast op, it is therefore not fused, which guarantees that the inputs and the output of the fused node all have the same data type.
  • For the CUDA EP, add (fp16, float) support to the (T, V) type constraints, so that all combinations of fp16 and float are supported by the implementation.
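
The two rules above can be sketched in a few lines of Python. This is only an illustration of the decision logic as described in this PR, not the actual onnxruntime code: `should_fuse`, `cuda_type_combinations`, the provider strings, and the example op list are all hypothetical names chosen for the sketch.

```python
from itertools import product

# Hypothetical constants for illustration; these are not onnxruntime identifiers.
CUDA_TYPES = ("float16", "float")


def cuda_type_combinations():
    """Return every (T, V) pair that can be built from fp16 and float."""
    return list(product(CUDA_TYPES, repeat=2))


def should_fuse(execution_provider, subgraph_op_types):
    """Decide whether a matched sub-graph may be fused into
    SimplifiedLayerNormalization, following the rules described above."""
    has_cast = "Cast" in subgraph_op_types
    if execution_provider == "CPU":
        # The CPU kernel needs input and scale to share one data type,
        # so a Cast inside the pattern blocks the fusion.
        return not has_cast
    if execution_provider == "CUDA":
        # The CUDA kernel accepts all fp16/float combinations,
        # so a Cast inside the pattern does not block the fusion.
        return True
    # Other providers are outside the scope of this sketch.
    raise ValueError(f"unsupported provider in this sketch: {execution_provider}")


# Assumed op sequence for the matched pattern, used only as example input.
pattern = ["Pow", "ReduceMean", "Add", "Sqrt", "Div", "Mul"]
print(should_fuse("CPU", pattern))             # no Cast: fused
print(should_fuse("CPU", pattern + ["Cast"]))  # Cast present: not fused
print(should_fuse("CUDA", pattern + ["Cast"]))
print(cuda_type_combinations())
```

Blocking the fusion on the CPU side keeps the existing kernel unchanged, at the cost of leaving mixed-type sub-graphs unfused there.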

With the fix, the original model can run with SimplifiedLayerNormalization, which also improves performance.

centwang, Sep 15 '22 08:09