
CPU EP implementation of LayerNormalization as Contrib Operator reads values outside memory bound

Open · sumitsays opened this issue · 1 comment

Describe the issue

The CPU EP implementation of the LayerNormalization contrib operator does not broadcast the Scale and Bias tensors, so it ends up reading values outside its memory bounds. Unfortunately, there is no unit test in layer_norm_op_test.cc that requires the Scale and Bias tensors to be broadcast.

To reproduce

Run the test case below for LayerNormalization.

```cpp
TEST(LayerNormTest, LayerNorm_Scale_Bias) {
  OpTester test("LayerNormalization");
  test.AddAttribute<float>("epsilon", 1e-05f);
  std::vector<int64_t> dims{1, 3, 2};
  test.AddAttribute<int64_t>("axis", 1);
  test.AddInput<float>("x", dims, {1.2416f, 0.946123f, 13.1685f, 0.36423f, 21.145f, 0.03941f});
  test.AddInput<float>("gamma", {2}, {-0.6953f, 5.1824f});
  test.AddInput<float>("bias", {2}, {0.6435f, -0.3964f});
  // Expected output:
  test.AddOutput<float>("output", dims, {0.420106f, -3.31971f, -0.600539f, -3.69086f, -1.28313f, -3.89804f});
  // Actual output:
  // test.AddOutput<float>("output", dims, {0.420106f, -3.31971f, -0.600539f, 1.42324e+18f, -3.68791e+18f, -0f});
  test.Run();
}
```

Note:

  • axis is 1, which means the gamma (Scale) tensor needs to be broadcast.
  • The last 3 values in the actual output are garbage.

Command: onnxruntime_test_all.exe --gtest_filter=LayerNormTest.LayerNorm_Scale_Bias
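For reference, the broadcast behavior the repro expects can be sketched in NumPy (this is an illustrative reference computation, not the ORT kernel; `layer_norm` is a made-up helper name):

```python
import numpy as np

def layer_norm(x, gamma, beta, axis, eps=1e-5):
    """Reference LayerNormalization that explicitly broadcasts the
    scale/bias tensors over the normalized dimensions."""
    norm_shape = x.shape[axis:]  # dims being normalized over
    reduce_axes = tuple(range(axis, x.ndim))
    mean = x.mean(axis=reduce_axes, keepdims=True)
    var = x.var(axis=reduce_axes, keepdims=True)
    xhat = (x - mean) / np.sqrt(var + eps)
    # Broadcasting gamma/beta (shape [2]) up to the normalized shape
    # [3, 2] is the step the CPU kernel skips, causing the overrun.
    g = np.broadcast_to(gamma, norm_shape)
    b = np.broadcast_to(beta, norm_shape)
    return xhat * g + b

x = np.array([1.2416, 0.946123, 13.1685, 0.36423, 21.145, 0.03941]).reshape(1, 3, 2)
gamma = np.array([-0.6953, 5.1824])
beta = np.array([0.6435, -0.3964])
out = layer_norm(x, gamma, beta, axis=1)
```

With the broadcast in place, every output element is finite; there are no garbage values because no read ever goes past the end of the 2-element gamma/bias buffers.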

Urgency

No response

Platform

Windows

OS Version

Windows 11 | 10.0.22000 Build 22000

ONNX Runtime Installation

Built from Source

ONNX Runtime Version or Commit ID

1e34440c370d40085fad5fd1b1002b4a04c5991d

ONNX Runtime API

C++

Architecture

X64

Execution Provider

Default CPU

Execution Provider Library Version

No response

sumitsays · Sep 09 '22 21:09

I will add a check that the bias/scale are a valid size in a separate PR so there's no overrun, but what's the real-world use case where bias and scale would require broadcasting?
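A size check along those lines could look like the following sketch (hypothetical pseudocode in Python, not the actual PR; `validate_scale_bias` and its error messages are made up for illustration):

```python
import math

def validate_scale_bias(x_shape, axis, scale_shape, bias_shape=None):
    """Reject scale/bias tensors whose element count does not match the
    normalized element count, so the kernel can never read past the end
    of their buffers."""
    norm_count = math.prod(x_shape[axis:])
    if math.prod(scale_shape) != norm_count:
        raise ValueError(
            f"scale has {math.prod(scale_shape)} elements, expected {norm_count}")
    if bias_shape is not None and math.prod(bias_shape) != norm_count:
        raise ValueError(
            f"bias has {math.prod(bias_shape)} elements, expected {norm_count}")
```

Under this check, the repro above is rejected up front: with x of shape (1, 3, 2) and axis=1, the normalized element count is 6, but gamma and bias each supply only 2 elements.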

In the PyTorch spec I believe those are learned values that do not require broadcasting.

https://pytorch.org/docs/stable/generated/torch.nn.LayerNorm.html

~LayerNorm.weight – the learnable weights of the module of shape normalized_shape when elementwise_affine is set to True. The values are initialized to 1.

~LayerNorm.bias – the learnable bias of the module of shape normalized_shape when elementwise_affine is set to True. The values are initialized to 0.
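In other words, PyTorch allocates weight and bias with the full normalized_shape, so the affine parameters never need broadcasting. A minimal NumPy stand-in for that initialization (illustrative only, not PyTorch itself):

```python
import numpy as np

# normalized_shape matching dims [axis:] of the repro input above
normalized_shape = (3, 2)
# elementwise_affine=True initialization per the PyTorch docs:
weight = np.ones(normalized_shape)   # LayerNorm.weight, initialized to 1
bias = np.zeros(normalized_shape)    # LayerNorm.bias, initialized to 0
```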

skottmckay · Sep 22 '22 23:09