
Fix onnx Attention and torch SDPA quantization handling

Open ruro opened this issue 1 month ago • 2 comments

Changes

  • added ONNXAttentionMetatype for the opset 23 ONNX Attention node
  • fixed scaled_dot_product_attention quantization in torch2 for the case where Q, K, and V are parallel edges coming from the same input node
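
For context, the second bullet covers graphs where a single producer node feeds all three attention inputs, as in the `unbind_scaled_dot_product_attention_model` test. Here is a minimal NumPy sketch of that pattern (the function and variable names are illustrative only, not NNCF or test APIs):

```python
import numpy as np

def scaled_dot_product_attention(q, k, v):
    # Plain softmax(Q K^T / sqrt(d)) V, mirroring the semantics of
    # torch.nn.functional.scaled_dot_product_attention (no mask, no dropout).
    d = q.shape[-1]
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

# Q, K and V arrive as parallel edges from the same producer node:
# one (3, seq, dim) tensor is split along dim 0, analogous to
# torch.unbind(qkv, dim=0) in the torch2 test model.
qkv = np.random.default_rng(0).standard_normal((3, 4, 8))
q, k, v = qkv
out = scaled_dot_product_attention(q[None], k[None], v[None])
assert out.shape == (1, 4, 8)
```

Because all three edges trace back to one node, a quantizer-placement pass that keys on the input node alone can end up handling Q, K, and V inconsistently, which is the situation the fix addresses.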

Reason for changes

See #3750

Related tickets

Fixes #3750

Tests

  • tests/onnx/quantization/test_graphs.py::test_synthetic_models_graph[AttentionModel] attention_model dot

  • tests/torch2/function_hook/quantization/test_quantized_graphs.py::test_quantized_graphs[unbind_scaled_dot_product_attention_model] unbind_scaled_dot_product_attention_model dot

ruro avatar Nov 21 '25 08:11 ruro

Hmm, IIRC ONNX added support for opset 23 in version 1.18.0, so the new test is currently failing in CI due to the pinned requirements:

onnx==1.17.0; python_version < '3.13'
onnx==1.18.0; python_version >= '3.13'

Do you have a preference here? Should I mark this test as

@pytest.mark.skipif(
    version.parse(onnx.__version__) < version.parse("1.18.0"),
    reason="Opset 23 was added in onnx 1.18.0",
)

or bump the pinned version, or do something else?
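
The skipif condition above boils down to a simple version comparison. As a standalone sketch (using packaging, which pytest already depends on; the helper name is illustrative):

```python
from packaging import version

def requires_skip(onnx_version: str, minimum: str = "1.18.0") -> bool:
    # True when the installed onnx predates opset 23 (added in 1.18.0),
    # i.e. when the Attention test should be skipped.
    return version.parse(onnx_version) < version.parse(minimum)

assert requires_skip("1.17.0")      # the pin for python < 3.13: test skipped
assert not requires_skip("1.18.0")  # the pin for python >= 3.13: test runs
```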

ruro avatar Nov 21 '25 10:11 ruro

Hi @ruro, thanks for your contribution. We currently support multiple versions of ONNX, and the Attention operator was added in opset 23, which corresponds to ONNX 1.18.0. I believe we should run this test only for ONNX versions >= 1.18.0.

andrey-churkin avatar Nov 25 '25 09:11 andrey-churkin