
[Backend][Relax] Add Intel GNA backend for NPU support

Open Aristide021 opened this issue 4 months ago • 5 comments

This adds an Intel GNA (Gaussian & Neural Accelerator) backend for TVM Relax, designed as a foundation for Intel NPU support. While GNA hardware shipped in several recent generations of Intel Core processors, this backend serves as a stepping stone toward Intel's current NPU path with OpenVINO runtime integration.

Features:

  • Pattern-based graph partitioning for GNA/NPU-compatible operations
  • JSON serialization approach enabling seamless NPU migration
  • Software emulation mode for testing without dedicated hardware
  • Support for dense/linear, 1D convolution, and ReLU operations
  • Automatic shape and dtype extraction for optimization
  • Comprehensive test coverage with CI integration

Supported operations:

  • Dense/Linear layers (relax.matmul)
  • 1D Convolution (relax.nn.conv1d)
  • ReLU activation (relax.nn.relu)

This implementation provides a clean, minimal pattern for backend development while preparing the foundation for Intel's recommended NPU acceleration path through TVM's compilation pipeline.
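The JSON hand-off mentioned above could look something like the sketch below. The schema and helper names here are purely illustrative assumptions, not the PR's actual serialization format:

```python
import json

# Hypothetical serializer for a partitioned subgraph: each node records the
# op name plus the shape/dtype metadata extracted during partitioning.
# The schema is illustrative only, not the backend's real JSON format.
def serialize_subgraph(nodes):
    return json.dumps({"version": 1, "nodes": nodes})

# A tiny dense -> relu subgraph, the kind of pattern the backend offloads
subgraph = [
    {"op": "relax.matmul", "shape": [1, 128], "dtype": "float32"},
    {"op": "relax.nn.relu", "shape": [1, 128], "dtype": "float32"},
]

blob = serialize_subgraph(subgraph)
decoded = json.loads(blob)
print(decoded["nodes"][0]["op"])  # relax.matmul
```

Because the runtime consumes an opaque JSON blob rather than backend-specific IR, swapping the GNA execution path for an NPU one should only require a new interpreter of the same blob.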

Aristide021 avatar Aug 09 '25 20:08 Aristide021

@Aristide021 Thanks for the PR! A couple of points and questions:

  1. Status of GNA vs NPU
    • The upstream GNA repo is archived and marked as not under active management. The OpenVINO docs also note that GNA is being discontinued and recommend using Intel's NPU as the low-power offload path on newer processors. Given that, would it make sense to position this backend as a stepping stone toward NPU (and/or clarify the long-term maintenance plan in the README/code comments)?
    • https://github.com/intel/gna
    • https://docs.openvino.ai/2023.3/openvino_docs_OV_UG_supported_plugins_GNA.html
  2. CI & Software Emulation Mode
    • According to the OpenVINO docs, the GNA plugin supports a Software Emulation Mode (CPU fallback) when GNA hardware isn't present. If we enable that in tests, we could run E2E coverage in our CI.
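A CPU-fallback path of the kind that emulation mode enables could be sketched as follows. Every name here is hypothetical; the real mode selection lives in the OpenVINO plugin, and this only models the idea of dispatching to reference implementations when no accelerator is found:

```python
# Hypothetical software-emulation fallback: when no GNA/NPU device is
# detected (as in CI), run each supported op with a plain-Python
# reference implementation instead of the hardware path.
def gna_device_available():
    return False  # pretend no accelerator is present, as in a CI runner

def relu(xs):
    return [max(0.0, x) for x in xs]

def dense(xs, weights):
    # weights is a list of rows; computes xs @ weights^T
    return [sum(x * w for x, w in zip(xs, row)) for row in weights]

def run_op(op, *args):
    if gna_device_available():
        raise NotImplementedError("hardware path not modeled in this sketch")
    # CPU reference implementations stand in for the accelerator
    table = {"relu": relu, "dense": dense}
    return table[op](*args)

print(run_op("relu", [-1.0, 2.0, -3.0]))  # [0.0, 2.0, 0.0]
```

With a fallback like this wired into the test harness, the E2E tests can assert numerical correctness on any CI machine and only exercise the hardware path when a device is actually present.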

I also think this backend can serve as a very good example for codegen in Relax. It shows a clean and minimal pattern: partitioning with basic ops, handing off to JSON, and keeping the implementation relatively lightweight. Adding a short HOWTO or developer note ("Writing a minimal Relax backend") that references this code could be very helpful for the community.

cc @tqchen @Hzfengsy @cbalint13

mshr-h avatar Aug 21 '25 13:08 mshr-h

Thanks for the review and the excellent points! You're correct about GNA being archived. I designed this backend as a stepping stone toward NPU support with OpenVINO runtime integration in mind. The JSON serialization approach should make the transition to Intel's current NPU path relatively straightforward.

For the CI integration with Software Emulation Mode, I think that's a great suggestion. I can add CPU fallback support to enable E2E testing without requiring actual GNA hardware.

I'd also be happy to add documentation positioning this as a foundation for NPU backends, and to include a developer guide if that would be helpful for the community.

I'll go ahead and update the PR description to clarify the NPU migration path. My next step will be to add CPU emulation support for testing. Please let me know if you have any other suggestions.

Aristide021 avatar Aug 22 '25 18:08 Aristide021

Thanks for the contribution. Given that GNA is archived, it perhaps does not make sense to maintain it in the main tree, and adding CI would also add extra overhead here. However, I agree that having generic tutorials for BYOC NPU would be useful; if we can have something that supports a current NPU, that would be great.

tqchen avatar Aug 24 '25 16:08 tqchen

I'd be happy to refactor this into a generic NPU tutorial targeting Intel's current NPU plugin. Should this live in the tutorials section or as a contrib module? I can adapt the JSON architecture for educational purposes.

Aristide021 avatar Aug 24 '25 17:08 Aristide021

I think starting as contrib is fine, and we can have a tutorial explanation pointing to the code.

tqchen avatar Aug 24 '25 19:08 tqchen