tvm
[Hexagon] Support template-free meta schedule tuning
Building on https://github.com/apache/tvm/pull/12845, this PR adds initial support for template-free auto tuning on Hexagon.
Test cases demonstrate:
- Auto-scheduler style, template-free tuning for fp16 conv2d in NHWC layout.
- `vrmpy` auto tensorization for TE int8 `dense` (weight pre-packed), achieving 440 GOPs on SD888.
Known issues:
- Due to the issue explained in https://github.com/apache/tvm/pull/12706, `link-params = True`, which Hexagon requires, causes identical workloads to be tuned as distinct tasks. So e2e tuning is very slow without the changes from 12706.
- Tuning `nn.dense` essentially requires the meta schedule `RewriteLayout` postproc: I found that the memory access pattern of `nn.dense`, `C[i, j] += A[i, k] * B[j, k]`, where the `j` axis is vectorized, performs terribly on Hexagon. But the implementation of `RewriteLayout` is completely incompatible with `link-params = True`. Until we fix this, we cannot enable `RewriteLayout` for Hexagon, and hence tuning `nn.dense` (and `nn.batch_matmul`) is not supported for now.
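To illustrate the access-pattern problem, here is a minimal plain-Python sketch (not TVM code, and not taken from this PR). It compares the `nn.dense` loop nest with a version that reads a pre-packed weight. The packed shape `[N // 32, K // 4, 32, 4]` and the helper names `dense_reference`, `pack_weight`, and `dense_packed` are assumptions chosen for illustration. With `B` stored row-major as `[N][K]`, consecutive `j` values for a fixed `k` are `K` elements apart, so vectorizing `j` gives strided loads. After packing, the innermost 32 x 4 block is contiguous.

```python
# Illustrative sketch only: plain Python, not TVM code.
# The packed layout [N // NB, K // KB, NB, KB] with NB=32, KB=4 is an
# assumption made for this example.

def dense_reference(A, B, M, N, K):
    """C[i][j] += A[i][k] * B[j][k], with B stored row-major as [N][K]."""
    C = [[0] * N for _ in range(M)]
    for i in range(M):
        for j in range(N):  # vectorizing j means strided reads of B
            for k in range(K):
                C[i][j] += A[i][k] * B[j][k]
    return C

def pack_weight(B, N, K, NB=32, KB=4):
    """Repack B[N][K] into P[N // NB][K // KB][NB][KB]."""
    return [
        [
            [[B[jo * NB + ji][ko * KB + ki] for ki in range(KB)]
             for ji in range(NB)]
            for ko in range(K // KB)
        ]
        for jo in range(N // NB)
    ]

def dense_packed(A, P, M, N, K, NB=32, KB=4):
    """Same result as dense_reference, reading the packed weight P."""
    C = [[0] * N for _ in range(M)]
    for i in range(M):
        for jo in range(N // NB):
            for ko in range(K // KB):
                for ji in range(NB):      # NB output lanes, contiguous in P
                    for ki in range(KB):  # KB-element reduction per lane
                        C[i][jo * NB + ji] += (
                            A[i][ko * KB + ki] * P[jo][ko][ji][ki]
                        )
    return C

# Small integer inputs so the two results can be compared exactly.
M, N, K = 2, 64, 8
A = [[(i + k) % 7 for k in range(K)] for i in range(M)]
B = [[(j * 3 + k) % 5 for k in range(K)] for j in range(N)]

assert dense_packed(A, pack_weight(B, N, K), M, N, K) == dense_reference(A, B, M, N, K)
```

Both loop nests compute the same result; only the order in which the weight is read changes.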
cc @kparzysz-quic @junrushao @tmoreau89
@tvm-bot rerun
Thanks @masahi @kparzysz-quic, the PR has been merged!