tvm [Hexagon] Support template-free meta schedule tuning

Open masahi opened this issue 3 years ago • 1 comments

Building on https://github.com/apache/tvm/pull/12845, this PR adds initial support for template-free auto-tuning on Hexagon.

Test cases demonstrate:

Auto-scheduler-style, template-free tuning for fp16 conv2d in NHWC layout.
vrmpy auto-tensorization for TE int8 dense (weight pre-packed), achieving 440 GOPs on SD888.

Known issues:

Due to the issue explained in https://github.com/apache/tvm/pull/12706, link-params = True, which Hexagon requires, causes identical workloads to be tuned as distinct tasks. So e2e tuning is very slow without the changes from #12706.
Tuning nn.dense effectively requires the MetaSchedule RewriteLayout postproc: I found that the memory access pattern of nn.dense, C[i, j] += A[i, k] * B[j, k], with the j axis vectorized, performs terribly on Hexagon. However, the current implementation of RewriteLayout is completely incompatible with link-params = True. Until this is fixed, RewriteLayout cannot be enabled for Hexagon, so tuning nn.dense (and nn.batch_matmul) is not supported for now.
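
To make the nn.dense access-pattern problem concrete, here is a minimal plain-Python sketch (not TVM code and not the RewriteLayout implementation). It compares the original loop nest, where the weight is stored as B[j, k], with a pre-packed weight of shape [N // 32, K // 4, 32, 4]. The packing factors 32 and 4, the sizes M, N, K, and all function names are assumptions chosen for illustration; they are meant to resemble the kind of packing a vrmpy-style int8 dot product would want, with 32 output lanes each consuming 4 consecutive int8 values.

```python
import random

# Hypothetical sizes and packing factors, chosen only for illustration.
M, N, K = 2, 64, 8
J_BLK, K_BLK = 32, 4  # assumed: 32 output lanes x 4 int8 values per lane


def dense_row_major(A, B):
    """Reference loop nest: C[i][j] += A[i][k] * B[j][k], with B stored as N x K."""
    C = [[0] * N for _ in range(M)]
    for i in range(M):
        for j in range(N):
            for k in range(K):
                C[i][j] += A[i][k] * B[j][k]
    return C


def pack_weight(B):
    """Rearrange B (N x K) into P[N // J_BLK][K // K_BLK][J_BLK][K_BLK]."""
    return [
        [
            [
                [B[jo * J_BLK + ji][ko * K_BLK + ki] for ki in range(K_BLK)]
                for ji in range(J_BLK)
            ]
            for ko in range(K // K_BLK)
        ]
        for jo in range(N // J_BLK)
    ]


def dense_packed(A, P):
    """Same computation, reading the weight from the packed layout."""
    C = [[0] * N for _ in range(M)]
    for i in range(M):
        for jo in range(N // J_BLK):
            for ko in range(K // K_BLK):
                # The two inner loops touch one contiguous J_BLK x K_BLK block.
                for ji in range(J_BLK):
                    for ki in range(K_BLK):
                        C[i][jo * J_BLK + ji] += (
                            A[i][ko * K_BLK + ki] * P[jo][ko][ji][ki]
                        )
    return C


def weight_stride_between_lanes(packed):
    """Element distance between weights read by adjacent j lanes at fixed k."""
    return K_BLK if packed else K


random.seed(0)
A = [[random.randint(-8, 7) for _ in range(K)] for _ in range(M)]
B = [[random.randint(-8, 7) for _ in range(K)] for _ in range(N)]

# Both layouts compute the same result; only the weight access order differs.
assert dense_packed(A, pack_weight(B)) == dense_row_major(A, B)
```

In the row-major layout, vectorizing over j means adjacent lanes read weights K elements apart, which is a strided access. In the packed layout, each J_BLK x K_BLK block is stored contiguously, so the lanes of one vector operation read from a single dense block. With int8 data and the assumed factors, that block is 32 * 4 = 128 bytes.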

cc @kparzysz-quic @junrushao @tmoreau89

Sep 21 '22 07:09 masahi

@tvm-bot rerun

Sep 21 '22 11:09 masahi

Thanks @masahi @kparzysz-quic, the PR has been merged!

Oct 03 '22 13:10 tmoreau89