tvm
[SME] Add scalable fp16->fp32 dense schedule
This commit extends the SME dense and matmul schedules to support fp16 inputs with an fp32 output, for the case where transpose_a=False and transpose_b=True.
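For reviewers who want the expected numerics spelled out, the following is a minimal NumPy reference for the operation described above. It is a sketch only and not the schedule itself. The function name `dense_fp16_to_fp32` is hypothetical, and it assumes the weight is laid out as (N, K), which is how transpose_b=True is interpreted here.

```python
import numpy as np


def dense_fp16_to_fp32(data, weight):
    """Reference dense: data (M, K) fp16, weight (N, K) fp16 -> out (M, N) fp32.

    Hypothetical helper for illustration; it is not part of TVM.
    """
    assert data.dtype == np.float16 and weight.dtype == np.float16
    # Upcast before multiplying so that products are accumulated in fp32.
    # weight is stored as (N, K), i.e. already transposed (transpose_b=True).
    return data.astype(np.float32) @ weight.astype(np.float32).T


data = np.ones((2, 4), dtype=np.float16)
weight = np.ones((3, 4), dtype=np.float16)
out = dense_fp16_to_fp32(data, weight)
```

A reference like this could be used to check the output of the scheduled operator within a tolerance.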
For convenience, it also adds a utility, get_vscale_factor, which returns the correct multiplier for vscale for a given data type, reflecting ideas from an early design of the SVE RFC.
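To illustrate the idea behind such a multiplier, here is a plain-Python sketch. It is not the implementation of get_vscale_factor. The names `DTYPE_BITS`, `vscale_multiplier` and `scalable_lanes` are hypothetical, and it assumes the multiplier is the number of elements of the given type that fit in a 128-bit granule, the unit in which SVE and SME vector lengths are defined.

```python
# Bit widths for a few dtype strings (hypothetical table for this sketch).
DTYPE_BITS = {"int8": 8, "float16": 16, "float32": 32, "float64": 64}


def vscale_multiplier(dtype, min_vector_bits=128):
    """Number of elements of `dtype` in one minimum-size (128-bit) vector."""
    return min_vector_bits // DTYPE_BITS[dtype]


def scalable_lanes(dtype, vscale):
    """Total lanes for a concrete vscale value: multiplier * vscale."""
    return vscale_multiplier(dtype) * vscale


fp32_factor = vscale_multiplier("float32")  # elements per 128 bits for fp32
fp16_factor = vscale_multiplier("float16")  # elements per 128 bits for fp16
```

Under these assumptions, an fp32 vector holds 4 * vscale elements and an fp16 vector holds 8 * vscale.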
~Note: this commit depends on https://github.com/apache/tvm/pull/16921 so also contains the contents of https://github.com/apache/tvm/pull/16921.~
cc @ekalda @Anndrey24 @leandron
@tvm-bot rerun
Thanks @lhutton1 this is merged now!