tvm
[DLight] Perf improvement for low_batch_gemv on Metal
This PR improves the performance of low_batch_gemv on Metal by changing the schedule config. The speedup is around 2x when the bucket is larger than 2.
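For readers unfamiliar with the workload, here is a minimal pure-Python sketch of what a low-batch GEMV computes. It is not TVM code and not the schedule from this PR. It assumes the input has a small number of rows (the batch) and that the weight is stored as N rows of length K; the function name `low_batch_gemv_reference` and the sample values are hypothetical and used only for illustration.

```python
# Reference semantics only: an illustrative sketch, not the DLight schedule.
# Assumption: x is a small batch of rows (batch x K), w is stored as N x K.

def low_batch_gemv_reference(x, w):
    """Return y (batch x N) where y[b][n] = sum over k of x[b][k] * w[n][k]."""
    return [
        [sum(a * b for a, b in zip(x_row, w_row)) for w_row in w]
        for x_row in x
    ]


# Hypothetical sample data: batch = 2, K = 2, N = 3.
x = [[1.0, 2.0], [3.0, 4.0]]
w = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = low_batch_gemv_reference(x, w)
print(y)
```

With only a few batch rows, each weight row is reused just a handful of times, so the kernel behaves more like a GEMV than a full GEMM. That is presumably why the choice of schedule config matters for this case.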