[Bug] Tuning Metal fails: Could not find any valid schedule for task
Expected behavior
Tuning retina-face-resnet50-fixed.onnx from this repo should find a valid schedule for every task:
https://github.com/gmeeker/RetinaFace
The model is a fixed-input-size version of the one from https://github.com/discipleofhamilton/RetinaFace
Actual behavior
[Task 30/37] Current/Best: 0.00/ 0.00 GFLOPS | Progress: (27/27) | 552.19 s
WARNING:root:Could not find any valid schedule for task Task(func_name=conv2d_nchw_winograd.cuda, args=(('TENSOR', (1, 256, 64, 64), 'float32'), ('TENSOR', (256, 256, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'float32'), kwargs={}, workload=('conv2d_nchw_winograd.cuda', ('TENSOR', (1, 256, 64, 64), 'float32'), ('TENSOR', (256, 256, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'float32')). A file containing the errors has been written to /var/folders/bd/rc6mzcg1423fzylm2vd6qsd00000gn/T/tvm_tuning_errors_h172s9it.log.
In the log:
RPCError: Error caught from RPC call: [21:48:30] [...]/src/runtime/metal/metal_module.mm:130: InternalError: Check failed: (state != nil) is false: cannot get state: for function default_function_kernelThread group memory requested is more than MAX allowed
This failure is also very frequent on Intel Macs, to the point where Metal targets end up slower than CPU.
TVM 0.17.0's Metal timer may have made the failure more prevalent, but I don't think the timer is the cause: earlier versions were probably just not tuning properly either.
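To judge how much of a tuning run is lost to this particular failure, the error log can be scanned for the Metal message quoted above. The following is a minimal sketch, not part of TVM: the function name `summarize_tuning_errors` and the category labels are hypothetical, and it assumes the log is plain text with each error reported on a single line, which I have not verified for every TVM version.

```python
from collections import Counter

# Substring copied from the Metal error quoted in this report.
THREADGROUP_MSG = "Thread group memory requested is more than MAX allowed"


def summarize_tuning_errors(log_text: str) -> Counter:
    """Count log lines per coarse failure category.

    Assumption: one error per line. Lines that mention neither the
    threadgroup message nor a generic error marker are ignored.
    """
    counts = Counter()
    for line in log_text.splitlines():
        if THREADGROUP_MSG in line:
            # The kernel asked for more threadgroup memory than the device allows.
            counts["threadgroup_memory"] += 1
        elif "RPCError" in line or "InternalError" in line:
            # Any other failure reported by the runner.
            counts["other_error"] += 1
    return counts
```

Passing the contents of the tvm_tuning_errors_*.log file named in the warning gives a rough count of threadgroup-memory failures versus other errors, which may help separate this bug from unrelated RPC problems.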
Environment
macOS 14.6.1
M1 2020 Mac Mini
Intel Mac: 2019 MacBook Pro, AMD 5500M
TVM 0.17.0
Steps to reproduce
tvmc tune --target metal --output retina-face-resnet50-autotuner_records.json retina-face-resnet50-fixed.onnx
Triage
- needs-triage
- backend:metal