[Bug] Tuning Metal fails: Could not find any valid schedule for task

Open gmeeker opened this issue 1 year ago • 0 comments

Expected behavior

Tuning retina-face-resnet50-fixed.onnx from this repo should complete and find a valid schedule for every task:

https://github.com/gmeeker/RetinaFace

This is a fixed-size-input version of the model from https://github.com/discipleofhamilton/RetinaFace

Actual behavior

[Task 30/37] Current/Best: 0.00/ 0.00 GFLOPS | Progress: (27/27) | 552.19 s
WARNING:root:Could not find any valid schedule for task Task(func_name=conv2d_nchw_winograd.cuda, args=(('TENSOR', (1, 256, 64, 64), 'float32'), ('TENSOR', (256, 256, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'float32'), kwargs={}, workload=('conv2d_nchw_winograd.cuda', ('TENSOR', (1, 256, 64, 64), 'float32'), ('TENSOR', (256, 256, 3, 3), 'float32'), (1, 1), (1, 1, 1, 1), (1, 1), 'float32')). A file containing the errors has been written to /var/folders/bd/rc6mzcg1423fzylm2vd6qsd00000gn/T/tvm_tuning_errors_h172s9it.log.

In the log:

RPCError: Error caught from RPC call: [21:48:30] [...]/src/runtime/metal/metal_module.mm:130: InternalError: Check failed: (state != nil) is false: cannot get state: for function default_function_kernelThread group memory requested is more than MAX allowed
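For context, the error says a candidate kernel asked for more threadgroup (shared) memory than the device allows. A rough sketch of that budget check follows. The 32 KiB limit is an assumption (the real value is reported by `MTLDevice.maxThreadgroupMemoryLength`), and the buffer shapes are hypothetical, not taken from the failing schedule.

```python
from math import prod

# Assumption: 32 KiB threadgroup memory limit. The actual limit is
# device-specific and is exposed as MTLDevice.maxThreadgroupMemoryLength.
ASSUMED_LIMIT_BYTES = 32 * 1024

# Element sizes for the dtypes used in this sketch.
DTYPE_BYTES = {"float32": 4, "float16": 2}


def threadgroup_bytes(buffers):
    """Sum the bytes needed by a list of (shape, dtype) shared buffers."""
    return sum(prod(shape) * DTYPE_BYTES[dtype] for shape, dtype in buffers)


def fits(buffers, limit=ASSUMED_LIMIT_BYTES):
    """Return True if the buffers fit within the assumed limit."""
    return threadgroup_bytes(buffers) <= limit


# Hypothetical shared-memory tiles for two candidate schedules.
small = [((16, 64), "float32"), ((16, 64), "float32")]
large = [((16, 16, 32), "float32"), ((16, 16, 32), "float32")]

print(threadgroup_bytes(small), fits(small))
print(threadgroup_bytes(large), fits(large))
```

A candidate like `large` would be rejected at pipeline-state creation, which matches the `state != nil` check failing in the log.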

This failure is also very frequent on Intel Macs, to the point where Metal targets end up slower than the CPU.

TVM 0.17.0's Metal timer may have made this more prevalent, but I believe that's irrelevant and earlier versions were just not tuning properly.
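To put a number on how frequent the failure is, the error file named in the warning can be tallied by message. This is a minimal sketch that assumes each failure appears on one line containing `InternalError: <message>`, as in the log line quoted above; the real file layout may differ, and `tally_errors` is a hypothetical helper, not part of TVM.

```python
import re
from collections import Counter

# Assumption: each failure is a single line containing "InternalError: <message>".
_PATTERN = re.compile(r"InternalError:\s*(.*)")


def tally_errors(lines):
    """Count occurrences of each InternalError message in an iterable of lines."""
    counts = Counter()
    for line in lines:
        match = _PATTERN.search(line)
        if match:
            counts[match.group(1).strip()] += 1
    return counts


# Sample input: the log line from this report plus an unrelated line.
sample = [
    "RPCError: Error caught from RPC call: [21:48:30] [...]/src/runtime/metal/"
    "metal_module.mm:130: InternalError: Check failed: (state != nil) is false: "
    "cannot get state: for function default_function_kernelThread group memory "
    "requested is more than MAX allowed",
    "some unrelated line",
]

for message, count in tally_errors(sample).most_common():
    print(count, message)
```

On a real run, pass the open log file instead of `sample`, for example `tally_errors(open(path))` with `path` set to the file reported in the warning.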

Environment

  • macOS 14.6.1
  • M1 2020 Mac Mini
  • Intel Mac: 2019 MacBook Pro, AMD 5500M
  • TVM 0.17.0

Steps to reproduce

tvmc tune --target metal --output retina-face-resnet50-autotuner_records.json retina-face-resnet50-fixed.onnx

Triage

  • needs-triage
  • backend:metal

gmeeker, Aug 17 '24 05:08