tiramisu
Regression issue? Many Halide tests are slower with Tiramisu on Mac, or crash
The following tests are significantly slower with Tiramisu vs Halide:
- blurxy
- convolution, convolution_layer
- gaussian
- vgg
- warp-affine
Others fail:
- recfilter: segfault
- heat2d, heat3d: crash (out of bounds access)
- laplacian: doesn't build
- optical_flow: crash (out of bounds access)
- resize: crash (name not in scope)
This is on a 2018 MacBook Pro running macOS Mojave, with the latest Xcode and an up-to-date Homebrew.
Thanks for reporting this @mikeseven !
The slower ones were added recently and/or are not yet optimized, so I would expect them to be slower.
Those that fail were also added recently and are still under development. I will check them anyway, since they are not failing on my machine.
For the slow benchmarks, there is another reason.
AVX2 is disabled by default in Tiramisu, whereas Halide uses AVX2, so all the Tiramisu benchmarks are expected to be at least 10% slower than Halide.
To enable AVX2 in Tiramisu, uncomment the line
https://github.com/Tiramisu-Compiler/tiramisu/blob/45ace2178c2673049cf39367994d18b2ddfe535b/src/tiramisu_codegen_halide.cpp#L3772
and recompile Tiramisu.
This will add the AVX2 feature to our backend code generator (which uses Halide).
Interesting. AVX2 shouldn't be disabled, or maybe it should be a CMake option. Now, blurxy and vgg are slow:
Kernel : Tiramisu ; Halide
blurxy : 9.934189 ; 3.325723
Halide vgg duration: 0.00508851
Tiramisu vgg duration: 0.0171091
The others failing before still fail.
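To compare the blurxy and vgg timings reported above with the "at least 10% slower" gap attributed to AVX2, the slowdown factors can be computed directly from the posted numbers. This is a minimal Python sketch: the `timings` dictionary and the `slowdown` helper are illustrative names, not part of Tiramisu or Halide, and the values are copied as printed by the benchmarks, with no units assumed.

```python
# Timings copied from the benchmark output quoted in this thread.
timings = {
    "blurxy": {"tiramisu": 9.934189, "halide": 3.325723},
    "vgg": {"tiramisu": 0.0171091, "halide": 0.00508851},
}


def slowdown(tiramisu_time, halide_time):
    """Return how many times slower the Tiramisu run is than the Halide run."""
    return tiramisu_time / halide_time


# Slowdown factor per kernel (hypothetical helper structure, for illustration).
ratios = {
    name: slowdown(t["tiramisu"], t["halide"]) for name, t in timings.items()
}

for name, ratio in ratios.items():
    print(f"{name}: Tiramisu is {ratio:.2f}x slower than Halide")
```

Both factors come out at roughly 3x, which is far larger than a 10% gap.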