Regression issue? Many Halide tests are slower with Tiramisu on Mac, or crash

Open mikeseven opened this issue 6 years ago • 3 comments

The following tests are significantly slower with Tiramisu vs Halide:

  • blurxy
  • convolution, convolution_layer
  • gaussian
  • vgg
  • warp-affine

Others fail:

  • recfilter — segfault
  • heat2d, heat3d — crash: out of bounds access
  • laplacian — doesn’t build
  • optical_flow — crash: out of bounds access
  • resize — crash: name not in scope

This is on a 2018 MacBook Pro running macOS Mojave, with the latest Xcode and an up-to-date Homebrew.

mikeseven, Nov 18 '18 07:11

Thanks for reporting this @mikeseven !

The slower ones were added recently and/or are not yet optimized, so I would expect them to be slower.

The failing ones were also added recently and are still under development. I will check them, though, as they are not failing on my machine.

rbaghdadi, Nov 18 '18 09:11

For the slow benchmarks, there is another reason.

AVX2 is disabled by default in Tiramisu, whereas Halide uses AVX2, so all the Tiramisu benchmarks are expected to be at least 10% slower than Halide.

To enable AVX2 in Tiramisu, uncomment the line

https://github.com/Tiramisu-Compiler/tiramisu/blob/45ace2178c2673049cf39367994d18b2ddfe535b/src/tiramisu_codegen_halide.cpp#L3772

and recompile Tiramisu.

This will add the AVX2 feature to our backend code generator (which uses Halide).

rbaghdadi, Nov 18 '18 10:11

Interesting. AVX2 shouldn't be disabled, or maybe it could be a CMake option. Now, blurxy and vgg are slow:

Kernel : Tiramisu ; Halide ;
blurxy : 9.934189 ; 3.325723 ;

Halide vgg duration: 0.00508851;
Tiramisu vgg duration: 0.0171091;
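As a rough sanity check on the numbers above, the slowdown factors can be computed directly. This is only a sketch: the `timings` dictionary and the `slowdown` helper are illustrative names, not part of Tiramisu or Halide, and the units are assumed to be whatever each benchmark prints (the ratio does not depend on them).

```python
# Timings copied from the benchmark output quoted above.
# Units are assumed to match within each benchmark; only the ratio matters.
timings = {
    "blurxy": {"tiramisu": 9.934189, "halide": 3.325723},
    "vgg": {"tiramisu": 0.0171091, "halide": 0.00508851},
}


def slowdown(tiramisu_time: float, halide_time: float) -> float:
    """Return how many times slower Tiramisu is than Halide (hypothetical helper)."""
    return tiramisu_time / halide_time


# Compute the slowdown factor for every benchmark.
ratios = {
    name: slowdown(t["tiramisu"], t["halide"]) for name, t in timings.items()
}

for name, ratio in ratios.items():
    print(f"{name}: Tiramisu is {ratio:.2f}x slower than Halide")
```

Both factors come out at roughly 3x, which is well beyond the 10% gap attributed to AVX2 earlier in the thread.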

The tests that failed before still fail.

mikeseven, Nov 28 '18 07:11