onnx-mlir

[Work in progress] Enabling compiler generation of Stick under a flag

Open AlexandreEichenberger opened this issue 11 months ago • 0 comments

The flag -enable-compiler-stick-unstick enables (at this time) compiler-generated stickify only. When combined with the -parallel flag, the generated code uses several OMP threads.

At this time, the f32 to f16 conversion is efficient, with an inner loop of 8 SIMD f32 to f16 conversions (all load, store, and conversion ops are fully simdized). However, writing into the mapped layout (3DS only at this time, easy to extend) is scalar only (i.e. one load and one store per f16 value).
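
To illustrate the two-step loop structure described above, here is a minimal Python sketch, not the code the compiler emits. It assumes IEEE half precision as a stand-in for the target f16 format, and the stick size (`STICK`), the block size (`BLOCK`), and the `mapped_offset` function are hypothetical; the actual 3DS mapping may differ. The point is only the shape of the loops: a block of values is converted together (the part that is SIMD in the generated code), then each f16 value is written to its mapped location one at a time (the scalar part).

```python
import struct

STICK = 64  # hypothetical: number of f16 values per stick (assumption)
BLOCK = 8   # stand-in for a group of values converted together


def f32_to_f16_bits(x):
    """Round a float to IEEE half precision and return its 16-bit pattern."""
    return struct.unpack("<H", struct.pack("<e", x))[0]


def mapped_offset(i0, i1, i2, d1, d2):
    """Hypothetical tiled mapping: the last dimension is split into sticks."""
    sticks_per_row = (d2 + STICK - 1) // STICK
    return ((i0 * sticks_per_row + i2 // STICK) * d1 + i1) * STICK + i2 % STICK


def stickify(data, d0, d1, d2):
    """Convert a flat row-major (d0, d1, d2) list of floats to the mapped layout."""
    sticks_per_row = (d2 + STICK - 1) // STICK
    out = [0] * (d0 * sticks_per_row * d1 * STICK)  # zero-padded sticks
    for i0 in range(d0):      # outer loops: candidates for splitting across threads
        for i1 in range(d1):
            row = (i0 * d1 + i1) * d2
            for start in range(0, d2, BLOCK):
                stop = min(start + BLOCK, d2)
                # Step 1: convert one block (vectorized in the generated code).
                block = [f32_to_f16_bits(data[row + i2]) for i2 in range(start, stop)]
                # Step 2: scalar writes, one store per f16 value.
                for k, bits in enumerate(block):
                    out[mapped_offset(i0, i1, start + k, d1, d2)] = bits
    return out
```

In this sketch, step 2 is the part that stays scalar: each converted value needs its own offset computation and store.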

AlexandreEichenberger, Mar 13 '24 03:03