Alexander Efimov
## Goal - Investigate how the C++ library affects the flash footprint of the luci-micro interpreter - Verify that luci-interpreter can run without the C++ runtime - Modify luci-micro and/or its CMakeLists.txt accordingly ##...
## What? I propose to delete https://github.com/Samsung/ONE/tree/master/compiler/mir and related components from the source code. ## Why? This is an old part of the project, and I am not sure if someone...
## What Need to implement optimized kernels for GRU and LSTM operations in the interpreter for MCU ## Why For now these operations are not optimized, run slowly, and consume a...
This is a continuation of #5080, but with more specific goals. ### Goal To measure luci-interpreter performance and memory consumption for the models delivered with tflite-micro: https://github.com/tensorflow/tflite-micro/tree/main/tensorflow/lite/micro/benchmarks Currently it contains two models:...
Support the "trans" epilogue for asymmetrical tensors. Example of a crash: M==16, N==64, i.e. z_tri of shape [16, 64]
This PR adds AMD-specific passes to triton-opt.
This PR adds support for the following pattern in the optimize epilogue pass: -> ConvertLayout -> ElementwiseOp -> StoreOp
This PR fixes: - parsing and printing of the kWidth attribute for MFMA and WMMA layouts - a comment in the 06-fused-attention.py tutorial
This PR introduces the OptimizeLDSUsage pass, which generalizes the LDS optimization that was previously part of the DecomposeUnsupportedLayouts pass.
This PR enables support for 3D dot and fixes tests in test_core.py