Xulin Zhou
After testing against the latest repository, the two errors mentioned above still occur and produce the following messages: 1. Changing `%mask64 = arith.constant dense : vector` to...
Retesting confirms that: 1. Across different lmul settings, scenarios with a heavier workload between load and store (e.g. Conv2d) show smaller performance gaps than simple cases such as AX plus Y....
The retest results look promising. The procedures now run faster than before, and the gap in total cycles between different lmul settings has narrowed (except for the case of...
Further observation shows that the `sdiv` operation generated in this demo has type i64, because array indices in MLIR are represented by `index` and will automatically be...
Further observation shows that additional support for three functions (`malloc`, `memcpy`, and `memrefCopy`) may be required to execute this model. Here, `memrefCopy` is originally implemented in the MLIR shared object...