[buddy] The last step toward deep learning model e2e inference in the vector repo
Recently, buddy-benchmark implemented the e2e compilation of EfficientNet-Quant (see https://github.com/buddy-compiler/buddy-benchmark/tree/main/benchmarks/DeepLearning/Models/EfficientNet-Quant), a typical quantized deep learning model with no floating-point operations involved in the lowering process. Executing it on the current version of the vector repo seems like a good fit, except that the model contains many `memref.alloc()` instructions (e.g. `%alloc = memref.alloc() {alignment = 64 : i64} : memref<1x224x224x3xi8>`), which are not yet supported.
In the previous handwritten buddy cases, we used `memref.get_global` instead of `memref.alloc()` to initialize intermediate array variables, but replacing every `alloc` statement in this model would be inefficient and cumbersome. Would it be possible for the vector repo to support `alloc` statements in the near future, or are there any workarounds?
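For reference, here is a minimal sketch of that rewrite pattern; the global's name and the surrounding context are illustrative, not taken from the generated model:

```mlir
// As emitted in the EfficientNet-Quant lowering: the buffer comes from the
// heap, which is what the vector repo does not support yet.
%alloc = memref.alloc() {alignment = 64 : i64} : memref<1x224x224x3xi8>

// Handwritten-case pattern: declare a zero-initialized module-level global
// (@input_buf is an illustrative name) ...
memref.global "private" @input_buf : memref<1x224x224x3xi8> = dense<0>

// ... and take its address instead of allocating.
%buf = memref.get_global @input_buf : memref<1x224x224x3xi8>
```

This works for the handwritten cases because every buffer there has a static shape and a known lifetime, but it is exactly this per-`alloc` rewrite that becomes cumbersome at the scale of this model.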
Further observation shows that additional support for three functions (`malloc`, `memcpy`, and `memrefCopy`) may be required to execute this model. Here, `memrefCopy` is originally implemented in the MLIR shared object file `libmlir_runner_utils.so`. @ksco @sequencer
```
$ readelf -s efficientnet.mlir.o | grep UND
     0: 00000000     0 NOTYPE  LOCAL  DEFAULT  UND
  6140: 00000000     0 NOTYPE  GLOBAL DEFAULT  UND malloc
  6141: 00000000     0 NOTYPE  GLOBAL DEFAULT  UND memcpy
  6142: 00000000     0 NOTYPE  GLOBAL DEFAULT  UND memrefCopy
```
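If native `alloc` support takes a while, one conceivable workaround is to satisfy the two libc symbols ourselves. The sketch below assumes a freestanding C toolchain for the target and a heap region exported by the linker script; `__heap_start` and `__heap_end` are hypothetical names, and the bump allocator never frees:

```c
/* Freestanding stand-ins for the symbols efficientnet.mlir.o leaves
 * undefined. Sketch only; see the assumptions in the text above. */
#include <stddef.h>
#include <stdint.h>

extern uint8_t __heap_start[]; /* assumed to come from the linker script */
extern uint8_t __heap_end[];

static uint8_t *heap_top;

void *malloc(size_t size) {
  if (heap_top == NULL)
    heap_top = __heap_start;
  /* readelf shows plain malloc rather than aligned_alloc, i.e. the MLIR
   * lowering realizes the 64-byte alignment itself by over-allocating, so
   * ordinary 16-byte alignment from the allocator should be enough. */
  uintptr_t base = ((uintptr_t)heap_top + 15) & ~(uintptr_t)15;
  if (base + size > (uintptr_t)__heap_end)
    return NULL; /* heap exhausted; this bump allocator never frees */
  heap_top = (uint8_t *)(base + size);
  return (void *)base;
}

void *memcpy(void *dst, const void *src, size_t n) {
  uint8_t *d = dst;
  const uint8_t *s = src;
  while (n--)
    *d++ = *s++;
  return dst;
}
```

Such stubs would need to be compiled with `-ffreestanding -fno-builtin` so the copy loop is not turned back into a `memcpy` call. `memrefCopy` could presumably be satisfied by compiling the MLIR runner-utils source that defines it directly into the image instead of linking the shared object, though I have not verified that it builds cleanly for this target.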