wasm-micro-runtime Using indirect call in pre-checker function to avoid the relocation occurrence in XIP mode.

The stack profiler aot_func#xxx will call wrapped function of aot_func_internal#xxx using symbol reference, in some platform, like xtensa, it’s translated into a native long call, which need resolve the indirect address by relocation, this will break the XIP feature which requires the eliminating of relocation. The solution is to change the symbol reference into an indirect call through the lookup table, the code will be like this:

call_wrapped_func:                                ; preds = %stack_bound_check_block
  %func_addr1 = getelementptr inbounds ptr, ptr %func_ptrs_ptr, i32 75
  %func_tmp2 = load ptr, ptr %func_addr1, align 4
  tail call void %func_tmp2(ptr %exec_env)
  ret void

Feb 06 '24 06:02 dongsheng28849455

an alternative for xtensa is to use pc-relative calls. eg. https://github.com/yamt/llvm-project/commit/3f04244a4fd94952a3f208b7d1ee3afef2da1e3a good: more efficient. bad: a hack in llvm. i'm not sure how difficult a proper support would be.

Feb 06 '24 08:02 yamt

an alternative for xtensa is to use pc-relative calls. eg. yamt/llvm-project@3f04244 good: more efficient. bad: a hack in llvm. i'm not sure how difficult a proper support would be.

Yes, patching llvm seems a good way instead of changing the AOT file format, @dongsheng28849455 could we apply the patch to llvm before we build llvm for aot compiler? We may add the patch file under build-script folder (like file lldb_wasm.patch), and change the build_llvm.py to apply the patch?

Feb 06 '24 11:02 wenyongh

an alternative for xtensa is to use pc-relative calls. eg. yamt/llvm-project@3f04244 good: more efficient. bad: a hack in llvm. i'm not sure how difficult a proper support would be.

Yes, patching llvm seems a good way instead of changing the AOT file format, @dongsheng28849455 could we apply the patch to llvm before we build llvm for aot compiler? We may add the patch file under build-script folder (like file lldb_wasm.patch), and change the build_llvm.py to apply the patch?

we have tested with coremark, seems not much impact for performance. This might be a more general solution, I'm not sure if any other target has similar issue? or we have to maintain the patch in future.

Feb 06 '24 11:02 dongsheng28849455

Yes, patching llvm seems a good way instead of changing the AOT file format, @dongsheng28849455 could we apply the patch to llvm before we build llvm for aot compiler? We may add the patch file under build-script folder (like file lldb_wasm.patch), and change the build_llvm.py to apply the patch?

we have tested with coremark, seems not much impact for performance. This might be a more general solution, I'm not sure if any other target has similar issue? or we have to maintain the patch in future.

There isn't similar issue reported in other targets yet, seems that we only need to add it for xtensa. The patch is simple and it should be easy to maintain, normally we only need to update the patch when we upgrade the LLVM version. Otherwise, we have to maintain two different AOT file formats (and not sure whether we should change the aot loader), it is also a little tiresome.

Feb 06 '24 11:02 wenyongh

Yes, patching llvm seems a good way instead of changing the AOT file format, @dongsheng28849455 could we apply the patch to llvm before we build llvm for aot compiler? We may add the patch file under build-script folder (like file lldb_wasm.patch), and change the build_llvm.py to apply the patch?

we have tested with coremark, seems not much impact for performance. This might be a more general solution, I'm not sure if any other target has similar issue? or we have to maintain the patch in future.

There isn't similar issue reported in other targets yet, seems that we only need to add it for xtensa. The patch is simple and it should be easy to maintain, normally we only need to update the patch when we upgrade the LLVM version. Otherwise, we have to maintain two different AOT file formats (and not sure whether we should change the aot loader), it is also a little tiresome.

why we should maintain two AOT format? I think it is the same with normal XIP aot, and do not need change aot loader, just enlarge the the lookup table and follow the aot loading process as before.

Feb 06 '24 11:02 dongsheng28849455

we have tested with coremark, seems not much impact for performance.

can you share the numbers?

Feb 06 '24 11:02 yamt

we have tested with coremark, seems not much impact for performance.

can you share the numbers?

test on esp32 device. about 81.6, with --enable-multi-thread option for wamrc 1.3.1 with this patch. as a comparison, precedingly on wamrc1.1.2, the score is 88, but please note, which is without --enable-multi-thread, the option will more or less reduce the execution speed.

Feb 06 '24 12:02 dongsheng28849455

Yes, patching llvm seems a good way instead of changing the AOT file format, @dongsheng28849455 could we apply the patch to llvm before we build llvm for aot compiler? We may add the patch file under build-script folder (like file lldb_wasm.patch), and change the build_llvm.py to apply the patch?

we have tested with coremark, seems not much impact for performance. This might be a more general solution, I'm not sure if any other target has similar issue? or we have to maintain the patch in future.

There isn't similar issue reported in other targets yet, seems that we only need to add it for xtensa. The patch is simple and it should be easy to maintain, normally we only need to update the patch when we upgrade the LLVM version. Otherwise, we have to maintain two different AOT file formats (and not sure whether we should change the aot loader), it is also a little tiresome.

why we should maintain two AOT format? I think it is the same with normal XIP aot, and do not need change aot loader, just enlarge the the lookup table and follow the aot loading process as before.

It changes the XIP's file format, so there are more differences between the formats of XIP and non-XIP than before. And from the comment in the code, it is only for XIP mode for xtensa, but from the changed code, it is for XIP for all targets, I am not sure whether we should leave the format unchanged for non-xtensa targets since the PR increases the binary size. My opinion is we had better keep the AOT file format as simple as we can to reduce the maintain effort, but it deserves more discussion.

@no1wudi could you help review the PR? Not sure whether the change of AOT file format is good to you since you also use XIP a lot?

Feb 06 '24 13:02 wenyongh

I prefer to enhance LLVM, change the AOT file format will cause extra overhead for ARM/RISC-V.

Feb 07 '24 03:02 no1wudi

LGTM

Feb 27 '24 03:02 dongsheng28849455

wasm-micro-runtime wasm-micro-runtime copied to clipboard

Using indirect call in pre-checker function to avoid the relocation occurrence in XIP mode.

wasm-micro-runtime
wasm-micro-runtime copied to clipboard