wasm-micro-runtime wasm_runtime_start_debug_instance can break the module

wasm_runtime_start_debug_instance uses wasm_runtime_module_malloc to reserve memory for _M. it isn't safe because the in-module heap might have not been initialized properly yet.

Apr 14 '25 01:04 yamt

a simple workaround for those who don't really need expression evaluation:

diff --git a/core/iwasm/libraries/debug-engine/debug_engine.c b/core/iwasm/libraries/debug-engine/debug_engine.c
index 0ffc78ad..1b81a41c 100644
--- a/core/iwasm/libraries/debug-engine/debug_engine.c
+++ b/core/iwasm/libraries/debug-engine/debug_engine.c
@@ -408,6 +408,8 @@ wasm_debug_instance_create(WASMCluster *cluster, int32 port)
      * the allocation failed, the debugger will not be able to evaluate
      * expressions */
     instance->exec_mem_info.size = DEBUG_EXECUTION_MEMORY_SIZE;
+    instance->exec_mem_info.start_offset = 0;
+#if 0
     instance->exec_mem_info.start_offset = wasm_runtime_module_malloc(
         module_inst, (uint64)instance->exec_mem_info.size, NULL);
     if (instance->exec_mem_info.start_offset == 0) {
@@ -417,6 +419,7 @@ wasm_debug_instance_create(WASMCluster *cluster, int32 port)
             "Will not be able to evaluate expressions during "
             "debugging");
     }
+#endif
     instance->exec_mem_info.current_pos = instance->exec_mem_info.start_offset;
 
     if (!wasm_debug_control_thread_create(instance, port)) {

Apr 14 '25 02:04 yamt

it isn't safe because the in-module heap might have not been initialized properly yet.

like calling wasm_runtime_start_debug_instance_with_port() before wasm_runtime_instantiate()?

If so, the first argument, exec_env, should be invalid. maybe we can count on that?

Apr 16 '25 02:04 lum1n0us

it isn't safe because the in-module heap might have not been initialized properly yet.

like calling wasm_runtime_start_debug_instance_with_port() before wasm_runtime_instantiate()?

no.

eg. call wasm_runtime_start_debug_instance_with_port() before wasm_application_execute_main().

Apr 16 '25 02:04 yamt

not entirely clear on this. Could you provide us with more details? As we understand it, after wasm_runtime_instantiate(), the linear memory, including the heap area, has been initialized, and if it's a wasm32-wasip1, all ctor() has also been executed(by execute_post_instantiate_functions()).

This would be unless there are exported wasm functions that will call memory.grow()?

Apr 17 '25 07:04 lum1n0us

not entirely clear on this. Could you provide us with more details? As we understand it, after wasm_runtime_instantiate(), the linear memory, including the heap area, has been initialized, and if it's a wasm32-wasip1, all ctor() has also been executed(by execute_post_instantiate_functions()).

no. for wasi commands, the constructors are called by _start.

This would be unless there are exported wasm functions that will call memory.grow()?

i'm not sure what you mean here.

Apr 17 '25 07:04 yamt

Just my wide guess. Please provide us with more details?

Apr 17 '25 07:04 lum1n0us

malloc/free are not expected to be usable before _start.

well, of course, it's possible to implement malloc as it can be used before _start. however, we should not assume such implementations in general.

Apr 17 '25 07:04 yamt

then, maybe we should patch

https://github.com/bytecodealliance/wasm-micro-runtime/blob/main/core/iwasm/interpreter/wasm_runtime.c#L1597C1-L1602C7


    ) {
        call_ctors_func =
            lookup_post_instantiate_func(module_inst, "__wasm_call_ctors"); 
    }

call _start instead of __wasm_call_ctors

Apr 17 '25 07:04 lum1n0us

then, maybe we should patch

https://github.com/bytecodealliance/wasm-micro-runtime/blob/main/core/iwasm/interpreter/wasm_runtime.c#L1597C1-L1602C7
) {
    call_ctors_func =
        lookup_post_instantiate_func(module_inst, "__wasm_call_ctors"); 
}
call _start instead of __wasm_call_ctors

__wasm_call_ctors export is a legacy thing, which is not usually used these days. _start is called by wasm_application_execute_main.

Apr 17 '25 08:04 yamt

if in-module heap preparation is about below in _start() ,

// in void _start(void) 
#ifdef _REENTRANT
    __wasi_init_tp();
#endif

    // The linker synthesizes this to call constructors.
    __wasm_call_ctors();

    // goes to __main_void();

and seems they are only called from _start(). Even another export function will not execute them.

So if users want to execute an exported function, like via iwasm -f foo, the __wasm_call_ctors() won't be executed, because _start() is passed (correct me if I am wrong). Is that correct when we believe __wasm_call_ctors() is important?

Apr 17 '25 09:04 lum1n0us

if in-module heap preparation is about below in _start() ,

// in void _start(void) #ifdef _REENTRANT __wasi_init_tp(); #endif
// The linker synthesizes this to call constructors.
__wasm_call_ctors();

// goes to __main_void();
and seems they are only called from _start(). Even another export function will not execute them.

So if users want to execute an exported function, like via iwasm -f foo, the __wasm_call_ctors() won't be executed, because _start() is passed (correct me if I am wrong).

my understanding is that extra exported functions are not expected to be called before or after _start. they can be called during the execution of _start. (eg. host functions called by _start, like wasi, can call them back)

Is that correct when we believe __wasm_call_ctors() is important?

usually, if a module exports _start, it doesn't export __wasm_call_ctors.

i guess there were historical conventions using exported __wasm_call_ctors. but i'm not familiar with them. i suspect our logic in execute_post_instantiate_functions was for such legacy conventions. wasip1 commands don't export __wasm_call_ctors.

Apr 18 '25 03:04 yamt

they can be called during the execution of _start. (eg. host functions called by _start, like wasi, can call them back)

May I ask how?

it isn't safe because the in-module heap might have not been initialized properly yet.

It seems there isn't a good method to permit the host to call wasm_malloc() after the in-module heap has been properly initialized by _start(). (This issue might be lessened if dealing with a reactor .wasm module)

Apr 22 '25 03:04 lum1n0us

they can be called during the execution of _start. (eg. host functions called by _start, like wasi, can call them back)

May I ask how?

wasip1 itself doesn't have such an api.

wasi-threads uses wasi_thread_start export for thread entry point.

it isn't safe because the in-module heap might have not been initialized properly yet.

It seems there isn't a good method to permit the host to call wasm_malloc() after the in-module heap has been properly initialized by _start(). (This issue might be lessened if dealing with a reactor .wasm module)

a reactor doesn't have the same problem as we call its entry point _initialize during instantiation.

Apr 22 '25 03:04 yamt

anyway, currently this isn't high-proirity (for me) because the default malloc implementation of wasi-libc seems to be safe to be used before _start. but it might hurt you eg. when you update libc.

Apr 22 '25 03:04 yamt

Certainly. I will bear this in mind and continue to monitor any upcoming changes to wasi-libc.

Apr 22 '25 04:04 lum1n0us