`-sMAIN_MODULE=2` with `-pthread` misses exports `__emscripten_thread_crashed` and `__embind_initialize_bindings`

Open 8051Enthusiast opened this issue 1 year ago • 19 comments

I'm using (a custom build of) Qt 6.7 with emscripten 3.1.58, with both -pthread and -sMAIN_MODULE=2. When trying to run the build in a web browser, I'm getting an exception because __emscripten_thread_crashed is undefined, which is in turn triggered by an earlier exception because __embind_initialize_bindings is undefined. Setting -sEXPORTED_FUNCTIONS=_main,__emscripten_thread_crashed,__embind_initialize_bindings works around this (but this wasn't needed on 3.1.55, I think). I'm guessing DCE deletes these because they're only called from JavaScript?

emcc --version:

emcc (Emscripten gcc/clang-like replacement + linker emulating GNU ld) 3.1.58 (a41843e0860e52c948c1fce20307933c6631c800)
Copyright (C) 2014 the Emscripten authors (see AUTHORS.txt)
This is free and open source software under the MIT license.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Linker command:

/opt/emsdk/upstream/emscripten/em++ -O3 -DNDEBUG -sMAIN_MODULE=2 -sSTACK_SIZE=5MB -pthread -s PTHREAD_POOL_SIZE=4 -s INITIAL_MEMORY=50MB -s MAXIMUM_MEMORY=4GB -s EXPORTED_RUNTIME_METHODS=UTF16ToString,stringToUTF16,JSEvents,specialHTMLTargets,FS,callMain,ENV -s EXPORT_NAME=yphbt_entry -s MAX_WEBGL_VERSION=2 -s FETCH=1 -s WASM_BIGINT=1 -s STACK_SIZE=5MB -s MODULARIZE=1 -s DISABLE_EXCEPTION_CATCHING=1 -pthread -s ALLOW_MEMORY_GROWTH -sASYNCIFY_IMPORTS=qt_asyncify_suspend_js,qt_asyncify_resume_js -s ERROR_ON_UNDEFINED_SYMBOLS=1 @CMakeFiles/yphbt.dir/objects1 -o yphbt.js @CMakeFiles/yphbt.dir/linkLibs.rsp
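
The workaround described above could be applied from a build script along these lines. This is only a minimal sketch: the helper name `add_exported_functions` is hypothetical, it only understands the joined, comma-separated `-sEXPORTED_FUNCTIONS=a,b` spelling, and it keeps `_main` in the list simply because the workaround quoted above does.

```python
# Hypothetical helper: merge the two workaround symbols into the link flags.
# Assumption: flags use the joined "-sEXPORTED_FUNCTIONS=a,b,c" form only.
WORKAROUND_EXPORTS = [
    "__emscripten_thread_crashed",
    "__embind_initialize_bindings",
]


def add_exported_functions(link_flags, extra=WORKAROUND_EXPORTS):
    """Return a copy of link_flags whose -sEXPORTED_FUNCTIONS includes `extra`."""
    prefix = "-sEXPORTED_FUNCTIONS="
    names = []
    out = []
    for flag in link_flags:
        if flag.startswith(prefix):
            # Collect names from any existing EXPORTED_FUNCTIONS flag.
            names.extend(n for n in flag[len(prefix):].split(",") if n)
        else:
            out.append(flag)
    if not names:
        # Mirror the workaround from the report, which lists _main explicitly.
        names = ["_main"]
    for name in extra:
        if name not in names:
            names.append(name)
    out.append(prefix + ",".join(names))
    return out
```

For example, `add_exported_functions(["-O3", "-pthread"])` appends a single `-sEXPORTED_FUNCTIONS=_main,__emscripten_thread_crashed,__embind_initialize_bindings` flag, matching the setting quoted in the report.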

8051Enthusiast avatar Apr 27 '24 10:04 8051Enthusiast

Looks like a real bug which I imagine was introduced in #21701. I can take a look at this tomorrow.

sbc100 avatar Apr 28 '24 17:04 sbc100

Actually I can't see how this is possible since __emscripten_thread_crashed and __embind_initialize_bindings should be included via https://github.com/emscripten-core/emscripten/blob/0e4c5994eb5b8defd38367a416d0703fd506ad81/tools/link.py#L488 and https://github.com/emscripten-core/emscripten/blob/0e4c5994eb5b8defd38367a416d0703fd506ad81/tools/link.py#L496.

I'm having trouble reproducing as when I build with the above options I see this in the generated JS file:

    var __emscripten_thread_crashed = () =>                                      
        (__emscripten_thread_crashed = wasmExports["_emscripten_thread_crashed"])();

And the wasm file contains that _emscripten_thread_crashed export.

sbc100 avatar Apr 28 '24 17:04 sbc100

Hmm, for me the js file contained the var __emscripten_thread_crashed = ... line, but the exports don't contain it:

Wasm exports
Export[34]:
 - func[21607] <__wasm_call_ctors> -> "__wasm_call_ctors"
 - func[15916] <__wasm_apply_data_relocs> -> "__wasm_apply_data_relocs"
 - func[213]  -> "free"
 - func[13252] <__main_argc_argv> -> "__main_argc_argv"
 - func[250]  -> "malloc"
 - func[15546]  -> "pthread_self"
 - func[15582]  -> "htonl"
 - func[8231]  -> "htons"
 - func[8231]  -> "ntohs"
 - func[15614] <__gettypename> -> "__getTypeName"
 - func[15613] <_emscripten_tls_init> -> "_emscripten_tls_init"
 - func[5573]  -> "emscripten_builtin_memalign"
 - func[15612]  -> "setThrew"
 - func[15611] <_emscripten_tempret_set> -> "_emscripten_tempret_set"
 - func[15610]  -> "emscripten_stack_set_limits"
 - func[15609] <_emscripten_stack_restore> -> "_emscripten_stack_restore"
 - func[15608] <_emscripten_stack_alloc> -> "_emscripten_stack_alloc"
 - func[15607]  -> "emscripten_stack_get_current"
 - func[8262] <__dl_seterr> -> "__dl_seterr"
 - func[15600] <_emscripten_dlsync_self_async> -> "_emscripten_dlsync_self_async"
 - func[5644] <_emscripten_dlsync_self> -> "_emscripten_dlsync_self"
 - func[15598] <_emscripten_proxy_dlsync_async> -> "_emscripten_proxy_dlsync_async"
 - func[15594] <_emscripten_proxy_dlsync> -> "_emscripten_proxy_dlsync"
 - func[15587] <_emscripten_thread_init> -> "_emscripten_thread_init"
 - func[15564] <_emscripten_run_on_main_thread_js> -> "_emscripten_run_on_main_thread_js"
 - func[15549] <_emscripten_thread_free_data> -> "_emscripten_thread_free_data"
 - func[8209] <_emscripten_thread_exit> -> "_emscripten_thread_exit"
 - func[8194] <_emscripten_timeout> -> "_emscripten_timeout"
 - func[15534] <_emscripten_check_mailbox> -> "_emscripten_check_mailbox"
 - func[8171] <__cxa_decrement_exception_refcount> -> "__cxa_increment_exception_refcount"
 - func[8171] <__cxa_decrement_exception_refcount> -> "__cxa_decrement_exception_refcount"
 - func[15507] <__cxa_can_catch> -> "__cxa_can_catch"
 - func[15506] <__cxa_is_pointer_type> -> "__cxa_is_pointer_type"
 - func[15285] <_emscripten_run_callback_on_thread> -> "_emscripten_run_callback_on_thread"
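
A check like the one done by eye on this listing could be scripted roughly as follows. It is a sketch under assumptions: the input is text in the same format as the listing above (which tool produced it is not stated here), the function names are made up for illustration, and the name mapping relies on the generated JS quoted earlier, where `__emscripten_thread_crashed` is looked up as `wasmExports["_emscripten_thread_crashed"]` (one leading underscore fewer).

```python
import re

# Matches the quoted export name at the end of a line such as:
#   - func[213]  -> "free"
EXPORT_RE = re.compile(r'->\s*"([^"]+)"\s*$')


def parse_export_names(listing):
    """Collect the export names from a textual export listing."""
    names = set()
    for line in listing.splitlines():
        match = EXPORT_RE.search(line)
        if match:
            names.add(match.group(1))
    return names


def missing_exports(listing, js_names):
    """Return the JS-side names whose wasm export is absent from the listing."""
    exported = parse_export_names(listing)
    missing = []
    for js_name in js_names:
        # Assumption: the JS name is the wasm export name plus one leading "_".
        wasm_name = js_name[1:] if js_name.startswith("_") else js_name
        if wasm_name not in exported:
            missing.append(js_name)
    return missing
```

Run against the listing above with `["__emscripten_thread_crashed", "__embind_initialize_bindings"]`, this would report both names as missing, since neither `_emscripten_thread_crashed` nor `_embind_initialize_bindings` appears among the 34 exports.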

Those lines you highlighted in link.py also most likely executed, since I'm getting the warnings right above them. Should I try to make a more minimal reproducer? For reference, this is the Dockerfile that causes the exception, but it takes around 1-2 hours to build without cache so you probably wouldn't want to build that one.

Edit: Also I forgot to mention, this error does not occur in debug mode

8051Enthusiast avatar Apr 28 '24 19:04 8051Enthusiast

Anything you can do to reduce the size of the reproducer would be great. For example, can you try other opt levels (e.g. -O2 or -O1)?

You could also try to bisect to figure out if #21701 was really the change that introduced this issue. See https://emscripten.org/docs/contributing/developers_guide.html?highlight=developer#bisecting

sbc100 avatar Apr 29 '24 01:04 sbc100

Ok, so the -sMAIN_MODULE=2 flag was a red herring, the same problem occurs when disabling it. With -O2 it works properly. The following program (still depending on Qt) has the same error:

#include <QThreadPool>
#include <QFuture>
#include <QtConcurrent>

int main() {
	QThreadPool pool;
	QFuture<void> future = QtConcurrent::run(&pool, [](){});
}

As does this one:

#include <QNetworkAccessManager>
#include <QNetworkReply>
int main() {
    auto manager = new QNetworkAccessManager();
    auto request = QNetworkRequest(QUrl("https://example.com"));
    request.setHeader(QNetworkRequest::ContentTypeHeader, "text/plain");
    manager->post(request, "msg");
}

Removing the calls resolves the issue, even if the same Qt libraries are linked with the same build options. Bisecting didn't work as the commit for #21701 didn't include the statx symbol, which was included in the commit right after it. The commit with statx still has the problem though.

8051Enthusiast avatar Apr 29 '24 11:04 8051Enthusiast

Actually, scrap that about -sMAIN_MODULE=2, I don't know if removing it does anything or not because removing it causes the symbols to be minified and I can't grep for __emscripten_thread_crashed anymore. I added a stub statx and #21701 does indeed introduce the problem.

8051Enthusiast avatar Apr 29 '24 11:04 8051Enthusiast

Is there any news here? I am facing the same issue without -sMAIN_MODULE=2. In my tests with emsdk 3.1.59, builds with pthreads are broken at -Os and -O3; the other optimization levels work fine. Compile flags: -O3 -Wall -Wextra -Wundef -Werror -Wno-error=pthreads-mem-growth -sDISABLE_EXCEPTION_CATCHING=0 -pthread Link flags: --bind -sSTACK_SIZE=1MB -sALLOW_MEMORY_GROWTH=1 -sDISABLE_EXCEPTION_CATCHING=0 -sEXCEPTION_STACK_TRACES=1 -sENVIRONMENT=web,worker -sERROR_ON_UNDEFINED_SYMBOLS=1 -sEXPORTED_RUNTIME_METHODS=ccall -sNO_EXIT_RUNTIME=1 -sMAX_WEBGL_VERSION=2 -sMIN_WEBGL_VERSION=1 -sWASM=1 -sTEXTDECODER=0 -lidbfs.js -sWEBSOCKET_URL='wss://' -Wl,--shared-memory,--no-check-features

It seems while the functions are exported they are not visible in the worker context causing them to crash. This is a screenshot from MS Edge console showing one call when the Javascript context is set to top (main thread) and second call inside context em-pthread: image

dsamo avatar May 16 '24 14:05 dsamo

+1

Normally (on 3.1.56) I'd use the following linker flags:

-O3 -pthread -lembind --embind-emit-tsd interface.d.ts -s ENVIRONMENT='web,worker' -s MODULARIZE=1 -s ALLOW_MEMORY_GROWTH=1 -s WASM=1 -s USE_GLFW=3 -s USE_WEBGPU=1 -s NO_FILESYSTEM=1 -s NO_EXIT_RUNTIME=0 -s STANDALONE_WASM=0 -s EXIT_RUNTIME=1 -s ASSERTIONS=1 -s STACK_OVERFLOW_CHECK=2 -s MIN_WEBGL_VERSION=2 -s MAX_WEBGL_VERSION=2 -s DISABLE_EXCEPTION_CATCHING=0 -s PTHREAD_POOL_SIZE=12

I just upgraded to 3.1.59 and added -s STRICT because of https://github.com/emscripten-core/emscripten/issues/20580

I got:

wasm-ld: error: C:\Users\[user]\AppData\Local\Temp\tmpm1_qwmwllibemscripten_js_symbols.so: undefined symbol: _emscripten_run_callback_on_thread. Required by emscripten_set_fullscreenchange_callback_on_thread
wasm-ld: error: C:\Users\[user]\AppData\Local\Temp\tmpm1_qwmwllibemscripten_js_symbols.so: undefined symbol: _emscripten_run_callback_on_thread. Required by emscripten_set_wheel_callback_on_thread
wasm-ld: error: C:\Users\[user]\AppData\Local\Temp\tmpm1_qwmwllibemscripten_js_symbols.so: undefined symbol: _emscripten_run_callback_on_thread. Required by emscripten_get_element_css_size
wasm-ld: error: C:\Users\[user]\AppData\Local\Temp\tmpm1_qwmwllibemscripten_js_symbols.so: undefined symbol: _emscripten_run_callback_on_thread. Required by emscripten_set_resize_callback_on_thread

Switching to -O2 appears to solve the issue, however I then get:

image

Any help with this would be much appreciated.

andreamancuso avatar May 16 '24 18:05 andreamancuso

The _emscripten_run_callback_on_thread undefined symbol can be fixed by adding -lhtml5.js. This is needed because you added -sSTRICT which disables AUTO_JS_LIBRARIES by default.

sbc100 avatar May 16 '24 19:05 sbc100

I ended up solving my problem by adding __emscripten_thread_crashed and __embind_initialize_bindings to the EXPORTED_FUNCTIONS. But this further points out that there is a problem here because this should not be needed. A guess of mine is that DCE still eliminates these functions in higher optimizations because wasmExports['_emscripten_thread_crashed'] and wasmExports['_embind_initialize_bindings'] appear as undefined.

dsamo avatar May 17 '24 07:05 dsamo

Thank you.

I changed the linker flags to

-O3 -pthread -lembind --embind-emit-tsd interface.d.ts -lhtml5.js -lhtml5_webgl.js -s STRICT -s ENVIRONMENT='web,worker' -s MODULARIZE=1 -s ALLOW_MEMORY_GROWTH=1 -s WASM=1 -s USE_GLFW=3 -s USE_WEBGPU=1 -s NO_FILESYSTEM=1 -s NO_EXIT_RUNTIME=0 -s STANDALONE_WASM=0 -s EXIT_RUNTIME=1 -s ASSERTIONS=1 -s STACK_OVERFLOW_CHECK=2 -s MIN_WEBGL_VERSION=2 -s MAX_WEBGL_VERSION=2 -s DISABLE_EXCEPTION_CATCHING=0 -sPTHREAD_POOL_SIZE=12

Sadly I am still getting:

image

I also tried adding -s EXPORTED_FUNCTIONS=__emscripten_run_callback_on_thread but it didn't make a difference.

andreamancuso avatar May 17 '24 07:05 andreamancuso

You need to also add -lhtml5 (to link against the native version of libhtml5). Alternatively you should add -sAUTO_NATIVE_LIBRARIES.
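
As a rough illustration of the advice in this exchange, a build script could add the html5 libraries whenever -sSTRICT is present. This is a sketch only: the helper names are hypothetical, and the two library flags are just the ones mentioned in this thread (`-lhtml5.js` for the JS library, `-lhtml5` for the native one); other projects may need different libraries, or may prefer -sAUTO_NATIVE_LIBRARIES as suggested above.

```python
def is_strict(flags):
    """Detect STRICT in either the joined (-sSTRICT) or split (-s STRICT) form."""
    previous = None
    for flag in flags:
        if flag in ("-sSTRICT", "-sSTRICT=1"):
            return True
        if previous == "-s" and flag in ("STRICT", "STRICT=1"):
            return True
        previous = flag
    return False


def add_html5_libs(flags):
    """Return a copy of flags with the html5 libraries added for STRICT builds."""
    out = list(flags)
    if is_strict(out):
        # Library flags taken from the replies in this thread.
        for lib in ("-lhtml5.js", "-lhtml5"):
            if lib not in out:
                out.append(lib)
    return out
```

Non-strict flag lists are returned unchanged, so the helper can be applied unconditionally.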

sbc100 avatar Jun 18 '24 19:06 sbc100

According to my tests with the Godot Web threaded builds (https://github.com/godotengine/godot/issues/94725), wasm-metadce is stripping the _emscripten_thread_crashed symbol (along with a few others), and this causes the closure compiler to complain (this is how we found out):

shared:DEBUG: successfully executed /media/Storage/emsdk/upstream/bin/wasm-metadce --graph-file=/tmp/emscripten_temp/emcc_dce_graph_bzsx5gw4.json bin/godot.web.template_debug.wasm32.wasm -o bin/godot.web.template_debug.wasm32.wasm -g --mvp-features --enable-threads --enable-bulk-memory --enable-exception-handling --enable-multivalue --enable-mutable-globals --enable-reference-types --enable-sign-ext
building:DEBUG: saving debug copy /tmp/emscripten_temp/emcc-09-wasm-metadce.wasm
building:DEBUG: unused_imports: ['__assert_fail', '__syscall_fstat64']
building:DEBUG: unused_exports: ['_emscripten_run_callback_on_thread', '_emscripten_thread_crashed', 'emscripten_main_runtime_thread_id', 'emscripten_main_thread_process_queued_calls', 'emscripten_webgl_commit_frame']

Sadly, I was not yet able to make a reproducing test using the emscripten testing framework.
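
Until a proper test exists, a debug log like the one above could be scanned for runtime symbols that wasm-metadce reports as unused. The following is a sketch with assumptions: the function names are hypothetical, the log line format is taken verbatim from the output above (the list after `unused_exports:` is a Python-style list literal), and which symbols count as "required" is something the caller has to decide.

```python
import ast


def parse_unused_exports(log_text):
    """Return the list printed after 'unused_exports:' in the debug log, if any."""
    marker = "unused_exports:"
    for line in log_text.splitlines():
        if marker in line:
            # The log prints a Python-style list literal after the marker.
            return ast.literal_eval(line.split(marker, 1)[1].strip())
    return []


def wrongly_stripped(log_text, required):
    """Return the required export names that the log reports as unused."""
    unused = set(parse_unused_exports(log_text))
    return sorted(unused & set(required))
```

With the log above and `required = ["_emscripten_thread_crashed"]`, `wrongly_stripped` returns `["_emscripten_thread_crashed"]`, which is the symptom described in this comment.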

Faless avatar Jul 26 '24 15:07 Faless

Any update on this? I'm encountering the same issue

brianpmaher avatar Aug 09 '24 01:08 brianpmaher

@brianpmaher what error are you seeing exactly? i.e. which symbol are you seeing as missing? Can you share the full set of link flags you are using along with the version of emcc?

sbc100 avatar Aug 09 '24 22:08 sbc100

> According to my tests with the Godot Web threaded builds (godotengine/godot#94725), wasm-metadce is stripping the _emscripten_thread_crashed symbol (along with a few others), and this causes the closure compiler to complain (this is how we found out):
>
> shared:DEBUG: successfully executed /media/Storage/emsdk/upstream/bin/wasm-metadce --graph-file=/tmp/emscripten_temp/emcc_dce_graph_bzsx5gw4.json bin/godot.web.template_debug.wasm32.wasm -o bin/godot.web.template_debug.wasm32.wasm -g --mvp-features --enable-threads --enable-bulk-memory --enable-exception-handling --enable-multivalue --enable-mutable-globals --enable-reference-types --enable-sign-ext
> building:DEBUG: saving debug copy /tmp/emscripten_temp/emcc-09-wasm-metadce.wasm
> building:DEBUG: unused_imports: ['__assert_fail', '__syscall_fstat64']
> building:DEBUG: unused_exports: ['_emscripten_run_callback_on_thread', '_emscripten_thread_crashed', 'emscripten_main_runtime_thread_id', 'emscripten_main_thread_process_queued_calls', 'emscripten_webgl_commit_frame']
>
> Sadly, I was not yet able to make a reproducing test using the emscripten testing framework.

Which version of emscripten are you using? I would hope that #21701 might have fixed this, and that change shipped in 3.1.59

sbc100 avatar Aug 09 '24 22:08 sbc100

> Which version of emscripten are you using? I would hope that #21701 might have fixed this, and that change shipped in 3.1.59

@sbc100 we found out with 3.1.64, but also confirmed the error in 3.1.63 and 3.1.62 (minimum required version for godot builds with dynlink).

Faless avatar Aug 10 '24 07:08 Faless

@sbc100 one thing to note is that in Godot's case, we only see the error with regular thread builds (i.e. not MAIN_MODULE/SIDE_MODULE builds).

I suspect this is because we build dynlink builds with MAIN_MODULE=1 and EXPORT_ALL=1, so in that case wasm-metadce doesn't strip anything.

Faless avatar Aug 10 '24 10:08 Faless

@sbc100 sorry I should have been more specific.

Emscripten version 3.1.63

I was able to fix the __emscripten_thread_crashed missing symbol, but now getting:

TypeError: wasmPromiseResolve is not a function
    at handleMessage

This is called as wasmPromiseResolve(msgData["wasmModule"]) from the wasm worker code path.

Here are the flags:

-std=c++17 \
-sSTRICT \
-fwasm-exceptions \
-sMIN_WEBGL_VERSION=2 \
-sMAX_WEBGL_VERSION=2 \
-sFORCE_FILESYSTEM=1 \
-sWASM=1 \
-sALLOW_MEMORY_GROWTH=1 \
-sNO_EXIT_RUNTIME=1 \
-sINITIAL_MEMORY=512mb \
-sSTACK_SIZE=64mb \
-sDEFAULT_PTHREAD_STACK_SIZE=65536 \
-sMALLOC=mimalloc \
-sFETCH=1 \
-sMODULARIZE=1 \
-sEXPORT_NAME=App \
-sEXPORT_ES6 \
-pthread \
-sPTHREAD_POOL_SIZE=11 \
-sENVIRONMENT='web,worker' \
-sGL_ENABLE_GET_PROC_ADDRESS \
-sAUTO_NATIVE_LIBRARIES=1 \
-sAUTO_JS_LIBRARIES=1 \
-sDEFAULT_TO_CXX=1 \
-sUSE_GLFW=2 \
-sEXPORTED_FUNCTIONS="['__emscripten_thread_crashed','__embind_initialize_bindings']" \
-sEXPORTED_RUNTIME_METHODS="['wasmMemory','wasmPromiseResolve']" \
-sASSERTIONS

The -sASSERTIONS is temporary while trying to diagnose the issues here

brianpmaher avatar Aug 10 '24 16:08 brianpmaher

@brianpmaher you should not need to add __emscripten_thread_crashed or __embind_initialize_bindings to EXPORTED_FUNCTIONS. Those are internal things that emscripten should absolutely handle itself. Something very odd must be going on with your build if you are finding that you need to explicitly add them.

Perhaps you could share some kind of reproducer?

sbc100 avatar Aug 11 '24 17:08 sbc100

@brianpmaher I was able to reproduce your issue with wasmPromiseResolve is not a function. It was another side effect of building with -sSTRICT. I have a fix in flight but you can do -sINCOMING_MODULE_JS_API=instantaiteWasm in the meantime.

sbc100 avatar Aug 11 '24 17:08 sbc100

Thanks, that worked!

I removed the EXPORTED_FUNCTIONS and still got the same error. Adding -sINCOMING_MODULE_JS_API=instantiateWasm (note there was a typo in your code block there: s/instantaite/instantiate/) fixed it, and wasmPromiseResolve is now defined.

I'm working through another issue with thread pool size not being large enough, but I think that's unrelated now.

brianpmaher avatar Aug 11 '24 21:08 brianpmaher