How should external tools find emscripten?
There are several external tools that want to be able to run emscripten (usually in order to run emcc/em++, etc). The ones that I know about are:
- cmake : https://github.com/emscripten-core/emscripten/blob/ea3209ae0427a8223e7ff108f84b55d05986b8e1/cmake/Modules/Platform/Emscripten.cmake#L63
- scons : https://github.com/emscripten-core/emscripten/blob/ea3209ae0427a8223e7ff108f84b55d05986b8e1/tools/scons/site_scons/site_tools/emscripten/emscripten.py#L15
- qtcreator: https://github.com/qt-creator/qt-creator/blob/ab36004fdc92ed0bc20d141d61c31237fbbfd8e4/src/plugins/webassembly/webassemblytoolchain.cpp#L65
The method that scons used to use, and that qtcreator currently uses, is to open the ~/.emscripten config file (which is in python format), somehow parse out the value of EMSCRIPTEN_ROOT, and use that.
Note that the EMSCRIPTEN_ROOT key is completely useless for emscripten itself since emcc and all the other tools in emscripten already know where they are when they are run. In fact the presence of this key can be confusing and contradictory (what does it mean if you run emcc and the EMSCRIPTEN_ROOT points to a different place?).
The method that cmake uses is to look for the EMSCRIPTEN environment variable. The scons tools were changed to be purely environment-variable-based a while back: #7249. They now look for EMSCRIPTEN_ROOT in the environment.
qtcreator looks like it's parsing the config file as an ini file, which I guess works in many cases.
As part of #9543 I would like to avoid having external tools try to parse the emscripten config file.
We should pick a single, recommended way to locate emscripten and implement that in all the tools. The options I see are:
1. Look for $EMSCRIPTEN or $EMSCRIPTEN_ROOT in the environment.
2. Look for emcc in the $PATH environment variable.
3. Create a new ~/.emscripten_root file that contains just a single string which is the emscripten directory.
We could also do (1) followed by (2), which I think is my preferred method.
The only downside of (1) or (2) is that it requires the user's environment variables to be changed.
With emsdk we already ask the user to run ./emsdk_env in order to do this, or to modify their startup files.
I like 1) followed by 2) also; similar to setting CC or CXX it's easy to override when customizing or using an embedded environment, but doesn't need to be set at all when the default command is in the PATH.
The file-based method 3) can't be overridden on a project-by-project basis, and wouldn't get set for each user by system-wide packaging methods.
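For illustration, option 1 followed by option 2 could be sketched roughly as below. This is only a sketch under assumptions: the function name find_emscripten_root is hypothetical, and treating the directory that contains emcc as the emscripten root assumes a layout like the one emsdk produces, which packagers that put a wrapper or symlink in the PATH may not follow.

```python
import os
import shutil
from typing import Mapping, Optional


def find_emscripten_root(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return a best-guess emscripten directory, or None if nothing is found.

    Hypothetical helper: the name and the exact lookup order are assumptions.
    """
    env = os.environ if env is None else env

    # Option 1: an explicit environment override is checked first, so a
    # project or an embedded environment can always win over the default.
    for var in ("EMSCRIPTEN", "EMSCRIPTEN_ROOT"):
        value = env.get(var)
        if value and os.path.isdir(value):
            return os.path.abspath(value)

    # Option 2: fall back to whichever emcc comes first in the PATH.
    # Assumption: emcc lives directly in the emscripten directory.
    emcc = shutil.which("emcc", path=env.get("PATH"))
    if emcc:
        return os.path.dirname(os.path.abspath(emcc))
    return None
```

Calling find_emscripten_root() with no arguments uses the current process environment; passing a mapping makes the lookup easy to override per project.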
This issue has been automatically marked as stale because there has been no activity in the past year. It will be closed automatically if no further activity occurs in the next 30 days. Feel free to re-open at any time if this issue is still relevant.
I think this has still not been fully resolved. Tools like qtcreator and others are still a little confused about how to find emscripten correctly.
I see this issue still hasn't been resolved. Our little organization has a similar need. We specifically need to locate the path to the webidl_binder tool. May I suggest that EMSCRIPTEN_ROOT be made an official environment variable and be exported from emsdk_env? Other distributions could then be advised to also set it as part of their installers.
@agnickolov we should really put webidl_binder and file_packager into the PATH. But for now you can find webidl_binder by looking up emcc in the PATH, e.g.
import os
import shutil
emcc = shutil.which('emcc')
webidl_binder = os.path.join(os.path.dirname(emcc), 'tools', 'webidl_binder')
It's a bit more complicated for our use case than that actually. We use vcpkg with Emscripten. Both rely on a CMake toolchain file. In order to get them to work together, we create our own CMake toolchain file that includes both. For that we need to locate the Emscripten toolchain file, hence the need for the Emscripten root. I.e. we need the value of the actual EMSCRIPTEN_ROOT variable, which we reference within our CMake toolchain file. The above suggestion should work for setting it up based on the location of emcc, at least on Unix-style OSes. On Windows (w/o WSL) we'll need to do something slightly different, but this is a starting point. I should also point out that we don't only use emsdk. On Windows we use chocolatey to set up Emscripten. So the solution should work with any Emscripten distribution, not only the official distribution via emsdk. Having EMSCRIPTEN_ROOT as an officially supported environment variable will help tremendously, but we can implement workarounds and set it up ourselves instead.
Can you not derive the emscripten root from looking up emcc in the PATH?
This seems better than asking all emscripten users to not only put emcc in the PATH but also set EMSCRIPTEN_ROOT, and importantly it avoids the issue of what to do when EMSCRIPTEN_ROOT conflicts with the emcc in the PATH. In the past when we used EMSCRIPTEN_ROOT it could be confusing if somebody set EMSCRIPTEN_ROOT to one value but then ran emcc from a different location. Using the PATH as the single source of truth seems like a good solution to me.
Perhaps I'm not understanding something here. How would having EMSCRIPTEN_ROOT help tremendously in your case? Is looking up tools in the PATH hard in some place or system that I'm overlooking?
Unfortunately, this approach doesn't work with other distributions. I'm looking at the Homebrew installation on MacOS in particular, but I imagine others may use similar tricks as well. For starters, emcc on the path is a symbolic link. After following through the link, it leads to a wrapper script in a different folder that sets up Python and invokes the real script. I have no way of discovering the location of the real emcc that way.
I understand you don't officially support other distributions, but in the real world we need to deal with such issues...
I see. That sounds like all the more reason to ship webidl_binder and file_packager alongside emcc (i.e. in the PATH).
Another option could be to add the emscripten dir to the --version output (like clang does with its InstalledDir):
$ clang -v
Debian clang version 16.0.6 (19)
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin <-- we could add EmscriptenDir here for emcc --version
Found candidate GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/13
Selected GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/13
Candidate multilib: .;@m64
Candidate multilib: 32;@m32
Candidate multilib: x32;@mx32
Selected multilib: .;@m64
> Unfortunately, this approach doesn't work with other distributions. I'm looking at the Homebrew installation on MacOS in particular, but I imagine others may use similar tricks as well. For starters, emcc on the path is a symbolic link. After following through the link, it leads to a wrapper script in a different folder that sets up Python and invokes the real script.
Oh interesting. Is the emcc launch script (the target of the symlink) not always in the same directory as emcc.py, i.e. the emscripten root? The emcc script that we ship must always live alongside emcc.py. Is homebrew shipping a different wrapper script?
It is, but that script is not on the path. A wrapper script is on the path with the following content:
#!/bin/bash
PYTHON="/usr/local/opt/python@3.12/bin/python3.12" exec "/usr/local/Cellar/emscripten/3.1.59/libexec/emcc" "$@"
This is of course what was generated for my local installation, the versions would differ as packages are updated. This script is located under /usr/local/Cellar/emscripten/3.1.59/bin and a symlink is under /usr/local/bin, from where it's launched, as only that directory is on the path. This is how Homebrew on MacOS packages it at least. I haven't investigated Chocolatey on Windows yet.
I did solve the problem with a rather ugly collection of heuristics in a shell script, but I'm not particularly happy with it. I won't bother you with the details, but it involves traversing the directories at the emcc folder and its parent until I can locate webidl_binder. At least that tool is not available from the path and there is no wrapper script for it either. This works for both emsdk and Homebrew.
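A heuristic of this kind might look roughly like the sketch below. It is not the shell script described above, only an illustration under assumptions: the helper name find_tool_near is hypothetical, and it assumes the tool sits at tools/webidl_binder somewhere under the resolved emcc directory or that directory's parent.

```python
import os
from typing import Optional


def find_tool_near(emcc_path: str,
                   relative_tool: str = os.path.join("tools", "webidl_binder")) -> Optional[str]:
    """Search around the resolved emcc location for a tool (hypothetical helper)."""
    # Follow symlinks first: the PATH entry may only be a link to the real file.
    emcc_dir = os.path.dirname(os.path.realpath(emcc_path))

    # Look under the emcc directory first, then under its parent.
    for base in (emcc_dir, os.path.dirname(emcc_dir)):
        for dirpath, _dirnames, _filenames in os.walk(base):
            candidate = os.path.join(dirpath, relative_tool)
            if os.path.isfile(candidate):
                return candidate
    return None
```

Walking the parent directory can be slow or pick up an unrelated match if emcc is installed somewhere broad such as a shared bin directory, which is one reason this remains a workaround rather than a recommended lookup.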
I like the suggestion about EmscriptenDir being reported from emcc --version BTW. I can then parse the output and store that into EMSCRIPTEN_ROOT or any other environment variable I need. This will need to be maintained going forward, however, e.g. if the layout changes and the tools are no longer at the root of the installation in the future, this should still report the root.
The reason I need this approach, again, is that I cannot use emcmake and need to supply a custom CMake toolchain to CMake that includes the Emscripten CMake toolchain.
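If a directory line were added to the version output, a consumer could parse it along these lines. This is a sketch only: EmscriptenDir is the name proposed in this thread and is not something emcc is known to print, the helper name parse_reported_dir is hypothetical, and the sample emcc text is invented for the example. The same parser is shown against clang's existing InstalledDir line.

```python
from typing import Optional


def parse_reported_dir(output: str, key: str) -> Optional[str]:
    """Return the value of a 'Key: value' line in tool output, if present."""
    prefix = key + ":"
    for line in output.splitlines():
        stripped = line.strip()
        if stripped.startswith(prefix):
            # Everything after the first 'Key:' is the value; empty means missing.
            return stripped[len(prefix):].strip() or None
    return None


# Real-world shape: the InstalledDir line from the clang output quoted above.
clang_sample = "Target: x86_64-pc-linux-gnu\nInstalledDir: /usr/bin\n"
print(parse_reported_dir(clang_sample, "InstalledDir"))

# Hypothetical shape: what a proposed EmscriptenDir line might look like.
emcc_sample = "emcc (Emscripten) 3.1.59\nEmscriptenDir: /opt/emscripten\n"
print(parse_reported_dir(emcc_sample, "EmscriptenDir"))
```

In a real script the text would come from running the tool, for example with subprocess.run and captured output, and the result could then be exported as EMSCRIPTEN_ROOT or any other variable a toolchain file expects.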
I've been using emcc --cflags to get the sysroot. It prints --sysroot=/opt/homebrew/Cellar/emscripten/3.1.61/libexec/cache/sysroot so the CMake toolchain file can be found like so (at least on Mac and Linux):
# This value will be `/opt/homebrew/Cellar/emscripten/3.1.61/libexec/cache/sysroot` on a Homebrew installation,
# `/root/emsdk/upstream/emscripten/cache/sysroot` on a manual installation.
EMSCRIPTEN_SYSROOT=$(emcc --cflags | grep -o -- "--sysroot=[^ ]*" | cut -d= -f2-)
# This value will be `/opt/homebrew/Cellar/emscripten/3.1.61/libexec` on a Homebrew installation,
# `/root/emsdk/upstream/emscripten` on a manual installation.
EMSCRIPTEN_ROOT=$(dirname "$(dirname "$EMSCRIPTEN_SYSROOT")")
CMAKE_TOOLCHAIN_FILE="$EMSCRIPTEN_ROOT/cmake/Modules/Platform/Emscripten.cmake"
If that works for your situation then great, but remember that it will only work if the cache directory happens to live inside the emscripten directory. That is the default, so it should work in most cases, but the cache location is configurable.
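The same derivation can be written so that it refuses to guess when the cache has been moved. The sketch below makes several assumptions: the function names are hypothetical, the check for a path ending in cache/sysroot is only a heuristic for the default layout, and POSIX-style paths are used to match the Mac and Linux cases above.

```python
import shlex
from pathlib import PurePosixPath
from typing import Optional

SYSROOT_FLAG = "--sysroot="


def root_from_cflags(cflags: str) -> Optional[PurePosixPath]:
    """Derive the emscripten directory from `emcc --cflags` output, or None."""
    for arg in shlex.split(cflags):
        if arg.startswith(SYSROOT_FLAG):
            sysroot = PurePosixPath(arg[len(SYSROOT_FLAG):])
            # Heuristic: only trust the default <root>/cache/sysroot layout.
            if sysroot.name == "sysroot" and sysroot.parent.name == "cache":
                return sysroot.parent.parent
            return None
    return None


def toolchain_file(root: PurePosixPath) -> PurePosixPath:
    """Path of the CMake toolchain file relative to the emscripten directory."""
    return root / "cmake" / "Modules" / "Platform" / "Emscripten.cmake"
```

Because a relocated cache could still happen to end in cache/sysroot, it seems prudent to also check that the returned toolchain file actually exists before handing it to CMake.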