Avoid installing test dependencies in system python path. NFC
Instead we use pip to install the dev dependencies in the out/python_deps directory.
This means that the emscripten compiler itself does not have access to them,
only the test code, which explicitly adds this path.
This means we can perform the installation as part of the bootstrap script (just like we already do for node dev dependencies) which in turn means we can consistently rely on dev dependencies to be available in test code (without needing to include an opt out mechanism).
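As a rough illustration of the approach described above (not the actual bootstrap code), the install and the opt-in path setup could look something like this. The `requirements-dev.txt` file name and all helper names are assumptions made for the sketch; only `pip install --target` and the out/python_deps location come from the description.

```python
import os
import subprocess
import sys


def deps_dir(root):
    # Location described in this PR: <emscripten root>/out/python_deps
    return os.path.join(root, 'out', 'python_deps')


def pip_install_command(root, requirements='requirements-dev.txt'):
    # Hypothetical helper: build the pip command that installs the dev
    # dependencies into out/python_deps rather than the system python path.
    # The requirements file name is an assumption.
    return [sys.executable, '-m', 'pip', 'install',
            '--target', deps_dir(root),
            '-r', os.path.join(root, requirements)]


def install_dev_deps(root):
    # Would be called from the bootstrap step.
    subprocess.check_call(pip_install_command(root))


def add_dev_deps_to_path(root, path=None):
    # Would be called only from test code; the compiler never does this.
    path = sys.path if path is None else path
    d = deps_dir(root)
    if d not in path:
        path.insert(0, d)
    return d
```

Because the path is added explicitly and only by the test code, nothing is written to the system site-packages and the compiler's own imports are unaffected.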
Previously I have been able to test end user setup by leaving out installing the dev dependencies.
But because bootstrap is mandatory, and after this PR it will install python dev dependencies, it looks like I will need to develop some kind of delete step to remove the dev dependency packages from Python for testing and shipping.
This would be a divergence between node.js and python, since we don't install node.js dev dependency packages via bootstrap either?
Also, on Linux e.g. on Debian where pip install is forbidden system-wide, won't Emscripten stop working for end users unless one operates Emscripten from inside a python virtualenv sandbox?
Previously I have been able to test end user setup by leaving out installing the dev dependencies.
I designed this change specifically with your workflow in mind.
IIRC the reason you didn't want to install the dev (test) dependencies was that you were worried about emscripten accidentally depending on them. With this system that risk has been removed. The emscripten compiler code itself cannot see or use these dependencies.
But because bootstrap is mandatory, and after this PR it will install python dev dependencies, it looks like I will need to develop some kind of delete step to remove the dev dependency packages from Python for testing and shipping.
This would be a divergence between node.js and python, since we don't install node.js dev dependency packages via bootstrap either?
The bootstrap script does install the node dev dependencies. The place where we use --omit=dev to avoid installing the dev node modules is tools/install.py.
Also, on Linux e.g. on Debian where pip install is forbidden system-wide, won't Emscripten stop working for end users unless one operates Emscripten from inside a python virtualenv sandbox?
Indeed, and we are not installing anything system-wide. Everything is being put in emscripten/out.
Another advantage of this PR is that we no longer depend on installing anything in the system python path.
I re-titled, and updated the description to be a little bit more explicit about the intent here.
Another potential upside here is that we can add new test dependencies without affecting the distributed version of emscripten, or introducing new dependencies for our users.
For example, we could start using new packages in our python test code that could improve the testing experience. Right now we are artificially limited in the packages we can use in our test framework because we want to avoid new product dependencies. This change makes a clear separation and allows us to move forward without risking new deps for our users.
I designed this change specifically with your workflow in mind.
Thanks, that is much appreciated.
With this system that risk has been removed. The emscripten compiler code itself cannot see or use these dependencies.
I suppose the question I have is - how do we test and verify that this will remain to be the case? Since it is not possible to run tests without running bootstrap, then it is no longer possible to launch the test runner in a mode that does not have these dependencies installed?
What I would like is to keep running the Emscripten test suite in a mode that does not require dev dependencies. Those tests prove that Emscripten does not depend on unwanted packages.
I.e. even if we would statically say that Emscripten does not depend on dev packages, it would not automatically mean that we would be testing to verify that it does not depend on those packages? So we wouldn't/couldn't catch an error if that assumption regresses?
The bootstrap script does install the node dev dependencies.
That does not (fortunately) seem to be the case. Or at least if one is installing via emsdk. It is important for us to be able to do a bootstrap that does not install dev dependencies.
C:\emsdk\emscripten\main>npm ci --production
npm warn config production Use `--omit=dev` instead.
npm warn deprecated [email protected]: This module is not supported, and leaks memory. Do not use it. Check out lru-cache if you want a good and tested way to coalesce async requests by a key value, which is much more comprehensive and powerful.
npm warn deprecated [email protected]: Glob versions prior to v9 are no longer supported
added 215 packages, and audited 216 packages in 3s
17 packages are looking for funding
run `npm fund` for details
1 low severity vulnerability
To address all issues, run:
npm audit fix
Run `npm audit` for details.
C:\emsdk\emscripten\main>npm list
main@ C:\emsdk\emscripten\main
+-- @babel/[email protected]
+-- @babel/[email protected]
+-- @babel/[email protected]
+-- UNMET DEPENDENCY @eslint/eslintrc@^3.3.1
+-- UNMET DEPENDENCY @eslint/js@^9.36.0
+-- [email protected]
+-- UNMET DEPENDENCY es-check@^9.4.4
+-- UNMET DEPENDENCY eslint-config-prettier@^10.1.8
+-- UNMET DEPENDENCY eslint@^9.36.0
+-- UNMET DEPENDENCY globals@^16.4.0
+-- [email protected]
+-- [email protected]
+-- UNMET DEPENDENCY prettier@^3.6.2
+-- UNMET DEPENDENCY rollup@^4.52.3
+-- UNMET DEPENDENCY [email protected]
+-- UNMET DEPENDENCY typescript@^5.9.3
+-- UNMET DEPENDENCY vite@^7.1.7
+-- UNMET DEPENDENCY webpack-cli@^6.0.1
+-- UNMET DEPENDENCY webpack@^5.102.0
`-- UNMET DEPENDENCY ws@^8.18.3
npm error code ELSPROBLEMS
npm error missing: @eslint/eslintrc@^3.3.1, required by main@
npm error missing: @eslint/js@^9.36.0, required by main@
npm error missing: es-check@^9.4.4, required by main@
npm error missing: eslint-config-prettier@^10.1.8, required by main@
npm error missing: eslint@^9.36.0, required by main@
npm error missing: globals@^16.4.0, required by main@
npm error missing: prettier@^3.6.2, required by main@
npm error missing: rollup@^4.52.3, required by main@
npm error missing: [email protected], required by main@
npm error missing: typescript@^5.9.3, required by main@
npm error missing: vite@^7.1.7, required by main@
npm error missing: webpack-cli@^6.0.1, required by main@
npm error missing: webpack@^5.102.0, required by main@
npm error missing: ws@^8.18.3, required by main@
npm error A complete log of this run can be found in: C:\Users\clb\AppData\Local\npm-cache\_logs\2025-11-06T21_22_53_946Z-debug-0.log
C:\emsdk\emscripten\main>bootstrap
Up-to-date: npm packages
Up-to-date: create entry points
Up-to-date: git submodules
C:\emsdk\emscripten\main>npm list
main@ C:\emsdk\emscripten\main
+-- @babel/[email protected]
+-- @babel/[email protected]
+-- @babel/[email protected]
+-- UNMET DEPENDENCY @eslint/eslintrc@^3.3.1
+-- UNMET DEPENDENCY @eslint/js@^9.36.0
+-- [email protected]
+-- UNMET DEPENDENCY es-check@^9.4.4
+-- UNMET DEPENDENCY eslint-config-prettier@^10.1.8
+-- UNMET DEPENDENCY eslint@^9.36.0
+-- UNMET DEPENDENCY globals@^16.4.0
+-- [email protected]
+-- [email protected]
+-- UNMET DEPENDENCY prettier@^3.6.2
+-- UNMET DEPENDENCY rollup@^4.52.3
+-- UNMET DEPENDENCY [email protected]
+-- UNMET DEPENDENCY typescript@^5.9.3
+-- UNMET DEPENDENCY vite@^7.1.7
+-- UNMET DEPENDENCY webpack-cli@^6.0.1
+-- UNMET DEPENDENCY webpack@^5.102.0
`-- UNMET DEPENDENCY ws@^8.18.3
npm error code ELSPROBLEMS
npm error missing: @eslint/eslintrc@^3.3.1, required by main@
npm error missing: @eslint/js@^9.36.0, required by main@
npm error missing: es-check@^9.4.4, required by main@
npm error missing: eslint-config-prettier@^10.1.8, required by main@
npm error missing: eslint@^9.36.0, required by main@
npm error missing: globals@^16.4.0, required by main@
npm error missing: prettier@^3.6.2, required by main@
npm error missing: rollup@^4.52.3, required by main@
npm error missing: [email protected], required by main@
npm error missing: typescript@^5.9.3, required by main@
npm error missing: vite@^7.1.7, required by main@
npm error missing: webpack-cli@^6.0.1, required by main@
npm error missing: webpack@^5.102.0, required by main@
npm error missing: ws@^8.18.3, required by main@
npm error A complete log of this run can be found in: C:\Users\clb\AppData\Local\npm-cache\_logs\2025-11-06T21_23_05_251Z-debug-0.log
I suppose the question I have is - how do we test and verify that this will remain to be the case? Since it is not possible to run tests without running bootstrap, then it is no longer possible to launch the test runner in a mode that does not have these dependencies installed?
The idea is that the test runner has a hard dependency on these dev packages, but emscripten itself does not.
I.e. even if we would statically say that Emscripten does not depend on dev packages, it would not automatically mean that we would be testing to verify that it does not depend on those packages? So we wouldn't/couldn't catch an error if that assumption regresses?
The emscripten compiler itself is never run with python_deps in its PYTHONPATH, so trying to import psutil, for example, within the compiler itself would simply fail.
I suppose it's possible that someone could defeat that by adding sys.path.append to emcc... but that seems very contrived. If you like, I can add an assertion to emcc.py that enforces that this path is not in sys.path? We could even assert that sys.path contains no paths that fall under the emscripten project directory?
I added an assertion to emcc.py to ensure that the compiler itself doesn't run with those python libs in its path.
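For readers following along, a check of this kind might look roughly like the sketch below. This is not the code that landed in emcc.py; the function names are made up for illustration, and it only covers the narrower check (nothing under out/python_deps on sys.path).

```python
import os
import sys


def paths_under(root, paths):
    # Return the entries of `paths` that resolve to `root` or to something
    # inside it. Empty entries (the current directory) are ignored.
    root = os.path.realpath(root)
    found = []
    for p in paths:
        if not p:
            continue
        rp = os.path.realpath(p)
        if rp == root or rp.startswith(root + os.sep):
            found.append(p)
    return found


def assert_no_dev_deps(project_root, paths=None):
    # Hypothetical guard: fail loudly if the dev dependency directory is
    # visible on the module search path of the compiler process.
    paths = sys.path if paths is None else paths
    deps = os.path.join(project_root, 'out', 'python_deps')
    leaked = paths_under(deps, paths)
    assert not leaked, f'dev dependencies visible to the compiler: {leaked}'
```

Comparing resolved paths with a trailing separator avoids false positives for sibling directories that merely share a prefix, such as a directory named `python_deps_other`.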
The idea is that the test runner has a hard dependency on these dev packages
I guess this is the part that feels off to me. Currently the test runner has no hard dependency on dev packages, and so I can test the majority of the test suite using the end user python configuration.
If we require dev packages to be installed when testing, then we lose the guarantee that we are testing the end user configuration.
In our packaging process on the CI, what I am currently doing is:
1. use emsdk install to install Emscripten. (emsdk runs bootstrap)
2. adjust the directory structure of the installed emsdk artifacts a bit to adhere to the Unity directory structure.
3. delete some files we don't want to ship to end users (mostly files from the third_party/ directory that have a separate license)
4. with emsdk active, run the test suite (ensures that my adjusted dir layout and filtered files still retain a working copy)
5. if tests pass, zip up the emsdk directory as the Emscripten SDK artifact to Unity users.
It is beneficial to do this testing without dev dependencies installed. That ensures that the tested Python setup matches the production one.
I suppose I could introduce a second delete pass between steps 4 and 5 to delete the dev dependencies left over from running the Python unit tests (nuke the out/ dir). However, it worries me that the unit test runner installs Python packages before running the tests, so we will have tested a different Python setup than the one Emscripten uses.
The emscripten compiler itself is never run with python_deps in its PYTHONPATH, so trying to import psutil, for example, within the compiler itself would simply fail.
What I worry is that not all tests start with invoking production emcc. The majority of test checks are run on the test Python side, and not via launching the production Python interpreter. Some test code (the recent binary encode test comes to mind) does not launch emcc at all, so it might be running a test only in the context of the dev packages.
Another way an issue might occur is that if the test Python has a PYTHONPATH, a child tool it launches (not necessarily just emcc, but maybe another .py subtool) might inherit the PYTHONPATH from the env, so the subprocess ends up using the dev dependencies.
Installing the python dev dependencies to out/python_deps does sound good, since it will demarcate dev packages from the production Python better, though could we avoid making this installation a hard dependency?
What I worry is that not all tests start with invoking production emcc. The majority of test checks are run on the test Python side, and not via launching the production Python interpreter.
I'm not sure what you mean by this. 99% of the tests run emcc from the outside, i.e. black box testing. That is the only way that any of the tests compile anything at all. They do things like run_process(EMCC... which launches a completely new instance of python. Not only that, it gets launched by the .bat / .sh launcher scripts, so it doesn't even use the current sys.executable used to run the tests.
Another way an issue might occur is that if the test Python has a PYTHONPATH, a child tool it launches (not necessarily just emcc, but maybe another .py subtool) might inherit the PYTHONPATH from the env, so the subprocess ends up using the dev dependencies.
There are two reasons I think this cannot happen:
- We are not setting PYTHONPATH anywhere. Modifications to sys.path do not affect PYTHONPATH or subprocesses.
- Our .bat and .sh launchers for all our python tools run python -E, which explicitly ignores PYTHONPATH, i.e. the compiler itself always runs kind of hermetically anyway.
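Both points can be checked with a small standalone script. The sketch below is not part of the PR and its helper names are made up; it spawns child interpreters and reports whether a marker directory shows up in the child's sys.path under three conditions: after a parent-side sys.path.append, with PYTHONPATH set, and with PYTHONPATH set plus the -E flag.

```python
import os
import subprocess
import sys
import tempfile

# Child program: report whether the directory given as argv[1] is on sys.path.
CHILD = ('import os, sys; target = os.path.realpath(sys.argv[1]); '
         'print(any(os.path.realpath(p) == target for p in sys.path if p))')


def child_sees(marker, flags=(), env=None):
    # Launch a fresh interpreter and ask it whether it can see `marker`.
    cmd = [sys.executable, *flags, '-c', CHILD, marker]
    out = subprocess.run(cmd, env=env, capture_output=True, text=True,
                         check=True).stdout
    return out.strip() == 'True'


def demo():
    marker = os.path.realpath(tempfile.mkdtemp())
    base_env = {k: v for k, v in os.environ.items() if k != 'PYTHONPATH'}
    with_pythonpath = dict(base_env, PYTHONPATH=marker)
    results = {}
    # 1. Modifying sys.path in the parent does not propagate to children.
    sys.path.append(marker)
    try:
        results['sys.path append'] = child_sees(marker, env=base_env)
    finally:
        sys.path.remove(marker)
    # 2. PYTHONPATH in the environment is inherited by a plain child.
    results['PYTHONPATH'] = child_sees(marker, env=with_pythonpath)
    # 3. The -E flag makes the child ignore PYTHONPATH.
    results['PYTHONPATH with -E'] = child_sees(marker, flags=('-E',),
                                               env=with_pythonpath)
    os.rmdir(marker)
    return results


if __name__ == '__main__':
    for name, seen in demo().items():
        print(f'{name}: child sees marker = {seen}')
```

Only the middle case should report the marker as visible, which is the scenario the -E flag in the launchers guards against.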
It is beneficial to do this testing without dev dependencies installed. That ensures that the tested Python setup matches the production one.
It seems like we already have a hard dependency on psutil in test code in test/browser_common.py... how does your CI handle that one?
5. if tests pass, zip up the emsdk directory as the Emscripten SDK artifact to Unity users.
I think you certainly want to remove the out/ directory before creating your release.
It seems like we already have a hard dependency on psutil in test code in test/browser_common.py... how does your CI handle that one?
Both pywin32 and psutil are production dependencies as per https://github.com/emscripten-core/emsdk/blob/7b4e60e4bfcba326025e373024369eaa9904af55/scripts/update_python.py#L77-L78 .
This has been the case for a very long time because the emrun tool requires the process scanning to be able to track browser shutdown. (and now more recently we extended that requirement to the parallel browser test harness)
I'm not sure what you mean by this. 99% of the tests run emcc from the outside
And the remaining 1% are unlabeled, so we don't have a good grasp of which of them would be subject to a possible regression.
Currently the known discrepancies are explicitly tracked and flagged with the EMTEST_SKIP_PYTHON_DEV_PACKAGES env. var.
By construction, we know that the tests will run on the very same Python setup that goes out to the end users.
The two points you mention require tacit knowledge. For example, I did not know that the -E flag provides the safeguard that you mention. So if a refactor of that slipped through the cracks in review, nothing would flag this scenario.
It looks, for example, like this test is not passing the -E flag. If someone's refactor set os.environ['PYTHONPATH'], that subprocess call could inherit it. So someone "needs to know" to not do that.
It seems simpler to set up Python the way that end users will have it, and then test that? Then if there are dev packages that add extra, those would be managed with EMTEST_SKIP_PYTHON_DEV_PACKAGES label to be able to see what the discrepancy is.
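To make the labelling idea concrete, here is a minimal sketch of how such a flag could gate individual tests. Only the EMTEST_SKIP_PYTHON_DEV_PACKAGES variable name comes from this discussion; the decorator name and the way the value is interpreted are assumptions, and the real test framework may do this differently.

```python
import os
import unittest


def requires_dev_package(name):
    # Hypothetical decorator: skip the test when the environment asks for
    # dev packages to be left out. Treating '' and '0' as "not set" is an
    # assumption made for this sketch.
    flag = os.environ.get('EMTEST_SKIP_PYTHON_DEV_PACKAGES', '0')
    skip = flag not in ('', '0')
    return unittest.skipIf(
        skip, f'{name} is a dev package and EMTEST_SKIP_PYTHON_DEV_PACKAGES is set')


class ExampleTest(unittest.TestCase):
    @requires_dev_package('some_dev_package')
    def test_needs_dev_package(self):
        # A real test would import and use the dev-only package here.
        self.assertTrue(True)

    def test_end_user_configuration(self):
        # Runs regardless of the flag, i.e. on the end user python setup.
        self.assertTrue(True)
```

With this shape, every test that steps outside the end user configuration carries an explicit label, so the set of discrepancies can be listed by searching for the decorator.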
Both pywin32 and psutil are production dependencies as per https://github.com/emscripten-core/emsdk/blob/7b4e60e4bfcba326025e373024369eaa9904af55/scripts/update_python.py#L77-L78 .
But on linux we won't have python package in emsdk. How does it work on linux? I guess the OS python always supplies this?
The emrun usage of psutil is specifically guarded so that it can run on systems without psutil.
By construction, we know that the tests will run on the very same Python setup that goes out to the end users.
But this construction is very new and only exists in your CI. The emscripten CI on github and the emscripten-releases bots have always run with dev dependencies installed.
In all the years we have been doing that we have not (as far as I remember) run into an issue where the compiler code ended up accidentally depending on a dev package.
But I agree it is possible, which is why I'm going to all these lengths to make it close to impossible. I think the mitigations in this PR against this happening are very strong.
In summary, I think we have low risk and high level of mitigation.
BTW my hope is that this PR will bring the emscripten CI and emscripten-releases builders closer to your CI, in that they will no longer run with dev dependencies visible to the compiler code.
I suppose we can land this change without removing the opt-out, and then we can continue to discuss the opt-out separately.
That sounds good, if you have the cycles to slice the PR.