Follow-up on "Support multiple workers for NODEFS /wordpress mounts"
We are in the midst of merging a number of large PRs:
- https://github.com/WordPress/wordpress-playground/pull/2231
- https://github.com/WordPress/wordpress-playground/pull/2248
- https://github.com/WordPress/wordpress-playground/pull/2281
In order to keep things moving, we merged #2231 while it still had some rough edges. This is an issue to follow up on those:
Higher priority:
- [x] Make fcntl and flock overrides conditional on existence of
PHPLoader.fileLockManager - [x] Complete multi-worker support with Asyncify
- [x] Implement the two missing file-lock-manager test cases
- [x] Make sure error logs are collected from the correct worker and/or consolidated when needed.
- [x] Find way to stop skipping run-cli tests
- [ ] Fix Asyncify sqlite tests so we can stop skipping them
- [ ] Run performance test of multi-worker setup
- [x] Confirm safety of node lookup in
locking.is_path_to_shared_fs(). Do not assume successful lookup. - [x] Look at the getpid() override and the proc_open() process ID management and make sure there are no conflicts between them.
- [x] Make sure
get_vfs_path_from_fd()doesn't abuse or mistakenly use a native/procdir if exposed byuseHostFilesystem(). - [x] Return an error from
get_native_path_from_vfs_path()if no native path is found and make sure callers handle that. - [ ] Reconcile js_getpid() with facilities we ship in the base emscripten js library. If the two don't have much overlap, let's at least acknowledge the other one in a comment. There may be some overlap, though, as proc_open can spawn a php process.
Easy:
- [x] Remove unnecessary nativeNodeModulesPlugin from php-wasm/node/build.js
- [x] Should --experimentalMultiWorker be a hidden CLI option?
- [x] Consider using
bootRequestHandler()instead ofbootWordPress()after booting the first worker. - [ ] For node-es-module-loader
?urlimport handling, grep by /@fs and see if there's any impact on matching of vite paths with .href vs .pathname. If not, we're good.
Nice to have:
- [ ] Support sharing flock-based locks with forked processes.
- [ ] Find way to use normal test setup for test-built-npm-packages
- [ ] Consider alternate load balancing approach, perhaps "a moving average of total idle time per minute as a decent proxy metric"
- [ ] Remove duplicates from ASYNCIFY_IMPORTS lists
- [ ] Drop topOfStack testing from php-asyncify-sqlite3.spec.ts
- [ ] Use native fcntl for fcntl-style locks.
- [ ] Look at unifying multi-worker management with PHP process management. We don't want to have to be aware of where or how PHP is running. Consider whether php-fpm or something like it is appropriate.
Relevant but out of scope
- [ ] Consider how to stop GH actions killing tests without using so many runners to split unit tests
Complete multi-worker support with Asyncify
This is the most important item here
This is the most important item here
@adamziel 👍 I've been working on this today.
There were some functions missing from the ASYNCIFY_IMPORTS list, but there is a "null or function signature mismatch" error related to fd_close.
This evening, I fought with the DWARF path mapping for the /emsdk/emscripten path to enable debugging the libc portions of the stack. With that work, other emscripten DWARF paths were mapped correctly, but the /emsdk path is not affected. It might be because the info is from a prebuilt library like libz. 🤔
Either way, this is not required for this troubleshooting. It just looked like a quick win, and it would be good to be able to step into all source code for a debug build.
In the morning, I'll post a draft PR with the work so far and focus on addressing the signature mismatch error.
Make sure error logs are collected from the correct worker and/or consolidated when needed.
Blueprints v2 PR solves the bulk of that problem. Perhaps we should extract that part of it into a separate PR.
Planning to review this list in the morning and narrow it to the really important things.
Confirm safety of node lookup in locking.is_path_to_shared_fs(). Do not assume successful lookup.
Now we are using noent_okay: true when doing the lookup, so I think this TODO can be considered done.
const { node: backingNode } = backingFs.lookupPath(
nodePath,
{ noent_okay: true }
);
Look at the getpid() override and the proc_open() process ID management and make sure there are no conflicts between them.
It's possible for a PHPWASM instance and a separate child process to have the same PID. This is not ideal, but I think it will be OK in practice because they are used for different things:
- The PID for child processes is used to track child process status, and AFAICT, it is an external tracking PID because the child process is not aware of this PID.
- The PID for a PHPWASM instance is known by the PHPWASM instance and is used to track locked files.
I think we will be OK here.
For the future, my hope is that we will be able to tie everything together with a single kernel instance that explicitly manages PID, locks, and even open file descriptors in a unified way.
Make sure get_vfs_path_from_fd() doesn't abuse or mistakenly use a native /proc dir if exposed by useHostFilesystem().
useHostFilesystem() now explicitly avoids using a host's /proc dir.
Make sure error logs are collected from the correct worker and/or consolidated when needed.
Now that we are using a shared NODEFS-based filesystem for Playground CLI, PHP should be writing to the same real error log file.