sea: snapshot support in single executable applications

Open joyeecheung opened this issue 2 years ago • 16 comments

Refs: https://github.com/nodejs/single-executable/discussions/57

This patch adds snapshot support to single executable applications. To build a snapshot from the main script when preparing the blob that will be injected into the single executable application, add "useSnapshot": true to the configuration passed to --experimental-sea-config. For example:

{
    "main": "snapshot.js",
    "output": "sea-prep.blob",
    "useSnapshot": true
}

The main script used to build the snapshot must invoke v8.startupSnapshot.setDeserializeMainFunction() to configure the entry point. The generated startup snapshot becomes part of the preparation blob and is injected into the final executable.
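For illustration only, a minimal sketch of what such a snapshot.js could look like. The lookup table, the buildTable() helper and the main() function are hypothetical and not part of this patch; the isBuildingSnapshot() guard is only there so the same file can also be run with a plain node binary.

```javascript
// snapshot.js: hypothetical main script for a snapshot-enabled SEA.
const v8 = require('node:v8');

// Top-level work: when building a snapshot, this runs once at build time
// and the resulting Map is captured as part of the heap.
function buildTable() {
  const table = new Map();
  for (let i = 0; i < 1000; i++) table.set(`key-${i}`, i * i);
  return table;
}
const table = buildTable();

// Entry point to run after the snapshot is deserialized.
function main() {
  console.log('entries available at startup:', table.size);
}

if (v8.startupSnapshot.isBuildingSnapshot()) {
  // Register the entry point instead of running it now.
  v8.startupSnapshot.setDeserializeMainFunction(main);
} else {
  // Not building a snapshot (e.g. `node snapshot.js`): just run it.
  main();
}
```

With "useSnapshot": true, the expectation is that buildTable() does not run again when the final executable starts; only main() does.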

When the single executable application is launched, Node.js deserializes the snapshot to get directly to the state initialized at build time, instead of running the main script from scratch.

$ /path/to/node --experimental-sea-config sea-config.json
$ cp /path/to/node sea
$ postject sea NODE_SEA_BLOB sea-prep.blob --sentinel-fuse NODE_SEA_FUSE_fce680ab2cc467b6e072b8b5df1996b2
$ ./sea

joyeecheung avatar Feb 24 '23 18:02 joyeecheung

Review requested:

  • [ ] @nodejs/startup

nodejs-github-bot avatar Feb 24 '23 18:02 nodejs-github-bot

Thanks for doing this!

add some tooling to process the inputs

I personally really like this approach. I've opened a discussion in https://github.com/nodejs/single-executable/discussions/58, so that we can have a decision on this.

RaisinTen avatar Feb 27 '23 13:02 RaisinTen

Rebased using the new SnapshotBuilder::Generate() overload added in https://github.com/nodejs/node/pull/48242 and added a test. Will add more doc and tests next week but the implementation is there.

joyeecheung avatar Jun 10 '23 22:06 joyeecheung

What's the difference between snapshot and code cache (bytenode)?

piranna avatar Jun 11 '23 00:06 piranna

CI: https://ci.nodejs.org/job/node-test-pull-request/52253/

nodejs-github-bot avatar Jun 14 '23 12:06 nodejs-github-bot

What's the difference between snapshot and code cache (bytenode)?

If you have a bunch of top-level code with a bunch of inner functions, building the snapshot usually runs all the top-level code and compiles some of the inner functions. The snapshot then contains the state of the heap after the top-level code has run, plus the bytecode of the inner functions that would be run again later. When deserializing the snapshot, you deserialize the state of an initialized heap (so there is no need to run the top-level code at all), and later, when the application invokes some of the inner functions, V8 may deserialize the bytecode in the snapshot during the compilation of the function to speed things up. If there's just the code cache, what you usually have is the bytecode of the top-level code and some inner functions. At runtime V8 deserializes the bytecode in the code cache to speed up compilation of the top-level code, but it still needs to run the top-level code to set up the heap.
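To make the code cache half of this concrete, here is a rough sketch using the public vm.Script API rather than anything from this patch. The source string, the demo.js filename and the initCount counter are made up for the example. It only shows that supplying a code cache does not skip the top-level code.

```javascript
const vm = require('node:vm');

// Hypothetical script: the top-level code bumps a counter each time it runs.
const source = `
  globalThis.initCount = (globalThis.initCount || 0) + 1;
  function inner() { return 42; }
  inner();
`;

// First compile and run, then serialize the bytecode V8 has produced so far.
const first = new vm.Script(source, { filename: 'demo.js' });
first.runInThisContext();
const cache = first.createCachedData();

// Second compile: V8 can reuse the cached bytecode instead of compiling again,
// but the top-level code still has to execute to set up the state.
const second = new vm.Script(source, { filename: 'demo.js', cachedData: cache });
second.runInThisContext();

console.log('cache rejected:', second.cachedDataRejected);
console.log('top-level runs:', globalThis.initCount);
```

The counter ends at 2 because the top-level code runs once per script, with or without the cache. With a snapshot, the heap state (here, the counter) would be restored without that second run.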

joyeecheung avatar Jun 14 '23 13:06 joyeecheung

CI: https://ci.nodejs.org/job/node-test-pull-request/52254/

nodejs-github-bot avatar Jun 14 '23 13:06 nodejs-github-bot

CI: https://ci.nodejs.org/job/node-test-pull-request/52256/

nodejs-github-bot avatar Jun 14 '23 14:06 nodejs-github-bot

CI is almost green (with some flakes that look irrelevant). Can I have some reviews please? @nodejs/single-executable thanks!

joyeecheung avatar Jun 14 '23 15:06 joyeecheung

Thank you @joyeecheung for your detailed explanation :-) So, if I understood it correctly, a snapshot is a combination of the compiled module and the bytecode of its functions, plus the heap after executing the module's top-level code, while the code cache only has the bytecode of all the module code, is that right?

piranna avatar Jun 14 '23 16:06 piranna

So, if I understood it correctly, a snapshot is a combination of the compiled module and the bytecode of its functions, plus the heap after executing the module's top-level code, while the code cache only has the bytecode of all the module code, is that right?

Roughly yes. V8 does not always emit bytecode for all the code it encounters (e.g. inner-inner functions are usually just parsed for syntax checks, but no bytecode is generated until they actually get executed). In some cases snapshots may contain more bytecode, if these functions are invoked while the snapshot is being built.
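As a hedged sketch of that last point: a snapshot main script could call a function once during the build so that V8 has already compiled it when the heap is serialized. The makeFormatter() and format() names are hypothetical, and whether the compiled bytecode is actually kept in the snapshot is up to V8 and Node.js, so treat this as an illustration, not a guarantee.

```javascript
const v8 = require('node:v8');

function makeFormatter() {
  // Inner function: V8 typically compiles this lazily, on its first call.
  return function format(n) {
    return `#${String(n).padStart(4, '0')}`;
  };
}
const format = makeFormatter();

if (v8.startupSnapshot.isBuildingSnapshot()) {
  // Warm-up call at build time, so the function is compiled before the heap
  // is serialized (assumption: the bytecode may then end up in the snapshot).
  format(0);
  v8.startupSnapshot.setDeserializeMainFunction(() => {
    console.log(format(42));
  });
} else {
  // Plain `node` run without a snapshot.
  console.log(format(42));
}
```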

joyeecheung avatar Jun 15 '23 08:06 joyeecheung

V8 does not always emit bytecode for all the code it encounters

Maybe bytenode is forcing its generation? Could that be of interest to us when generating the snapshots, so that no JavaScript code is included in them?

piranna avatar Jun 15 '23 09:06 piranna

@RaisinTen can you please confirm that, based on https://github.com/nodejs/single-executable/discussions/58 and https://github.com/nodejs/single-executable/discussions/58#discussioncomment-5565539, you are comfortable with this landing?

benjamingr avatar Jun 15 '23 16:06 benjamingr

Could that be of interest to us when generating the snapshots, so that no JavaScript code is included in them?

I would say that's not always a win. You could have a lot of dead code in the application, or code that never gets run in many use cases, and eagerly generating code for it may result in a memory/size overhead that's not always worth it. As a configurable thing that might be useful though.

joyeecheung avatar Jun 15 '23 17:06 joyeecheung

I can try to take a look day after tomorrow. (I'll be traveling pretty much the entire day tomorrow)

RaisinTen avatar Jun 15 '23 17:06 RaisinTen

As a configurable thing that might be useful though.

Agree on that 👍🏻

piranna avatar Jun 15 '23 17:06 piranna

CI: https://ci.nodejs.org/job/node-test-pull-request/52620/

nodejs-github-bot avatar Jul 05 '23 14:07 nodejs-github-bot

CI: https://ci.nodejs.org/job/node-test-pull-request/52864/

nodejs-github-bot avatar Jul 20 '23 08:07 nodejs-github-bot

CI: https://ci.nodejs.org/job/node-test-pull-request/52872/

nodejs-github-bot avatar Jul 20 '23 14:07 nodejs-github-bot

CI: https://ci.nodejs.org/job/node-test-pull-request/52874/

nodejs-github-bot avatar Jul 20 '23 18:07 nodejs-github-bot

Commit Queue failed
- Loading data for nodejs/node/pull/46824
✔  Done loading data for nodejs/node/pull/46824
----------------------------------- PR info ------------------------------------
Title      sea: snapshot support in single executable applications (#46824)
Author     Joyee Cheung  (@joyeecheung)
Branch     joyeecheung:sea-snapshot -> nodejs:main
Labels     c++, lib / src, needs-ci, commit-queue-squash, single-executable
Commits    4
 - src: support snapshot in single executable applications
 - fixup! src: support snapshot in single executable applications
 - fixup! fixup! src: support snapshot in single executable applications
 - fixup! fixup! fixup! src: support snapshot in single executable appli…
Committers 1
 - Joyee Cheung 
PR-URL: https://github.com/nodejs/node/pull/46824
Refs: https://github.com/nodejs/single-executable/discussions/57
Reviewed-By: Benjamin Gruenbaum 
Reviewed-By: Darshan Sen 
------------------------------ Generated metadata ------------------------------
PR-URL: https://github.com/nodejs/node/pull/46824
Refs: https://github.com/nodejs/single-executable/discussions/57
Reviewed-By: Benjamin Gruenbaum 
Reviewed-By: Darshan Sen 
--------------------------------------------------------------------------------
   ℹ  This PR was created on Fri, 24 Feb 2023 18:28:13 GMT
   ✔  Approvals: 2
   ✔  - Benjamin Gruenbaum (@benjamingr) (TSC): https://github.com/nodejs/node/pull/46824#pullrequestreview-1482036842
   ✔  - Darshan Sen (@RaisinTen) (TSC): https://github.com/nodejs/node/pull/46824#pullrequestreview-1538950503
   ✘  Last GitHub CI failed
   ℹ  Last Full PR CI on 2023-07-20T18:44:40Z: https://ci.nodejs.org/job/node-test-pull-request/52874/
- Querying data for job/node-test-pull-request/52874/
   ✔  Last Jenkins CI successful
--------------------------------------------------------------------------------
   ✔  Aborted `git node land` session in /home/runner/work/node/node/.ncu
https://github.com/nodejs/node/actions/runs/5615590467

nodejs-github-bot avatar Jul 20 '23 20:07 nodejs-github-bot

Landed in ac34e7561ab4771ed1a953efc92dc851ed468e3d

nodejs-github-bot avatar Jul 20 '23 22:07 nodejs-github-bot