
Startup penalty vs QuickJS/node.js

Open richarddd opened this issue 2 years ago • 7 comments

I'm curious about the startup performance vs QuickJS and even Node.js.

For a super simple test that just logs to the console, I'm getting significantly longer startup times than QuickJS and Node.js.

There are curl and libuv in the mix, but any idea what might be causing the additional latency?

It's 2x longer than Node.js and 23x longer than QuickJS.

Btw, thanks for a great library :)

Txiki

time ./tjs -e "console.log(1)"
1
./tjs -e "console.log(1)"  0.10s user 0.01s system 96% cpu 0.115 total

QuickJS

time ./qjs -e "console.log(1)"
1
./qjs -e "console.log(1)"  0.00s user 0.00s system 63% cpu 0.005 total

Node.js

time node -e "console.log(1)"
1
node -e "console.log(1)"  0.04s user 0.01s system 94% cpu 0.054 total

richarddd avatar Jul 25 '22 07:07 richarddd

Hey there! I suspect this would have to do with the fact that the entire bundled JS library is loaded / evaluated at startup.

TBH I have never focused on startup times... yet.

saghul avatar Jul 25 '22 07:07 saghul

> Hey there! I suspect this would have to do with the fact that the entire bundled JS library is loaded / evaluated at startup.
>
> TBH I have never focused on startup times... yet.

Hey. Is this something that you would consider optimizing for? Maybe lazy loading of the js library?

richarddd avatar Aug 02 '22 13:08 richarddd

Honestly I'm not sure. I'm open to options as long as the complexity doesn't increase dramatically.

saghul avatar Aug 02 '22 15:08 saghul

I've been working on reducing startup load and have ended up pre-compiling the bundle.js and std.js into QuickJS bytecode and embedding that into bootstrap.c (https://github.com/samuelbrian/txiki.js/commit/3c1539388352b8bc3eb95a3073b4e2f100533f03)

That reduced run time of a console.log("Hello world") script from ~0.51s to ~0.07s on a relatively slow embedded system.

I've also dropped 600kb of bundle.js by changing to a UTF-8-only TextEncoder (https://github.com/samuelbrian/txiki.js/commit/dfc13ce2cdbb13f77eff7bc9da2ba610f394160d) and that gets it down to ~0.03s.
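For illustration, a UTF-8-only encoder can be tiny, since encoding reduces to mapping each code point to 1-4 bytes. This is a minimal sketch, not the actual txiki.js code (lone-surrogate replacement, which the Encoding Standard requires, is omitted here):

```javascript
// Minimal UTF-8-only encoder sketch (illustrative, not the txiki.js code).
// Encodes each Unicode code point into 1-4 bytes per the UTF-8 scheme.
// Note: does not replace lone surrogates with U+FFFD as a spec-compliant
// TextEncoder would.
function utf8Encode(str) {
  const out = [];
  for (const ch of str) {            // for...of iterates by code point
    const cp = ch.codePointAt(0);
    if (cp < 0x80) {                 // 1 byte: 0xxxxxxx
      out.push(cp);
    } else if (cp < 0x800) {         // 2 bytes: 110xxxxx 10xxxxxx
      out.push(0xc0 | (cp >> 6), 0x80 | (cp & 0x3f));
    } else if (cp < 0x10000) {       // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
      out.push(0xe0 | (cp >> 12), 0x80 | ((cp >> 6) & 0x3f), 0x80 | (cp & 0x3f));
    } else {                         // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
      out.push(0xf0 | (cp >> 18), 0x80 | ((cp >> 12) & 0x3f),
               0x80 | ((cp >> 6) & 0x3f), 0x80 | (cp & 0x3f));
    }
  }
  return new Uint8Array(out);
}
```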

@saghul I'd be happy to refine this if you've got ideas on what you'd like to see for it to be accepted into txiki.js. It's a bit awkward: I ended up with the bytecode files in revision control next to the esbuild-ed .js files, because I couldn't work out a good way to build the compiler tool for the host while cross-compiling. I'm not very good with CMake.

samuelbrian avatar Sep 14 '22 04:09 samuelbrian

Excellent investigation!

Indeed, I'm not sure about the bundling.

If you look at the commit history you'll see I used to do that with a reduced version of qjsc, but it was unwieldy to use when cross-compiling.

Maybe we could break the build process in two: a "prepare" step, which generates the bundle and byte code on the host system, then the build itself.

Adding a -c (for compile) flag to the CLI as a helper utility might allow us to use tjs itself to create the byte code, perhaps.

Regarding TextEncoder, I like the change, it aligns well with the "txiki" nature of the project. Can you make that a PR already?

Cheers!

saghul avatar Sep 14 '22 07:09 saghul

I love these changes. Regarding the bundling: can't this be part of the build process? So when compiling txiki, you'd first compile it, then create the bundled bytecode, and finally include the bundled bytecode in the bootstrap?

richarddd avatar Sep 15 '22 11:09 richarddd

Yes and no. It makes cross-compiling potentially more complex but I think we can simplify it by doing the bundling and byte code generation in the host.

saghul avatar Sep 15 '22 15:09 saghul

I did think about adding -c for compile but I figured it wouldn't be generally useful if tjs cannot run the compiled files, or cannot import the compiled files as modules.

I've been meaning to revisit an automatic two-step build; right now, in my own tree, I've committed the binaries. It might be a while before I get to it.

samuelbrian avatar Oct 13 '22 04:10 samuelbrian

I fixed this in master! We're now embedding byte code as a C file which is checked in currently. This should also make it easier to cross-compile.

The only missing part I think is support for machines with a different endianness. I think we'd need to generate the byte code swapped.

saghul avatar Dec 07 '22 10:12 saghul

It's now way better 🎉

time qjs -e "console.log(1)"
1
qjs -e "console.log(1)"  0.00s user 0.00s system 37% cpu 0.011 total
time ./tjs eval "console.log(1)"
1
./tjs eval "console.log(1)"  0.02s user 0.01s system 86% cpu 0.033 total

richarddd avatar Dec 08 '22 08:12 richarddd

I reckon there's much more speed improvement to be gained. @saghul, what do you think about this suggestion:

  • Remove the current bundling of all the core and std libraries.
  • Compile every std and core module into a separate bytecode .c file.
  • Replace the exports from index.js in std and core with ones that lazy-load these bytecode files on demand.
  • This could be done by defining a getter on the export which calls an evaluate-bytecode function in C that returns the actual export.

Before:

// ...
import createHash from './hashlib.js';

export {
    createHash,
    // ...
};

After:

const exportObject = {};

Object.defineProperty(exportObject, 'createHash', {
    get() { return loadBytecode("tjs__std_createHash"); }
});

export default exportObject;

This means we still have a single file but all the bytecode arrays are just loaded into memory and not evaluated until it's actually needed.
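A sketch of that getter pattern with caching added, so each module's bytecode is evaluated at most once; loadBytecode here is a stub standing in for the C-side evaluation function:

```javascript
// Sketch of lazy, cached exports. `loadBytecode` is a stand-in here;
// in txiki.js it would evaluate embedded QuickJS bytecode from C.
let evaluations = 0;
function loadBytecode(name) {
  evaluations++;                 // count how often we actually evaluate
  return { name };               // pretend this is the module's export
}

const exportObject = {};

function lazyExport(obj, prop, bytecodeName) {
  Object.defineProperty(obj, prop, {
    configurable: true,
    get() {
      const value = loadBytecode(bytecodeName);
      // Replace the getter with a plain value so evaluation happens once.
      Object.defineProperty(obj, prop, { value, configurable: true });
      return value;
    },
  });
}

lazyExport(exportObject, 'createHash', 'tjs__std_createHash');
```

Until the property is first read, only the getter exists; after the first read, it is swapped for the evaluated value.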

richarddd avatar Dec 23 '22 22:12 richarddd

Yeah that would be a way to get extra speed.

I don't want to complicate things more than they need to be, since one of the goals of this project is to keep things small and easy.

Thus, if we are "fast enough" I'd rather keep it that way, with your idea in the back pocket, until we need it...

saghul avatar Dec 24 '22 08:12 saghul

The thing is that I'm running txiki in a Serverless environment at scale where every extra ms incurs cost. That's why startup time is very important.

Yes, it will make things a bit more complex, but not dramatically so; it's mainly about how the std lib and core lib are exported, bundled, and imported. Helper functions will make it easier to do lazy-loaded exports.

I did a very basic test removing all "non-essential" code from run-main and got 14ms vs 33ms on my machine.

Before I invest the time. Would you be open to a PR?

richarddd avatar Dec 26 '22 09:12 richarddd

I'll take a 50% improvement, yes :-)

I'm happy to explore that path indeed.

If you could make a PR, we can look at it together and see how it feels. If you look at the project history you'll see I've gone back and forth on the bundling strategy, and I'm not shy of reevaluating.

saghul avatar Dec 26 '22 10:12 saghul

Cool! I also noticed there is a lot of "unused" code included in the bundle. For instance, Path contains a lot of win32 code that is never used on macOS or Linux. I won't touch that in this PR, but we could easily exclude stuff like that using esbuild's define and tree shaking. Ideally, a lot of similar code could move into C, such as path operations, hash functions, etc. Generally speaking, things that frequently execute in huge loops (such as hashing functions and text encoding) will suffer if executed in JS land and should move to C :)

richarddd avatar Dec 31 '22 10:12 richarddd

Great ideas!

I'm planning to eventually bundle mbedtls, we can likely use the hashing stuff provided there for example.

saghul avatar Dec 31 '22 10:12 saghul

One thing I didn't think about is that ESM does not support exporting getters. This means the change would be breaking, as there is no way to do import { createHash } from '@tjs/std' lazily.

There are two alternatives:

Simplest/cleanest approach: build modules individually and use default exports.

import createHash from "@tjs/std/hash"

Default exports with getters: create a default-exported object with getters.

import std from "@tjs"
const { createHash } = std

What's your take on this?

richarddd avatar Jan 01 '23 11:01 richarddd

Hum. I really didn't want to go back to individual modules, but happy to experiment again.

I'd say we should go for @tjs/thing then.

Would this style work for all the current stdlib modules we provide?

saghul avatar Jan 03 '23 21:01 saghul

Yes, unfortunately it's forbidden in ESM to export getters. And yes, this style works for all the current stdlib modules; I've had great success experimenting with lazy loading for all of them. I first bundle every file in stdlib, polyfills, and core, and compile the byte code using qjsc. I then generate a precompiled.c with a lookup table keyed by module name. In tjs_module_loader, I look for the module in the lookup table and load its bytecode if found; otherwise I proceed as normal. For lazy loading, I have implemented a require function that loads the given module and returns its value in a getter. This also allowed me to remove the globalThis.queueMicrotask(() => evalFile(filename)); issue from run-main.js.
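In JS terms, the loader fallback could look roughly like this sketch; PRECOMPILED, evaluateBytecodeModule, and loadModuleFromDisk are hypothetical stand-ins for the generated lookup table and the two loading paths in the C module loader:

```javascript
// Sketch of a module loader with a precompiled-bytecode fast path.
// All names below are illustrative stand-ins, not txiki.js internals.
const PRECOMPILED = new Set(['tjs:path', 'tjs:hashlib']);
const moduleCache = new Map();

function evaluateBytecodeModule(name) {
  return { source: 'bytecode', name };   // stand-in for QuickJS bytecode evaluation
}

function loadModuleFromDisk(name) {
  return { source: 'disk', name };       // stand-in for the regular module loader
}

function lazyRequire(name) {
  let mod = moduleCache.get(name);
  if (mod === undefined) {
    mod = PRECOMPILED.has(name)
      ? evaluateBytecodeModule(name)     // found in the lookup table
      : loadModuleFromDisk(name);        // fall back to normal loading
    moduleCache.set(name, mod);          // evaluate each module at most once
  }
  return mod;
}
```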

The only catch I've found so far is that I need to expose an internal QuickJS function to require the files: the JSValue js_get_module_ns(JSContext *ctx, JSModuleDef *m); function, which I have added to quickjs.h.

So with all generated c bytecode loaded into memory, I get around 19-20ms vs 36-37ms for the version in master.

richarddd avatar Jan 03 '23 23:01 richarddd

That's exciting!

I think while we're here we probably want to move to the "reserved-ns:library" naming convention that Node and Bun use. For example: "tjs:path" etc.

I'm still AFK for a few days, but if you open an early PR we can start discussing details. Overall, I think I'm liking this!

saghul avatar Jan 04 '23 08:01 saghul

Will open a draft PR in the coming days.

THIS is also very exciting: benchmark-txiki

richarddd avatar Jan 04 '23 19:01 richarddd

Awesome!

saghul avatar Jan 04 '23 20:01 saghul

https://github.com/saghul/txiki.js/pull/350, it's a HUGE one, don't hate me 😶

richarddd avatar Jan 08 '23 12:01 richarddd