dagger TypeScript SDK performance improvement

This issue collects all my tests and digs into how to improve the SDK's performance.

Init

Env: compiled a binary from main: (bdce95d0abfa9014a35f5e3962a64afce03d5fca)
Use dagger cli v0.11.0
Use latest dagger cloud

time dagger init --name=perf --sdk=typescript                                              
Initialized module perf in .
dagger init --name=perf --sdk=typescript  0.98s user 0.73s system 11% cpu 14.829 total
➜  perfs git:(main) ✗ time dagger functions                        
Name             Description
container-echo   Returns a container that echoes whatever string argument is provided
grep-dir         Returns lines that match a pattern in the files of the provided Directory
dagger functions  1.32s user 0.78s system 10% cpu 20.765 total
➜  perfs git:(main) ✗ time dagger call container-echo --string-arg "dig into perf" stdout
dig into perf
dagger call container-echo --string-arg "dig into perf" stdout  0.92s user 0.69s system 18% cpu 8.599 total

Operation	TypeScript time	Cloud URL (by sha)	Go time (as ref)
Init	14.8s	1951a90ea8679fa505b67f640c4a7ce7	2.6s
Functions	20.8s	7cf74155070bf75952fa5164e3e9667c	3.12s
Call container-echo	8.6s	da94c06b08ba15704f815667a440e44d	2.89s
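
To put these numbers in perspective, here is a small sketch that computes how many times slower TypeScript is than Go for each operation. The TypeScript timings are rounded from the `time` output above and the Go timings come from the table; the `Timing` type and `slowdown` helper are names made up for this illustration.

```typescript
// Timings in seconds. TypeScript values are rounded from the `time` output
// above; Go values are the reference column of the table.
interface Timing {
  operation: string;
  typescript: number;
  go: number;
}

const timings: Timing[] = [
  { operation: "Init", typescript: 14.8, go: 2.6 },
  { operation: "Functions", typescript: 20.8, go: 3.12 },
  { operation: "Call container-echo", typescript: 8.6, go: 2.89 },
];

// Ratio of TypeScript wall-clock time to Go wall-clock time.
function slowdown(t: Timing): number {
  return t.typescript / t.go;
}

for (const t of timings) {
  console.log(`${t.operation}: ${slowdown(t).toFixed(1)}x slower than Go`);
}
```

These are single runs on one machine, so the ratios are only an order of magnitude, not a benchmark.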

The Go SDK is clearly much faster than the TypeScript SDK, roughly 3x to 7x in these runs. Now that we have the context, let's understand why.

Comparing traces

Let's compare the TypeScript init trace with the Go init trace.

There is a huge difference in the initialization. If we expand the traces, we find something interesting.

With the Go SDK, the withDirectory operation takes less than a second.

With the TypeScript SDK, it takes 5.2s.

This is where we need to optimize the setup.

Note

We also have an extra step where we download the node image, but its speed depends on the network, so we cannot really optimize it.

Apr 15 '24 10:04 TomChv

By @jedevc in https://github.com/dagger/dagger/pull/7081#issuecomment-2056371841

Just dumping here for potential avenues for exploration (duplicating from discord):

npm install --package-lock-only step seems to take a lot of time - is there a faster package manager we could use here? I think this is likely only generating the package-lock.json, so I'm not sure what makes this so expensive. Or is this approach even right? It feels like an issue that we wouldn't respect the existing package-lock.json, cc @helderco, I know you looked at the python equivalent in https://github.com/dagger/dagger/pull/7064.
We should actually split out these commands to be separate if possible! There's a lot of stuff being done in groups of commands, so traces only show for that chunk, making it hard to dive deeper.
Caching doesn't seem to always cache between running init and functions. Not quite sure why - this should be pretty instantaneous.
I suspect something in TSX is introducing some latency - is there a way to cache TSX at all? Or maybe we could consider not using TSX at all, and instead compile the typescript into javascript? That feels like it would cache much better potentially, and could move some of the costs upfront.

Apr 15 '24 10:04 TomChv

In the setup, we have this set of operations that together take more than 3 seconds. What are these?

We also have 2 seconds dedicated to installing tsx.

This is something we could improve by changing the package manager (to pnpm maybe?)

Apr 15 '24 10:04 TomChv

This is something we could improve by changing the package manager (to pnpm maybe?)

Yeah, I similarly changed Python's default installer by using a faster one:

https://github.com/dagger/dagger/pull/6884

Apr 15 '24 12:04 helderco

I did some tests trying to switch to pnpm, but I keep hitting issues with graphql: https://github.com/pnpm/pnpm/issues/1715

I'm trying another strategy first: seeing if I can get rid of these shell scripts and do the same thing with dagger operations. Maybe that would affect the cache?

Apr 15 '24 14:04 TomChv

I can see that with cached operations, it's going pretty fast so we can do 2 things:

Simplify the caching (the longest operation is npm install ./sdk, which is something that can easily be cached)
Reduce the installation time (pre-install or post-install the sdk, for example)
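
As an illustration of why the install step should be easy to cache, here is a minimal sketch of a cache key derived only from the dependency manifests. This is not how Dagger computes its cache keys; `installCacheKey` and the in-memory `Files` map are hypothetical and only show the idea that editing source files should not invalidate a cached install.

```typescript
import { createHash } from "node:crypto";

// Hypothetical in-memory view of a module: file path -> file contents.
type Files = Record<string, string>;

// Hash only the files that determine what `npm install` produces.
// Source files are deliberately ignored.
function installCacheKey(files: Files): string {
  const hash = createHash("sha256");
  for (const name of ["package.json", "package-lock.json"]) {
    hash.update(name);
    hash.update("\0");
    hash.update(files[name] ?? "");
    hash.update("\0");
  }
  return hash.digest("hex");
}

const initial: Files = {
  "package.json": '{"name":"perf"}',
  "package-lock.json": '{"lockfileVersion":3}',
  "src/index.ts": "export const a = 1",
};
const afterSourceEdit: Files = { ...initial, "src/index.ts": "export const a = 2" };
const afterDepChange: Files = {
  ...initial,
  "package-lock.json": '{"lockfileVersion":3,"packages":{}}',
};

// Same key: the cached install could be reused after a source-only edit.
console.log(installCacheKey(initial) === installCacheKey(afterSourceEdit));
// Different key: a lockfile change requires a fresh install.
console.log(installCacheKey(initial) === installCacheKey(afterDepChange));
```

In a Dagger pipeline, the analogous move would be to run the install from a directory that contains only the manifests, and add the sources afterwards.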

Apr 15 '24 15:04 TomChv

I ran some tests to reduce the dependency installation time. It seems I can reduce it slightly by installing only production dependencies.

I really want to use pnpm, so I'll keep digging into the dependency resolution issues, starting with yarn, which may also improve the time.

Note: the timings are much slower because I changed location and my internet connection is slower.

Apr 15 '24 18:04 TomChv

It seems I can really decrease the time using yarn (almost by a factor of 2 overall), and I'm not hitting any issues.

Based on this benchmark, yarn is faster on a cold start, so it might be a good first solution.

Apr 15 '24 18:04 TomChv

For the latest update about SDK performance improvement, see: https://github.com/dagger/dagger/pull/7096#issuecomment-2123537679

May 22 '24 15:05 TomChv

We discussed a couple of improvements that could be made to the TS SDK to improve runtime performance:

Store the result of the scan in a file during registration so we don't need to scan again at execution; we would just load the scanned file. This would reduce execution time because we would no longer need to recompile and re-analyze the source code.
Use a Rust or Go library to get the project AST. It's a path I need to explore, but it might also improve performance.
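
The first idea can be sketched as a load-or-scan helper. This is only a sketch under assumptions: `ScanResult`, `scanModule`, and `loadOrScan` are hypothetical names, the real scan result is much richer than a file list, and cache invalidation when sources change is not handled here.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical, simplified scan result.
interface ScanResult {
  files: string[];
}

let scanCount = 0;

// Stand-in for the expensive analysis of the module's source code.
function scanModule(sourceDir: string): ScanResult {
  scanCount++;
  const files = fs
    .readdirSync(sourceDir)
    .filter((f) => f.endsWith(".ts"))
    .sort();
  return { files };
}

// Reuse the stored scan if present; otherwise scan once and store it.
function loadOrScan(sourceDir: string, cacheFile: string): ScanResult {
  if (fs.existsSync(cacheFile)) {
    return JSON.parse(fs.readFileSync(cacheFile, "utf8")) as ScanResult;
  }
  const result = scanModule(sourceDir);
  fs.writeFileSync(cacheFile, JSON.stringify(result));
  return result;
}

// Usage: the first call plays the role of registration, the second of execution.
const moduleDir = fs.mkdtempSync(path.join(os.tmpdir(), "scan-"));
fs.writeFileSync(path.join(moduleDir, "index.ts"), "export const x = 1\n");
const scanCacheFile = path.join(moduleDir, "scan.json");

const atRegistration = loadOrScan(moduleDir, scanCacheFile);
const atExecution = loadOrScan(moduleDir, scanCacheFile);
console.log(atRegistration, atExecution, `scans performed: ${scanCount}`);
```

The second call reads the JSON file instead of scanning again, which is the saving described above.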

May 29 '24 16:05 TomChv

Quick update on this one: https://github.com/dagger/dagger/pull/7864 could unlock an optimization of the setup by removing the sdk install step. It seems that by using only the lockfile, we can download every dependency. That would remove half of the download work in the setup, which could lead to a significant improvement.

Support for multiple package managers will also help with measuring performance. I'll probably come back to this one soon.

Jul 16 '24 19:07 TomChv