Add a test for measuring client instantiation time
https://github.com/prisma/prisma/issues/8178
What's our approach to benchmarking on CI (and do we have one)? My concern about having this as a regular test is that it's going to be either extremely flaky or not extremely useful. Even in the referenced issue the reproduction took 4.5 seconds on the reporter's machine and 0.4 (actually like 0.36 when I increased the number of digits in .toFixed(1)) seconds on my machine.
cachegrind + valgrind are the way to benchmark things consistently across devices, no matter the CPU or the RAM on the machine. For instance, if you run cachegrind on prisma 2.26 (on the repro example in the issue linked above):
$ valgrind --tool=cachegrind node src/index.js
==274519== I refs: 1,954,879,183
If you re-run that, you get similar results:
$ valgrind --tool=cachegrind node src/index.js
==276009== I refs: 1,950,136,208
And if you run this on the latest prisma 3:
$ valgrind --tool=cachegrind node src/index.js
==277339== I refs: 994,343,618
And again on prisma 3, but with my CPU at 0.80GHz:
$ valgrind --tool=cachegrind node src/index.js
==277823== I refs: 990,733,662
Looking at the total number of CPU instructions gives us a good way to measure real performance consistently, on CI or on any other device. One case where I see this failing is on a machine with very limited RAM (lots of garbage collection). As you can see, there's a bit of jitter because of the Node.js runtime. Given that, we could check that the instruction count never deviates from a known baseline by more than 1-2%.
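A minimal sketch of such a check, assuming the cachegrind summary (which valgrind writes to stderr) has been captured as text. The helper names and the 2% default tolerance are illustrative choices, not something already in the repo; the sample values are the two prisma 2.26 runs above.

```python
import re

# Matches the summary line cachegrind prints, e.g.
# "==274519== I refs: 1,954,879,183" (whitespace between tokens may vary).
I_REFS_RE = re.compile(r"I\s+refs:\s+([\d,]+)")


def parse_i_refs(cachegrind_output: str) -> int:
    """Extract the total instruction count from cachegrind's summary output."""
    match = I_REFS_RE.search(cachegrind_output)
    if match is None:
        raise ValueError("no 'I refs' line found in cachegrind output")
    return int(match.group(1).replace(",", ""))


def relative_change(baseline: int, measured: int) -> float:
    """Signed change of `measured` relative to `baseline` (0.01 == +1%)."""
    return (measured - baseline) / baseline


def within_tolerance(baseline: int, measured: int, tolerance: float = 0.02) -> bool:
    """True if `measured` deviates from `baseline` by at most `tolerance`.

    The 2% default is an assumption taken from the 1-2% figure discussed above.
    """
    return abs(relative_change(baseline, measured)) <= tolerance


if __name__ == "__main__":
    first = parse_i_refs("==274519== I refs: 1,954,879,183")
    second = parse_i_refs("==276009== I refs: 1,950,136,208")
    print(f"{relative_change(first, second):+.2%}")  # -0.24%
    print(within_tolerance(first, second))           # True
```

With these numbers the run-to-run jitter is about 0.24%, well inside a 1-2% budget, while the drop from prisma 2.26 to prisma 3 (roughly 49%) would be flagged immediately.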
https://valgrind.org/docs/manual/cg-manual.html#cg-manual.overview
Folks at SQLite have been using this for years too https://www.sqlite.org/cpu.html
This is amazing, and definitely the way to go. We might even want to profile other metrics in a similar fashion, like the number of IO operations or even syscalls in general (which would have been much more important than the number of instructions in the context of https://github.com/prisma/prisma/issues/8178).
> benchmark things consistently across devices, no matter the CPU
FWIW, I don't think it will necessarily work consistently across devices. Even if we factor out things like different Node.js versions, different compilers Node.js is built with, different libc versions, etc., we can't factor out different CPU architectures: the numbers will be different and won't be comparable.
That said, those differences may be interesting in themselves. For example, I get a 15% lower number for Prisma 2.26:
==698== I refs: 1,662,717,422
and a 5% lower number for Prisma 3.3:
==1090== I refs: 939,571,062
and... it's really strange? I mean, I would expect the total instruction count to be much higher for a RISC architecture with lots of loads and stores as separate instructions. One possible explanation is that, apparently, JavaScript code is much better optimized by V8 on ARM64 than on x86_64, or so it seems.
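For reference, the percentages quoted above can be reproduced from the raw `I refs` values in this thread. This is only a worked calculation; it assumes the first set of numbers came from an x86_64 machine and the second from ARM64 (inferred from the discussion, not stated explicitly), and the function name is made up for illustration.

```python
def percent_lower(reference: int, other: int) -> float:
    """How much lower `other` is than `reference`, in percent."""
    return (reference - other) / reference * 100


# I refs values copied from the cachegrind runs quoted in this thread.
# Which machine is x86_64 and which is ARM64 is an assumption.
x86_64_prisma_2_26 = 1_954_879_183
arm64_prisma_2_26 = 1_662_717_422
x86_64_prisma_3 = 994_343_618
arm64_prisma_3_3 = 939_571_062

print(f"{percent_lower(x86_64_prisma_2_26, arm64_prisma_2_26):.1f}%")  # 14.9%
print(f"{percent_lower(x86_64_prisma_3, arm64_prisma_3_3):.1f}%")      # 5.5%
```

Both gaps are far larger than the sub-1% run-to-run jitter on a single machine, which is why a baseline recorded on one architecture can't be reused on another.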
Hmmm, that's interesting. I definitely did not account for the CPU architecture, good catch! Luckily, that's something that is consistent on the default GH Actions runners.