Initial exploration of a benchmark suite
An initial attempt at a useful benchmark suite. The target audience for this at the moment is other fireproof developers.
Goals
- focus on measuring the performance of the implementation, not the browser or network
- eventually provide enough coverage to identify performance regressions
- provide a framework for other developers to add tests that quantify performance improvements
Non-goals (for now)
- real-world perf measurements (CDN, cloud, IPFS, etc.)
Early design choices
- Use benchmark.js for execution/measurement
- Wanted something aiming for statistically significant measurements without reinventing the wheel
- Pro: kind of just works, but with lots of papercuts
- Con: the project was recently archived on GitHub
- Con: some async limitations in the setup phase; there are open PRs on the project to improve this, which I haven't tested (a deferred-style async test is sketched after this list)
- Attempting to support browser and npm CLI execution
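To make the benchmark.js trade-offs concrete, here is a minimal sketch of a deferred async test. This is illustrative, not the exact code in this PR: the database name and doc shape are made up, and it assumes the `fireproof` factory from `@fireproof/core`. The point is that the test body can be async via `defer`, while the built-in setup/teardown hooks cannot await anything, which is the kind of limitation noted above.

```ts
// Minimal benchmark.js sketch (illustrative, not the exact code in this PR).
import Benchmark from 'benchmark'
import { fireproof } from '@fireproof/core'

// Built once, outside the timed function; the db name is made up for this example.
const db = fireproof('bench-put')

const suite = new Benchmark.Suite('fireproof-micro')

suite
  .add('put one small doc', {
    // defer: true hands the test fn a Deferred; benchmark.js times the span
    // until deferred.resolve() is called, which is how async work gets measured.
    defer: true,
    fn: (deferred: { resolve: () => void }) => {
      db.put({ hello: 'world' }).then(() => deferred.resolve())
    },
  })
  .on('cycle', (event: Benchmark.Event) => {
    // e.g. "put one small doc x 1,234 ops/sec ±1.23% (42 runs sampled)"
    console.log(String(event.target))
  })
  .run({ async: true })
```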
Testing approach, and why some things look the way they do
- These are what I consider "micro" benchmarks
- The framework attempts to run your test function in a loop, enough times to make an accurate measurement
- This pattern means it's helpful to have a start state that is easy to reuse or copy: the baseline from which your test function does something interesting (the part we time). With this in mind, I created the BenchConnector.
BenchConnector
- A connector implementation I introduced for usage in the benchmarks
- The idea was that I wanted to use FP sync to create these copies of databases, but I didn't want to actually set up separate infrastructure to run these tests. So, this is a pure in-memory implementation, which in many ways is a better fit for benchmarking anyway.
- There are two basic modes of creating the bench connector:
  - with no arguments, you get a brand new connector with fresh state: no data or metadata yet (until you connect to it)
  - with one argument (another connector instance), the new connector shares the SAME data and metadata storage, which is useful when you want to create a copy of an existing database
- In this PR there are two tests using this functionality (one with 5 docs and one with 50). On page load, we create the template 5- and 50-doc databases and sync them to bench connectors. Then, during test execution, we create brand new databases and sync from the pre-seeded templates we created earlier (see the sketch below).
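Here is a hypothetical sketch of that template/copy pattern. Only the two construction modes (fresh vs. shared storage) come from the description above; the `connect` helper, module path, database names, and doc shapes are stand-ins, not the PR's actual wiring.

```ts
// Hypothetical sketch of the template/copy pattern; connect() and the names
// below are illustrative stand-ins, not the PR's actual wiring.
import { fireproof } from '@fireproof/core'
import { BenchConnector, connect } from './bench-connector' // assumed module/exports

// Page load: build a 5-doc template database backed by a fresh in-memory connector.
const templateConnector = new BenchConnector() // no args: fresh data + metadata
const template5 = fireproof('bench-template-5')
await connect(template5, templateConnector)
for (let i = 0; i < 5; i++) {
  await template5.put({ _id: `doc-${i}`, value: i })
}

// Test time: a brand new database plus a connector that SHARES the template's
// storage, so syncing yields a cheap copy of the pre-seeded start state.
async function freshCopyOfTemplate() {
  const copyConnector = new BenchConnector(templateConnector) // one arg: shared storage
  const db = fireproof(`bench-copy-${Date.now()}`)
  await connect(db, copyConnector) // sync pulls the 5 template docs into db
  return db
}
```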
Known Issues
- I had to rename the top-level variable for the encrypted-blockstore IIFE build (it was also using Fireproof, which seemed to conflict when I included both in the debug.html app). I don't know whether I'm doing something wrong; this was the cleanest solution I found.
- I tried to organize the benchmark code to facilitate both browser and npm CLI execution, but it makes so many things awkward that I'm inclined to drop it.
- I've added the benchmark.js and lodash packages to the project, and things work fine via the npm CLI execution. However, in debug.html I'm including them from a CDN URL, and it feels like I should be pointing at something local. I need advice on the right way to solve this; I don't want just another hard-coded copy in the scripts.
- I don't know what to do about the pnpm-lock.yaml changes; they seem to be noise, but I'm not sure.
More Tests Needed
- documents with long history (a rough fixture sketch follows this list)
- impact of compaction settings
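For the long-history case, the fixture could be as simple as repeatedly putting the same `_id` so reads have to resolve a deep update chain. The helper below is an assumption about how such a test might be seeded, not code from this PR; the helper name, db name, and update count are all made up.

```ts
// Illustrative fixture for a "documents with long history" benchmark.
// The helper name, db name, and update count are assumptions.
import { fireproof } from '@fireproof/core'

async function buildLongHistoryDb(updates = 1000) {
  const db = fireproof(`bench-long-history-${updates}`)
  for (let i = 0; i < updates; i++) {
    // Reusing the same _id so each put extends the document's history.
    await db.put({ _id: 'hot-doc', count: i })
  }
  return db
}

// The timed portion would then be something like `await db.get('hot-doc')`,
// registered as a deferred test in the same way as the earlier sketch.
```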
Cool Ideas for the future
- Dump results of test execution into yet another Fireproof instance (a sketch follows this list)
- Could build out minimal app to compare against previous runs, etc
- Add ability for us/others to share these via sync
- Have hosted version for people to run themselves, submit data, see visualizations, etc
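A minimal sketch of the first idea, assuming benchmark.js cycle events and a made-up results database name and document shape:

```ts
// Sketch of writing each completed benchmark cycle into a Fireproof database,
// so runs can be compared or synced later. Names and doc shape are made up.
import Benchmark from 'benchmark'
import { fireproof } from '@fireproof/core'

const results = fireproof('bench-results') // assumed results db name

// Attach with `suite.on('cycle', recordCycle)` on the suite from the earlier sketch.
export function recordCycle(event: Benchmark.Event) {
  const bench = event.target as Benchmark
  void results.put({
    name: bench.name,
    hz: bench.hz,                       // ops/sec
    rme: bench.stats.rme,               // relative margin of error (%)
    samples: bench.stats.sample.length, // number of samples collected
    ranAt: new Date().toISOString(),
  })
}
```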
Current Status
- Works enough to illustrate the direction I'm headed and get feedback from others
- Run it via CLI
  - From packages/fireproof run `pnpm bench`
- Run it via browser
  - From packages/fireproof run `pnpm serve`
  - Hit localhost:8080 and navigate to bench.html
Just FYI: people I trust to make the right decisions are using https://github.com/tinylibs/tinybench for benchmarking, which according to them has fewer issues with async. So it might be worth a look.
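For comparison, the same kind of put test in tinybench looks roughly like this; async task functions are simply awaited, with no deferred object (the db name and timing option are illustrative):

```ts
// Rough tinybench equivalent of the earlier put test; names are illustrative.
import { Bench } from 'tinybench'
import { fireproof } from '@fireproof/core'

const bench = new Bench({ time: 500 }) // sample each task for ~500ms
const db = fireproof('bench-put-tinybench')

// Async task functions are awaited directly, no deferred object needed.
bench.add('put one small doc', async () => {
  await db.put({ hello: 'world' })
})

await bench.run()
console.table(bench.table())
```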
Moving this to draft again, as it requires a revisit.
@mschoch it'd be cool to see this dusted off as part of a validation suite for the next release. I think the big validations we want are about package size and compatibility, cold start time with a small dataset, and write speed with a 20k record dataset.
This could be worth bringing across the line now. Perhaps we can run a standard bench across 18 and 19 and see what the differences are in terms of bundle size contribution, cold start time, and writes to a medium-sized database. Could make a blog post, @meghansinnott.
I think we can close this @jchris. The approach I took involved a custom connector, and that entire interface has changed, so it would need to be rewritten. I suspect the needs have changed and a different approach may make more sense.