FrameworkBenchmarks
Randomize Framework order in runs
Currently the benchmarks are run in the same order every time.
Sometimes a run fails after a number of frameworks have already been benchmarked, or the run is restarted.
This causes frameworks starting with `a` to have more test runs than frameworks starting with `z`.
If the order were randomized, the number of runs would be distributed more evenly.
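For illustration, here's a minimal sketch of the randomization idea (not the actual toolset code); `frameworks.txt` and the per-framework invocation are hypothetical stand-ins for however the runner collects and executes the test list:

```bash
#!/bin/bash
# Hypothetical sketch: shuffle the framework list before handing it to the runner.
# Assumes the frameworks selected for a run are listed one per line in frameworks.txt.
shuf frameworks.txt > run-order.txt
while read -r framework; do
  ./tfb --test "$framework"   # hypothetical per-framework invocation
done < run-order.txt
```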
As someone maintaining a benchmark that starts with `x`, I feel this.
That said, IMO a fair way of handling order is to prioritize benchmarks with the most recent changes. As for benchmarks that haven't changed in a while, I'd guess their maintainers care less about continuous run results.
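A rough sketch of that recency ordering, assuming the repository's `frameworks/<Language>/<name>/` layout and using git commit timestamps (the real runner discovers tests differently):

```bash
#!/bin/bash
# Hypothetical sketch: order framework directories by their most recent
# git commit, newest first, and write the result to a run-order file.
for dir in frameworks/*/*/; do
  echo "$(git log -1 --format=%ct -- "$dir") $dir"
done | sort -rn | awk '{print $2}' > run-order.txt
```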
I think it is enough to start one run with `a` and the next with `z`. And perhaps we would still see some differences in the results.
Between one run and the next, the servers, databases, etc. may change, so the change applies to all frameworks; it doesn't depend on changes within the frameworks themselves. A mature framework needs fewer changes than a young one. Besides, we can still benchmark locally to test small changes.
What I maintain starts with `u`, so I'm heavily biased here, but I would also appreciate this change being implemented.
My concern is not about failures or restarts, as they usually don't happen that often when the environment is stable, but rather about a feedback latency: I mostly use TFB as a measurement tool (and a big shout-out to TE crew for providing that tool), and given a hypothetical performance drop in the ongoing run, I'm left with approx. a day to squeeze a potential fix into the next measurement, and a failure to do so would lead to a feedback latency of two full weeks (every run is approx. a week). Moreover, any dependency bump I do is at least a week (an almost full run) in terms of feedback latency, and 1.5 weeks on average.
Flipping the order between runs (or FWIW randomizing it) would significantly reduce these latencies for me.
The frameworks that sit in the middle have ~3 days to make changes; for them it's the same whether the run begins with `a` or in reverse order. The problem is for the frameworks that come last in the run. Please don't randomize: right now we know roughly when the results for our framework will appear. But we do need to flip the order on every new run!
Flipping the order each time makes sense to me.
I like the idea of flipping the order. I'm just getting back from vacation and catching up on a bunch of stuff. Let's get the environment stable, and then I think this is easy to do. Will leave this open until we get it in.
After the last full run finished, the next run did not flip the order.
That's because the tfb-startup.sh script runs tfb-shutdown.sh on startup; the latter is responsible for flipping the order. Is changing the order only after an unsuccessful run by design?
I think the following run was reversed: https://tfb-status.techempower.com/results/3c2e9871-9c2a-4ff3-bc31-620f65da4e74. The “last framework” tested is incorrect though.
> That's because the tfb-startup.sh script runs tfb-shutdown.sh on startup; the latter is responsible for flipping the order. Is changing the order only after an unsuccessful run by design?
No, I forgot that we actually run the shutdown script twice after a successful run because it's being called from the startup script as well. The design was supposed to be the exact opposite. I'll have to move it to the startup script and it will just reverse every time a run starts.
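For what it's worth, a hedged sketch of that fix: toggle a persisted flag from tfb-startup.sh so it flips exactly once per run start. The flag path and file name are assumptions, not the actual script contents:

```bash
#!/bin/bash
# Hypothetical excerpt for tfb-startup.sh: flip a persisted order flag once
# per run start, instead of in tfb-shutdown.sh (which can run twice after a
# successful run). The flag path is an assumption.
FLAG=/var/lib/tfb/reverse-order
if [ -f "$FLAG" ]; then
  rm -f "$FLAG"     # this run uses forward (alphabetical) order
else
  touch "$FLAG"     # this run uses reversed order
fi
# Later, the runner would reverse its list whenever the flag exists, e.g.:
# [ -f "$FLAG" ] && tac run-order.txt > reversed.txt && mv reversed.txt run-order.txt
```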
@NateBrady23 It looks like the opposite is now happening: the order is always reversed, i.e. the implementations starting with `Z` run first.
How about adding an option to run tests in the order of their last execution time, from fastest to slowest? 😈
I am pretty sure that that approach would end up being effectively the same as running them in alphabetical order (or in random order at best), at the cost of significant implementation complexity.