js-framework-benchmark Anomalous "script bootup time" for lit-v2.0.0-rc.1?

We looked into an unexpected regression in "script bootup time" between lit-element-v2.4.0 and lit-v2.0.0-rc.1 that shows up in the (as yet unpublished) version of table.html on master, but we cannot reproduce it locally.

(Because lit-element-v2.4.0 was replaced with lit-v2.0.0-rc.1, the comparison below uses lit-html as a stable baseline.)

On master:

Local:

Would it be possible to re-run/re-check the results before publishing?

May 20 '21 17:05 kevinpschaaf

Sure. For the startup benchmark I rely on lighthouse and perform 4 runs. I just repeated it for lit on my Linux Razer Blade machine and got the following results:

[
  {
    TimeToConsistentlyInteractive: 2182.092,
    ScriptBootUpTtime: 16,
    MainThreadWorkCost: 216.2119999999999,
    TotalKiloByteWeight: 163.5859375
  },
  {
    TimeToConsistentlyInteractive: 2360.6275000000005,
    ScriptBootUpTtime: 73.11599999999999,
    MainThreadWorkCost: 442.18399999999997,
    TotalKiloByteWeight: 163.5859375
  },
  {
    TimeToConsistentlyInteractive: 2182.0650000000005,
    ScriptBootUpTtime: 16,
    MainThreadWorkCost: 245.80799999999994,
    TotalKiloByteWeight: 163.5859375
  },
  {
    TimeToConsistentlyInteractive: 2180.868,
    ScriptBootUpTtime: 16,
    MainThreadWorkCost: 220.63199999999992,
    TotalKiloByteWeight: 163.5859375
  }
]

In this case the runs give a mean of 30.28, a median of 16 and a standard deviation of about 29 for the bootup time. So I'd currently say that in my run (41.9) there were two slow outliers, which pushed the median above 16 ms. In your run the standard deviation is about 17, so I guess there were three 16 ms runs and one outlier.
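The figures above can be checked with a few lines of TypeScript. This is only a minimal sketch, not the benchmark driver's code: the helper names are hypothetical, and it assumes the sample standard deviation (dividing by n - 1), which is the variant that comes out close to the quoted value of about 29.

```typescript
// Script bootup times (ms) from the four lighthouse runs listed above.
const bootupTimes: number[] = [16, 73.11599999999999, 16, 16];

// Arithmetic mean.
function mean(xs: number[]): number {
  return xs.reduce((a, b) => a + b, 0) / xs.length;
}

// Median; averages the two middle values for an even-sized sample.
function median(xs: number[]): number {
  const sorted = [...xs].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 0
    ? (sorted[mid - 1] + sorted[mid]) / 2
    : sorted[mid];
}

// Sample standard deviation (n - 1 in the denominator). Assumption: this is
// the variant behind the quoted figure; the thread does not say which is used.
function sampleStdDev(xs: number[]): number {
  const m = mean(xs);
  const sumSquares = xs.reduce((acc, x) => acc + (x - m) ** 2, 0);
  return Math.sqrt(sumSquares / (xs.length - 1));
}

console.log(mean(bootupTimes).toFixed(2));         // 30.28
console.log(median(bootupTimes));                  // 16
console.log(sampleStdDev(bootupTimes).toFixed(1)); // 28.6
```

A single slow run out of four leaves the median at 16 but moves the mean and the standard deviation a lot, which is why the median only rises once at least two of the four runs are slow.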

Can you please post the file webdriver-ts/results/lit-v2.0.0-rc.1-keyed_32_startup-bt.json?

Most frameworks have a standard deviation of 0 in the script bootup time. Notable exceptions with a higher standard deviation are lit, most react implementations and most react-redux implementations. This raises the question of why only some frameworks show a significant standard deviation. Are those outliers real, or just a wrong measurement?

May 20 '21 18:05 krausest

I tried running lighthouse on the command line and comparing the results:

lighthouse http://localhost:8080/frameworks/keyed/lit/index.html --output=json --only-audits=bootup-time | less

The results are not really comparable to what the benchmark driver reports, but they also show some noise (between 62 and 112 ms on my machine). Please note that values below 16 ms are clamped to 16 ms.

Without clamping, the results look like this:

******* result  [
  {
    TimeToConsistentlyInteractive: 2031.44,
    ScriptBootUpTtime: 6,
    MainThreadWorkCost: 214.15599999999995,
    TotalKiloByteWeight: 163.5869140625
  },
  {
    TimeToConsistentlyInteractive: 2180.463,
    ScriptBootUpTtime: 6.359999999999999,
    MainThreadWorkCost: 213.55999999999995,
    TotalKiloByteWeight: 163.5869140625
  },
  {
    TimeToConsistentlyInteractive: 2180.4045,
    ScriptBootUpTtime: 7.9319999999999995,
    MainThreadWorkCost: 221.7479999999999,
    TotalKiloByteWeight: 163.583984375
  },
  {
    TimeToConsistentlyInteractive: 2180.7870000000003,
    ScriptBootUpTtime: 6.5680000000000005,
    MainThreadWorkCost: 218.65199999999993,
    TotalKiloByteWeight: 163.5849609375
  }
]

But also like this:

[
  {
    TimeToConsistentlyInteractive: 2181.4845,
    ScriptBootUpTtime: 8.395999999999999,
    MainThreadWorkCost: 213.74799999999993,
    TotalKiloByteWeight: 163.5849609375
  },
  {
    TimeToConsistentlyInteractive: 2368.5159999999996,
    ScriptBootUpTtime: 78.04799999999999,
    MainThreadWorkCost: 444.8519999999999,
    TotalKiloByteWeight: 163.5849609375
  },
  {
    TimeToConsistentlyInteractive: 2181.4485,
    ScriptBootUpTtime: 6.728000000000001,
    MainThreadWorkCost: 217.408,
    TotalKiloByteWeight: 163.5869140625
  },
  {
    TimeToConsistentlyInteractive: 2180.1975,
    ScriptBootUpTtime: 7.9159999999999995,
    MainThreadWorkCost: 222.19199999999992,
    TotalKiloByteWeight: 163.5869140625
  }
]
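To illustrate how the clamp hides the small run-to-run differences while leaving a large outlier visible, here is a minimal sketch. It assumes the clamp is a plain floor at 16 ms; the names `FLOOR_MS` and `clampBootup` are hypothetical and not taken from the benchmark driver.

```typescript
// Assumed clamp: any script bootup time below 16 ms is reported as 16 ms.
const FLOOR_MS = 16;

function clampBootup(ms: number): number {
  return Math.max(FLOOR_MS, ms);
}

// Unclamped script bootup times (ms) from the second set of runs above.
const unclamped: number[] = [
  8.395999999999999,
  78.04799999999999,
  6.728000000000001,
  7.9159999999999995,
];

const clamped = unclamped.map(clampBootup);
console.log(clamped); // [ 16, 78.04799999999999, 16, 16 ]
```

With the floor applied, the three fast runs all collapse to exactly 16, which would explain a standard deviation of 0 whenever no outlier occurs.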

May 20 '21 19:05 krausest

At least temporarily, I removed the 16 ms cap on script bootup time and reran all implementations. This time the variance was pretty low for lit: https://krausest.github.io/js-framework-benchmark/current.html .

May 22 '21 13:05 krausest

Sorry for the delay. Here's my webdriver-ts/results/lit-v2.0.0-rc.1-keyed_32_startup-bt.json:

{
  "framework": "lit-v2.0.0-rc.1-keyed",
  "keyed": true,
  "benchmark": "32_startup-bt",
  "type": "startup",
  "min": 16,
  "max": 16,
  "mean": 16,
  "median": 16,
  "geometricMean": 16,
  "standardDeviation": 0,
  "values": [16, 16, 16, 16]
}

Yeah the big 10x outlier(s) are interesting; hard to evaluate where that's coming from.
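One way to make such outliers easier to spot in the raw values is to flag any run that exceeds a multiple of the median. This is a hedged sketch only: the function `flagOutliers` and the factor of 3 are arbitrary illustrative choices, not part of the benchmark driver.

```typescript
// Median; averages the two middle values for an even-sized sample.
function medianOf(xs: number[]): number {
  const sorted = [...xs].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 0
    ? (sorted[mid - 1] + sorted[mid]) / 2
    : sorted[mid];
}

// Returns the values larger than `factor` times the median.
// The default factor of 3 is an arbitrary, illustrative threshold.
function flagOutliers(xs: number[], factor = 3): number[] {
  const m = medianOf(xs);
  return xs.filter((x) => x > factor * m);
}

// Unclamped script bootup times (ms) from the run with one slow outlier.
const runs: number[] = [
  8.395999999999999,
  78.04799999999999,
  6.728000000000001,
  7.9159999999999995,
];

const outliers = flagOutliers(runs);
console.log(outliers); // [ 78.04799999999999 ]
console.log((outliers[0] / medianOf(runs)).toFixed(1)); // 9.6
```

For these values the slow run is roughly 9.6 times the median, which matches the "10x" impression.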

May 25 '21 16:05 kevinpschaaf
