
Improve handling of redirects, navStart

Open brendankenny opened this issue 6 years ago • 4 comments

This came up in the context of a Lighthouse run on WPT (wpt) with 100% for the perf category but failing the load-fast-enough-for-pwa audit. This page loads a very large JS payload, then does client detection and redirects mobile browsers to a very lightweight "unsupported" page.

It turns out that with both devtools and provided throttling (what WPT uses), this site has a TTI of 0.6s, because in those modes we measure from the last navStart (the redirect to the "unsupported" page) to TTI, not from the initial navigation. Simulated throttling measures from the first document request, so it shows a slow TTI for the page (this is why the WPT result shows both a fast and a slow TTI: load-fast-enough-for-pwa uses lantern TTI when throttling is provided).

There are a few things happening here that need to be handled:

  • we don't track JS redirects with our finalUrl system. This means the LH report doesn't reflect the URL you actually end up on, you don't get the audit warning suggesting you try the finalUrl instead of the requestedUrl (since the redirect could have been affecting your results), you don't get redirect info in the redirects opportunity, etc.

    The network protocol messages don't include JS redirect info, so we'll have to look elsewhere for the signal. Initiator info does look correct, however.

  • Be consistent on a starting point for metrics and timestamps. Currently for devtools and provided throttling we use the last navStart with our frame ID (in this case the "unsupported" page), but for lantern we use the first document requested (in this case the initial heavyweight page that redirects to the "unsupported" one).

    We include redirect information in an opportunity, but non-lantern runs try to measure the last page load. We should decide whether we want to count these sometimes super-slow redirects against the perf metrics. Basically, are these perf metrics for the requestedUrl or for the finalUrl (assuming we fix finalUrl to account for JS redirects)?

  • Need to split NetworkAnalyzer.findMainDocument into two methods (or something like that) :) All the lantern stuff uses it without a finalUrl, at which point it looks for the first document request in the network records. If you do pass it a finalUrl, however, it finds the finalUrl document, even though that's ostensibly the last document request (in this particular case it works out to the same thing, but in the case of server redirects, lantern metrics and audits relying on the main-resource computed artifact will have different main resources)
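For the first item, one hypothetical way to mine the initiator info for the JS-redirect signal would be to chain main-frame document requests whose initiator is a script on the current page. The record shape below is a simplified assumption for illustration, not Lighthouse's actual NetworkRequest shape:

```javascript
// Hypothetical sketch: walk network records and treat a document request
// whose initiator is a script from the current page as a JS redirect,
// chaining those to recover the real finalUrl. Record fields (url,
// resourceType, initiator) are simplified assumptions.
function resolveFinalUrl(records, requestedUrl) {
  const chain = [requestedUrl];
  let current = requestedUrl;
  let found = true;
  while (found) {
    found = false;
    for (const record of records) {
      if (
        record.resourceType === 'Document' &&
        record.initiator.type === 'script' &&
        record.initiator.url === current &&
        !chain.includes(record.url)
      ) {
        chain.push(record.url);
        current = record.url;
        found = true;
        break;
      }
    }
  }
  return {finalUrl: current, chain};
}
```

In the real protocol the script initiator carries a stack of call frames rather than a single page URL, so an actual implementation would have to map frames back to the initiating document.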

brendankenny avatar May 17 '19 21:05 brendankenny

For reference, I found that about 5% of pages in the latest HTTP Archive run have a ≥10s difference between the perf category TTI and the load-fast-enough-for-pwa TTI. It's possible there are other issues happening there, but I imagine this is the main one.

The inconsistency between audits in a single LHR is really only an issue for provided-throttling users like WPT, but we really should be consistent with our starting point between the different throttling methods as well (and finalUrl really should be correct :)

cc @slightlyoff who brought up the original bad case

brendankenny avatar May 17 '19 21:05 brendankenny

Need to fix NetworkAnalyzer.findMainDocument :) All the lantern stuff uses it without a finalUrl, at which point it looks for the first document request in the network records. If you do pass it a finalUrl, however, it finds the finalUrl document even though it's ostensibly the last document request

actually, I guess this is kind of on purpose (or at least working as desired) based on what audits are doing. Maybe we should split the method to make it clearer? We almost want a findMainRequestedDocument and findMainFinalDocument :)
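A rough sketch of what that split might look like, using the two names floated above. This is an illustrative assumption, not Lighthouse's actual implementation; the record shape is simplified:

```javascript
// Hypothetical split of NetworkAnalyzer.findMainDocument into two
// explicitly named helpers. Records are assumed to have url,
// resourceType, and startTime fields (simplified).
class NetworkAnalyzer {
  // The first document request: the page the user asked for
  // (what lantern effectively uses when no finalUrl is passed).
  static findMainRequestedDocument(records) {
    const documents = records
      .filter(r => r.resourceType === 'Document')
      .sort((a, b) => a.startTime - b.startTime);
    if (!documents.length) throw new Error('No document request found');
    return documents[0];
  }

  // The document matching finalUrl: the page the user ended up on
  // after any redirects (what audits get when they pass a finalUrl).
  static findMainFinalDocument(records, finalUrl) {
    const doc = records.find(
      r => r.resourceType === 'Document' && r.url === finalUrl
    );
    if (!doc) throw new Error('Unable to find final document request');
    return doc;
  }
}
```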

brendankenny avatar May 17 '19 21:05 brendankenny

so we'll have to look elsewhere for the signal

IMO, we should just use evaluateAsync('window.location.href') like we do in DevTools to get the URL to audit. Fair warning, though: it will likely end up picking up lots more fragments and other junk :)
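As a minimal illustration of the "fragments and other junk" problem, a hypothetical cleanup step before treating that href as the audited finalUrl (the function name and the decision to only strip the hash are assumptions for the sketch):

```javascript
// Hypothetical sketch: window.location.href can carry a '#fragment'
// that shouldn't count as a distinct finalUrl. Strip it with the
// standard WHATWG URL API before comparing URLs.
function cleanFinalUrl(href) {
  const url = new URL(href);
  url.hash = ''; // drop '#fragment'
  return url.href;
}
```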

I guess this is kind of on purpose (or at least working as desired) based on what audits are doing

Correct, this is WAI. Though I agree it's confusing and possibly worth splitting :)

Now for the meat....

I've thought for a very long time that Lighthouse should be measuring the cost of these redirects and that the last navStart is not the correct time origin to be using (I even tried to sneak the change into PRs). It has caused several problems for us in the past, and this is just the latest one rearing its head. I'm fairly certain that no reasonable person would want to claim that the login page in the example is "very fast" when it took 8 seconds loading and executing a massive 1.3MB JS bundle just to be able to know where to redirect you.

I will take the blame for our inconsistencies here because all the places we don't use navStart are places I authored, knowing the inconsistency, and just failed to be convincing enough for us to change our behavior elsewhere.

patrickhulce avatar May 17 '19 22:05 patrickhulce

Discussed and we're going to prioritize the 2nd of the 3 items.

Implication: for non-lantern runs the metrics should include the cost of any redirects.

paulirish avatar Oct 24 '19 17:10 paulirish