lighthouse icon indicating copy to clipboard operation
lighthouse copied to clipboard

core: add `finalDisplayedUrl`, deprecate `finalUrl`, remove `initialUrl`

Open adamraine opened this issue 3 years ago • 8 comments

Alternative to #13819 that keeps finalUrl on the LHR

Closes https://github.com/GoogleChrome/lighthouse/issues/13706

adamraine avatar Jun 22 '22 22:06 adamraine

(sorry about this...... but I got a hefty pile of pushback on the details here)

In v9 we had just two urls exposed on the LHR. now there's 5. (one of which is the deprecated alias). But still.. 2 -> 4? 😢 I'm looking at the diff to sample-flow-result.json and scratching my head thinking.. do we really need all of these?? I'm also asking the question: "how do we expect our users to make sense of these?"

Why do we have both initialUrl from requestedUrl on the LHR? The only thing i'm seeing here is that on a FR navigation, if there's already a page open when the navigate starts, then it captures the URL we're about to navigate away from. Doesn't seem useful.... (I can imagine why we may want to keep track of these for ourselves, but not in our public API..)

I do not like mainDocumentUrl. :) This name doesn't work because we need to describe at what point this URL was recorded. the "main" url changes over time.

Also... a -1 on finalPageUrl. I don't think it differentiates itself well. i like the "display" vibe that was proposed. The "frame" vibe i also like, but I guess it seems slightly obtuse.

Nothing like a PR to kick the bikeshedding into high gear. :)

paulirish avatar Jul 08 '22 00:07 paulirish

Why do we have both initialUrl from requestedUrl on the LHR? The only thing i'm seeing here is that on a FR navigation, if there's already a page open when the navigate starts, then it captures the URL we're about to navigate away from. Doesn't seem useful.... (I can imagine why we may want to keep track of these for ourselves, but not in our public API..)

initialUrl is the most useless of the bunch, and this is a good argument for getting rid of it. However, I want to point out that in timespan and snapshot mode, there are only 2 URLs (initialUrl and finalNameTBDUrl).

For timespans in particular, initialUrl is useful conceptually because the end url doesn't necessarily represent every page lighthouse visited via vscode. This is particularly useful right now, but I could see us using initialUrl in some cosmetic ways in the future.

I do not like mainDocumentUrl. :) This name doesn't work because we need to describe at what point this URL was recorded. the "main" url changes over time.

Also... a -1 on finalPageUrl. I don't think it differentiates itself well. i like the "display" vibe that was proposed. The "frame" vibe i also like, but I guess it seems slightly obtuse.

Adding a needs discussion, we may need to take a vote on the names. FWIW I'm fine changing any of the names at this point, it should be an easy find and replace.

adamraine avatar Jul 08 '22 00:07 adamraine

However, I want to point out that in timespan and snapshot mode, there are only 2 URLs

Or there's 4 of them, and two have undefined as a value. 😛 Depends on... things.

If we used something like startingUrl, would that unite the two ideas?

Adding a needs discussion, we may need to take a vote on the names.

yup. good call

paulirish avatar Jul 08 '22 00:07 paulirish

Just popping up to say that a single string value mainDocumentUrl is misleading due to redirect chains. I'd kinda prefer that redirects are captured, because that tells an important story. 🚴‍♂️

benschwarz avatar Jul 08 '22 00:07 benschwarz

Just popping up to say that a single string value mainDocumentUrl is misleading due to redirect chains. I'd kinda prefer that redirects are captured, because that tells an important story. 🚴‍♂️

heh yeah. I've recently had to use a extra verbose postRedirectRequest name to help clarify. (because what is a "redirectedUrl", the before or the after??) deep thoughts.

paulirish avatar Jul 08 '22 00:07 paulirish

If we used something like startingUrl, would that unite the two ideas?

Not really. A navigation in the middle of a user flow (e.g. clicking a link) will "start" on the page with the link, but the "requested" url will be what the link has in its href. At least that's the way I think about it.

I don't see much of a difference between initialUrl and startingUrl tbh.

adamraine avatar Jul 08 '22 01:07 adamraine

I'm not clear on what the intent behind the URLs are, but maybe a visitedUrls[] array would meet the objectives of showing the initial URL, any subsequent navigations, and the final resting URL? Easier than naming any particular URL and shows the progression of any navigations that occurred.

benschwarz avatar Jul 08 '22 05:07 benschwarz

but maybe a visitedUrls[] array would meet the objectives of showing the initial URL, any subsequent navigations, and the final resting URL?

I could see this being useful for timespan mode (e.g. determining all the visited origins for 3p stuff), however navigation URL artifact still needs to know 2 specific urls for it's audits:

  • requestedUrl: The initially requested url before any redirects or history API changes
  • Name TBD, currently mainDocumentUrl: The final url the browser performed a hard navigation to / The url of the html document

I could see us doing a URL artifact like:

interface URL {
  // Initially requested url before any redirects
  requestedUrl?: string;
  // Url of the last hard navigation, actual name TBD
  mainDocumentUrl?: string;
  // Includes initial url and all subsequent navigations and history api changes
  visitedUrls: string[];
}

adamraine avatar Jul 08 '22 17:07 adamraine

DT tests are blocked on https://chromium-review.googlesource.com/c/devtools/devtools-frontend/+/3911354

adamraine avatar Sep 21 '22 23:09 adamraine