
Timespan Third Party Filtering

Open adamraine opened this issue 3 years ago • 2 comments

In our report we use the origin of finalUrl/finalDisplayedUrl to determine what is third party.

In timespan mode this origin can change so some URLs might get incorrectly categorized as third party or first party.

We should either design a better way to determine third party requests, or disable the third party filtering in timespan.

adamraine avatar Aug 08 '22 21:08 adamraine

The obvious solution is to consider a URL to be first party if any of the URLs match in origin, and third party otherwise. Any downsides there?
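A minimal sketch of that rule, for illustration only. `originOf`, `isFirstParty`, and the `visitedUrls` parameter are hypothetical names, not existing Lighthouse APIs; the sketch assumes a list of main-document URLs seen during the timespan is available.

```typescript
// Hypothetical helper, not part of Lighthouse: returns the origin of a URL,
// or undefined when the URL cannot be parsed or has an opaque origin.
function originOf(url: string): string | undefined {
  try {
    const origin = new URL(url).origin;
    // Schemes such as data: and blob-less opaque URLs serialize to "null".
    return origin === 'null' ? undefined : origin;
  } catch {
    return undefined;
  }
}

// A request is first party if its origin matches the origin of ANY visited
// URL, and third party otherwise. `visitedUrls` is an assumed input.
function isFirstParty(requestUrl: string, visitedUrls: string[]): boolean {
  const requestOrigin = originOf(requestUrl);
  if (requestOrigin === undefined) return false;
  return visitedUrls.some((visited) => originOf(visited) === requestOrigin);
}

// Example: a timespan that visited two origins.
const visited = ['https://a.example/', 'https://b.example/page'];
const sameOriginAsFirstPage = isFirstParty('https://a.example/app.js', visited);
const unrelatedCdn = isFirstParty('https://cdn.example/lib.js', visited);
```

Under this rule every request to either visited origin counts as first party, regardless of which page issued it.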

connorjclark avatar Aug 09 '22 02:08 connorjclark

> any of the URLs

We would need a visitedUrls list or something.

> Any downsides there?

During a timespan, if the user goes from originA to originB, and originB requests some resources under originA, are those resources considered third party? From originB's perspective they are, but from the user's perspective they might not be.

It would be better than what we have now though.

adamraine avatar Aug 11 '22 17:08 adamraine

I just ran into this issue while implementing entity classification in the third-party-summary audit. The issue I'm seeing with timespan is that it allows cross-origin navigations. E.g., I could do a timespan of navigating to paulirish.com and then click on the Twitter link on the right rail, navigating to the twitter.com origin. In this case, a report is still generated, but the first-party/third-party distinction doesn't make much sense in that context.

We could either (1) restrict timespan to same-origin navigations, or (2) treat each navigation as independent (navigations[]) instead of squashing everything into one trace/devtoolsLogs/score, i.e., giving each navigation its own first party and its own set of third parties.

The more I think about it, the more I'm leaning towards option 1.

alexnj avatar Dec 12 '22 22:12 alexnj

Alternatively, we could implicitly create multiple timespans each time the origin changes, so each timespan would still make sense and the overall report is still usable. This should also work better with Flows, and we're neither restricting timespans to an origin nor abruptly ending a timespan when a navigation to a new origin is detected.
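A rough sketch of that splitting, under stated assumptions: `NavigationEvent`, `Segment`, and `splitByOrigin` are hypothetical names (Lighthouse's real artifacts are shaped differently), and the navigations are assumed to be main-frame navigations already sorted by timestamp.

```typescript
// Hypothetical shapes for illustration; not Lighthouse's actual artifacts.
interface NavigationEvent {
  url: string;
  timestamp: number;
}

interface Segment {
  origin: string; // First-party origin for this implicit timespan.
  start: number;
  end: number;
}

// Splits one recorded timespan into consecutive segments, starting a new
// segment whenever a navigation lands on a different origin.
// Assumes `navigations` is sorted by ascending timestamp.
function splitByOrigin(navigations: NavigationEvent[], timespanEnd: number): Segment[] {
  const segments: Segment[] = [];
  for (const nav of navigations) {
    const origin = new URL(nav.url).origin;
    const current = segments[segments.length - 1];
    // Same-origin navigation: the current segment simply continues.
    if (current && current.origin === origin) continue;
    // Origin changed: close the previous segment where the new one starts.
    if (current) current.end = nav.timestamp;
    segments.push({origin, start: nav.timestamp, end: timespanEnd});
  }
  return segments;
}

// Example: two pages on one origin, then a cross-origin navigation.
const segments = splitByOrigin(
  [
    {url: 'https://a.example/', timestamp: 0},
    {url: 'https://a.example/about', timestamp: 10},
    {url: 'https://b.example/', timestamp: 20},
  ],
  40
);
```

Each resulting segment has a single origin, so requests falling inside its time range could be classified against that origin alone.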

alexnj avatar Dec 13 '22 20:12 alexnj

Dupe of https://github.com/GoogleChrome/lighthouse/issues/14775

paulirish avatar Mar 13 '23 22:03 paulirish