Default fetching the site list to only visible sites
Part of https://linear.app/a8c/issue/DOTCOM-13609/reduce-e2e-timeouts-due-to-mesites
Proposed Changes
Fetching the sites list from /me/sites can take a long time, depending on the fields selected. Only requesting deleted sites when they are explicitly desired should speed things up.
> [!IMPORTANT]
> This is just an untested end-of-Friday draft. There are other places that access the endpoint which will need changing too, and there might be some places that don't explicitly ask for "all sites" but need it.
> This first commit is just a naive attempt to get the ball rolling and see what breaks.
Why are these changes being made?
Fetching data from /me/sites only to filter out deleted sites (or, confusingly, include them) is inefficient. It's likely better to default to fetching only non-deleted sites and make requesting deleted sites an explicit decision.
Besides being a performance improvement, this should stop the flakiness of some e2e tests which use a user that has thousands of deleted sites.
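Roughly what I have in mind, as a minimal sketch: the option names here are illustrative, not the actual Calypso code, and treating deleted sites as an explicit opt-in value is an assumption. Only `site_visibility=visible` is something the endpoint is known to accept.

```typescript
// Sketch only: names are illustrative, not the real Calypso helpers.
type SiteVisibility = 'visible' | 'all' | 'deleted';

function buildMeSitesQuery( visibility: SiteVisibility = 'visible', fields?: string[] ): string {
	// Default every caller to visible sites; deleted sites become opt-in.
	const params = new URLSearchParams( { site_visibility: visibility } );
	if ( fields?.length ) {
		params.set( 'fields', fields.join( ',' ) );
	}
	return `/rest/v1.2/me/sites?${ params.toString() }`;
}

// Default callers never see deleted sites:
buildMeSitesQuery( undefined, [ 'ID', 'name', 'URL' ] );
// Screens that genuinely need everything must now ask explicitly:
buildMeSitesQuery( 'all' );
```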
Testing Instructions
1. See if any unit tests or e2e tests break.
2. Click through important screens, comparing staging and this branch; note any differences in behaviour and whether each difference is desirable.
3. Get more opinions.
Pre-merge Checklist
- [ ] Has the general commit checklist been followed? (PCYsg-hS-p2)
- [ ] Have you written new tests for your changes?
- [ ] Have you tested the feature in Simple (P9HQHe-k8-p2), Atomic (P9HQHe-jW-p2), and self-hosted Jetpack sites (PCYsg-g6b-p2)?
- [ ] Have you checked for TypeScript, React or other console errors?
- [ ] Have you tested accessibility for your changes? Ensure the feature remains usable with various user agents (e.g., browsers), interfaces (e.g., keyboard navigation), and assistive technologies (e.g., screen readers) (PCYsg-S3g-p2).
- [ ] Have you used memoization on expensive computations? More info in Memoizing with create-selector, Using memoizing selectors, and Our Approach to Data.
- [ ] Have we added the "[Status] String Freeze" label as soon as any new strings were ready for translation (p4TIVU-5Jq-p2)?
- [ ] For UI changes, have we tested the change in various languages (for example, ES, PT, FR, or DE)? The length of text and words vary significantly between languages.
- [ ] For changes affecting Jetpack: Have we added the "[Status] Needs Privacy Updates" label if this pull request changes what data or activity we track or use (p4TIVU-aUh-p2)?
| Jetpack Cloud live (direct link) |
| --- |
| https://calypso.live?image=registry.a8c.com/calypso/app:build-148696&env=jetpack |

| Automattic for Agencies live (direct link) |
| --- |
| https://calypso.live?image=registry.a8c.com/calypso/app:build-148696&env=a8c-for-agencies |
Here is how your PR affects size of JS and CSS bundles shipped to the user's browser:
Sections (~6 bytes removed 📉 [gzipped])
| name | parsed_size | gzip_size |
| --- | --- | --- |
| staging-site | -68 B (-0.0%) | -6 B (-0.0%) |
| sites-dashboard | -68 B (-0.0%) | -6 B (-0.0%) |
| site-settings | -68 B (-0.0%) | -6 B (-0.0%) |
| site-performance | -68 B (-0.0%) | -6 B (-0.0%) |
| site-monitoring | -68 B (-0.0%) | -6 B (-0.0%) |
| site-logs | -68 B (-0.0%) | -6 B (-0.0%) |
| plans | -68 B (-0.0%) | -6 B (-0.0%) |
| overview | -68 B (-0.0%) | -6 B (-0.0%) |
| hosting | -68 B (-0.0%) | -6 B (-0.0%) |
| github-deployments | -68 B (-0.0%) | -6 B (-0.0%) |
| domains | -68 B (-0.0%) | -6 B (-0.0%) |
Sections contain code specific to a given set of routes. It is downloaded and parsed only when a particular route is navigated to.
Legend
What is parsed and gzip size?
**Parsed Size:** Uncompressed size of the JS and CSS files. This much code needs to be parsed and stored in memory.
**Gzip Size:** Compressed size of the JS and CSS files. This much data needs to be downloaded over the network.
Generated by performance advisor bot at iscalypsofastyet.com.
I pushed https://github.com/Automattic/wp-calypso/pull/104220/commits/e0941a20149c395ddf3dec80f75ddc936203457c to filter out the deleted sites at the endpoint 🙂
This PR modifies the release build for the following Calypso Apps:
For info about this notification, see here: PCYsg-OT6-p2
- notifications
- odyssey-stats
- wpcom-block-editor
To test WordPress.com changes, run `install-plugin.sh $pluginSlug dotcom-13609-reduce-e2e-timeouts-due-to-mesites` on your sandbox.
Thanks @arthur791004, that's amazing.
I've been testing "regular flows" this afternoon, and everything I've tested seems to work as expected 🤔 I'm going to dig in tomorrow and see if there are any issues in edge cases, rarer flows, Jetpack Cloud, or A4A.
A4A only lists WoA sites, so it doesn't care about deleted sites and already doesn't show them or have a way to filter them 👍
I feel a bit cautious, like I must be missing something, but I can't find any issues.
Thought about it overnight, and 'allSites' didn't mean 'every site' before; it excluded archived and spammed sites, etc.
Agreed. I believe allSites should refer to all "active" or "visible" sites by default. Most users wouldn’t expect to see deleted sites included. Since the main sites list already excludes deleted sites, I think it’s reasonable to apply the same logic consistently across the app.
For example, the Select Sites modal doesn’t show hidden sites by default, so it makes sense to also hide deleted sites there.
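As a rough illustration of that consistency, the predicate could look something like this; the `visible` and `is_deleted` field names are assumptions about the site object shape, not verified against the API:

```typescript
// Hypothetical site shape; `visible` and `is_deleted` are assumed field names.
interface Site {
	ID: number;
	name: string;
	visible: boolean;
	is_deleted?: boolean;
}

// The Select Sites modal already hides non-visible sites; extending the same
// predicate to deleted sites keeps the behaviour consistent app-wide.
function selectableSites( sites: Site[] ): Site[] {
	return sites.filter( ( site ) => site.visible && ! site.is_deleted );
}
```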
And the response now includes only 21 sites (it previously included 191 sites, so I guess I had a lot of deleted sites 😄).
Actually, I only have 1 deleted site. The rest of the sites that are now missing are P2 sites, so this seems to break the P2s tab:
That's weird, the P2s tab works for me and I can see that P2s are included in the requests to https://public-api.wordpress.com/rest/v1.2/me/sites?http_envelope=1&site_visibility=visible&include_domain_only=true&site_activity=active&fields=[blahblahblah]&options=[blahblahblah]
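For reference, here's a quick way one could check P2 coverage in a response like that; using `options.is_wpforteams_site` as the P2 marker is my assumption, and this relies on a logged-in WordPress.com session:

```typescript
// Assumed response shape for /me/sites; only the fields used here are typed.
interface SiteSummary {
	ID: number;
	options?: { is_wpforteams_site?: boolean };
}

async function countP2Sites( query: string ): Promise< number > {
	const res = await fetch( `https://public-api.wordpress.com${ query }`, {
		credentials: 'include', // relies on an existing WordPress.com session
	} );
	const { sites } = ( await res.json() ) as { sites: SiteSummary[] };
	// Count sites flagged as P2s (assumption: the is_wpforteams_site flag).
	return sites.filter( ( site ) => site.options?.is_wpforteams_site ).length;
}
```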
The visibility is per-user and comes from the blog_visibility usermeta. There’s a whole bunch of stuff setting that fbhepr%2Sf%3Sersf%3Qfrg_oybt_ivfvovyvgl%26cebwrpg%3Qjcpbz-og but I don’t really know what most of them are.
Nevertheless I do get 4 pages of results in production and 1 page of results on this branch, so it is a difference.
It does appear to be P2-specific; I get the same number of regular sites in prod and on this branch.
I don't know how important a difference this is, but we can probably just make the P2 list do an 'all' fetch.
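Something like this, where the P2s tab alone opts back into the wider fetch while every other caller keeps the new visible-only default; the 'all' value is inferred from this thread, not verified against the endpoint:

```typescript
// Sketch: the P2s tab explicitly asks for all sites; other callers keep
// the new visible-only default.
const p2Params = new URLSearchParams( {
	site_visibility: 'all', // opt back into the pre-change behaviour
	fields: 'ID,name,URL,options',
} );
const p2SitesQuery = `/rest/v1.2/me/sites?${ p2Params.toString() }`;
```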
@dsas What is it about these queries that makes this endpoint so slow? Can you dive into the SQL? And what's the plan since this has been reverted?
@m It's the number of sites that makes it slow, with the fields requested also making a difference. The average request completes in under 30ms, but for people with e.g. 50-100 sites it takes between 3 and 8 seconds on average, depending in part on which fields are requested.
The endpoint works by processing each site individually: running multiple DB queries, fetching from cache, and performing some processing per site, rather than retrieving data in bulk.
Some SQL queries are heavily repeated and could be optimised or eliminated. However, even if all SQL queries were instant, there would still be non-trivial time spent on processing and cache access (based on sandbox testing; this may differ in production), simply due to the volume of operations.
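To make the shape of the cost concrete, here's a schematic contrast; this is a TypeScript stand-in for the server-side code, and `fetchSiteRecord`/`fetchSiteRecords` are placeholders, not real functions:

```typescript
// Placeholders standing in for "one DB/cache round trip" vs "one batched query".
declare function fetchSiteRecord( id: number ): Promise< unknown >;
declare function fetchSiteRecords( ids: number[] ): Promise< unknown[] >;

// Current shape: per-site queries, cache fetches, and processing.
// Cost grows linearly with the site count, even when each step is fast.
async function perSiteAssembly( siteIds: number[] ): Promise< unknown[] > {
	const results: unknown[] = [];
	for ( const id of siteIds ) {
		results.push( await fetchSiteRecord( id ) ); // N sites => N round trips
	}
	return results;
}

// Bulk shape: fixed costs are paid once and amortised across the list.
async function bulkAssembly( siteIds: number[] ): Promise< unknown[] > {
	return fetchSiteRecords( siteIds ); // one batched query + in-memory assembly
}
```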
Several teams have already iterated on this endpoint’s performance over the years. Currently, Team Lego are investigating an alternative approach: pgz0xU-3I-p2. A change in approach may give more juice than squeezing the current one harder. If it doesn't work out, I can talk to Lego + PerfOps and squeeze some more.