Slow page loading and asset metadata
Bug description
I've tried just about everything to improve the slow page loading on this site (disabling the watcher, selecting nav fields, etc.) and I seem to have narrowed the problem down to the high number of assets. The assets folder on this project has exactly 14464 asset files and, as a result, some of my pages take as long as 8 seconds to load. When I turn off `cache_meta`, the page takes 40 seconds to load.
Typically this isn't a problem with static caching but I've recently introduced Livewire on this project and every Livewire query is taking 2 to 3 seconds.
I understand that I should reduce the number of assets but this still seems like a major performance issue. Would it make any difference if I moved the assets to external storage like DO spaces or S3?
How to reproduce
You can test the Livewire page filters here as a demonstration of the slow requests: https://staging.burlingameproperties.com/properties
Logs
No response
Environment
aerni/livewire-forms: 4.0.0
jonassiewertsen/statamic-livewire: 2.9.0
mikemartin/helpscout-beacon: 1.0.2
spatie/statamic-responsive-images: 2.13.0
swiftmade/statamic-clear-assets: 1.1.0
webographen/statamic-widget-continue-editing: 1.0.1
withcandour/aardvark-seo: 2.0.28
Installation
Fresh statamic/statamic site via CLI
Antlers Parser
runtime (new)
Additional details
No response
> Would it make any difference if I moved the assets to external storage like DO spaces or S3?
No, it'd likely be worse.
I've been looking into some slowness on a site I've been working on and feel like it may be a similar issue. I'm only dealing with a few thousand assets, not tens of thousands, but the number of assets is the only thing I can attribute the issue to.
This could be nothing, but I've been doing some (very unscientific) testing using the Starter's Creek kit and have spotted something that may be of interest:
- I created a new site
- I tweaked the blog blueprint to have five additional asset fields
- I updated one of the entries with those new fields set
- I then replaced the `blog/show` template with one that outputs each of the five assets and records the times

With this setup, the output from the template is this (in milliseconds):
Initial: /assets/blinking-carot.gif 2048
4.7008991241455
Secondary: /assets/donut.jpg 2048
1.9650459289551
Secondary: /assets/idea.jpg 2048
1.7940998077393
Secondary: /assets/octopus.jpg 2048
2.2079944610596
Secondary: /assets/pizza-wifi.jpg 2048
2.9721260070801
Average Secondary:
2.2348165512085
So the initial image evaluation takes ~5ms, then subsequent ones take ~2ms. That makes perfect sense. On the first load I guess the index hasn't been loaded yet so that takes a bit longer.
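For anyone wanting to take comparable measurements, here is a minimal sketch of timing a single unit of work in PHP. It is not the template used for the numbers above; `timeMs()` and the stand-in workload are hypothetical names chosen for illustration.

```php
<?php

declare(strict_types=1);

/**
 * Run $work once and return [result, elapsed milliseconds].
 * hrtime(true) returns a monotonic timestamp in nanoseconds.
 */
function timeMs(callable $work): array
{
    $start = hrtime(true);
    $result = $work();
    $elapsedMs = (hrtime(true) - $start) / 1e6;

    return [$result, $elapsedMs];
}

// Hypothetical stand-in for augmenting one asset field.
[$value, $ms] = timeMs(fn () => strtoupper('/assets/donut.jpg'));
printf("%s %.3f ms\n", $value, $ms);
```

Timing the first lookup separately from the later ones, as in the output above, keeps the one-off cost of loading the index out of the per-asset average.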
I then filled the assets folder with 10,000 additional files, cleared the cache and warmed the stache. Try again:
Initial: /assets/blinking-carot.gif 2048
40.089845657349
Secondary: /assets/donut.jpg 2048
21.219968795776
Secondary: /assets/idea.jpg 2048
13.118982315063
Secondary: /assets/octopus.jpg 2048
13.496875762939
Secondary: /assets/pizza-wifi.jpg 2048
14.589071273804
Average Secondary:
15.606224536896
The initial image is slower, which still makes sense: a larger index is going to take longer to load. The curious thing is the subsequent images. They're all much slower as well, which I wouldn't really expect.
The problem seems to be something inside the `OrderedQueryBuilder` that's called in `Fieldtypes\Assets::augment()`. I've not got into the guts of that to figure out what's going on, but as a quick hack to test the theory I have replaced that method with one that just goes straight to the container rather than the query builder:
public function augment($values)
{
    $values = Arr::wrap($values);
    $assets = collect($values)->map(fn ($value) => $this->container()->makeAsset($value));

    return $this->config('max_files') === 1 ? $assets->first() : $assets;
}
Fetching the assets that way results in much faster times on the subsequent images:
Initial: /assets/blinking-carot.gif 2048
29.941082000732
Secondary: /assets/donut.jpg 2048
4.364013671875
Secondary: /assets/idea.jpg 2048
3.8208961486816
Secondary: /assets/octopus.jpg 2048
4.1608810424805
Secondary: /assets/pizza-wifi.jpg 2048
3.9041042327881
Average Secondary:
4.0624737739563
I know we're only talking milliseconds here, but I can see how this could add up with a container with tens of thousands of images and a page that outputs quite a few.
Demo Repo
Here's my testing repo: https://github.com/jacksleight/statamic-sandbox/tree/asset-testing
It's set up with the small assets collection, and you can view the timings at the `/pocket` URL. I tested with the stache watcher off.
To test it with 1000s of files run:
php artisan fill
php artisan cache:clear
php please stache:warm
Thanks Jack. It's making more sense now.
Even with all the caching, the stache watcher disabled, etc., the asset query is still going to be filtering arrays with tons of data. Even if they're just simple key/value arrays, 14,000 values in an array must be taking a toll.
I still have the same issue. I have one platform with about 5k assets and another with over 150k assets. It took about 45 seconds to load the entry edit form on the platform with 150k assets, and around 10 seconds on the other.
Could you try again after upgrading to 3.3.44?
Still 40 seconds to load an entry edit page with one asset field, on Statamic version 3.3.45.
I decided to revisit the previous testing I was doing, and may have found a few tweaks that could improve things when you have a large number of assets (or entries).
These are the stats from my latest tests:
| | Current | With Tweaks |
|---|---|---|
| 10,000 assets | First Asset: 38.343 ms, Avg Secondary Asset: 8.931 ms | First Asset: 4.188 ms, Avg Secondary Asset: 1.953 ms |
| 50,000 assets | First Asset: 165.285 ms, Avg Secondary Asset: 43.837 ms | First Asset: 14.042 ms, Avg Secondary Asset: 4.803 ms |
And these don't just help with assets: the Stache changes speed up entries too. In a site with 50,000 entries, querying for a single entry by URL (like when a page is loaded) is ~35% faster.
These tweaks need more work and testing, and I'm sure there are some important details I've missed, but here's what I've changed so far:
- `Asset::exists()` and `Asset::metaExists()` methods

  When an asset is augmented both of these methods get called, and they both fetch a full list of all files from the container before checking if the asset’s in that list. I noticed at least one of these calls was added as an S3 performance improvement in https://github.com/statamic/cms/pull/6822, but unfortunately it seems to make things slower for local files. Only for the first asset, but it can be significant.

- `Stache\Query\Builder::getWhereColumnKeysFromStore()` and related methods

  Every time an index is used in a query this method takes a full copy of the items and then loops over them to prepend the store name. This adds overhead. To avoid this I’ve removed the `mapWithKeys` calls and have instead updated `Stache\Indexes\Index` to save the indexed items with the store name already prepended to the key, so no additional processing is needed during queries.

- `Stache\Query\Builder::filterWhere*()` methods

  All of these methods also loop over the full index to find matching items, but in some instances there are faster ways to do it. For example, with `equals` and `in` you can do a key intersect to get the matching items. This only works in certain situations as it requires flipping the arrays and that’s not always possible, but for things like ID and path lookups it works well. There might be ways to do similar things in the other `filterWhere*()` methods but I’ve not looked into those yet.

- `Stache\Query\EntryQueryBuilder::getKeysFromCollectionsWithWhere()` method

  When checking multiple `and` where conditions the list of potential matching items can be pre-filtered by the previous condition's result, saving looping through the full list again for multiple columns. This works well for page lookups where the `url` is matched first and then the `site` is matched. The `site` column items are reduced from a full list of all entries to one. I’ve not done the same in the other query builders yet.
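To make the last two ideas concrete, here is a rough, standalone sketch of a key intersect for `equals`/`in` lookups and of pre-filtering a second `where` condition by the first one's result. It is not the code from the branch: the function names and the array shapes (item key => indexed value) are assumptions made purely for illustration.

```php
<?php

declare(strict_types=1);

/**
 * Baseline: scan the whole index and keep items whose value is wanted.
 * $index maps item key => indexed value (hypothetical shape).
 */
function filterInByLoop(array $index, array $wanted): array
{
    $matches = [];
    foreach ($index as $key => $value) {
        if (in_array($value, $wanted, true)) {
            $matches[$key] = $value;
        }
    }

    return $matches;
}

/**
 * Key intersect: flip to value => key, then intersect with the wanted values.
 * Only valid when every indexed value is a unique int or string
 * (array_flip turns numeric strings into int keys).
 */
function filterInByIntersect(array $index, array $wanted): array
{
    $byValue = array_flip($index);
    $hits = array_intersect_key($byValue, array_flip($wanted));

    return array_flip($hits);
}

/**
 * Pre-filtering: apply each equality condition only to the keys that
 * survived the previous one, instead of rescanning every index in full.
 * $indexes maps column => (item key => value); $conditions maps column => value.
 */
function filterAll(array $indexes, array $conditions): array
{
    $keys = null;
    foreach ($conditions as $column => $expected) {
        $candidates = $keys === null
            ? $indexes[$column]
            : array_intersect_key($indexes[$column], array_flip($keys));
        $keys = array_keys(array_filter($candidates, fn ($v) => $v === $expected));
    }

    return $keys ?? [];
}

$paths = [
    'assets::donut.jpg' => 'donut.jpg',
    'assets::idea.jpg' => 'idea.jpg',
    'assets::pizza-wifi.jpg' => 'pizza-wifi.jpg',
];
var_dump(filterInByIntersect($paths, ['idea.jpg']) === filterInByLoop($paths, ['idea.jpg']));
```

Note that flipping the index on every query is itself a full pass over the array, so in this sketch the intersect only pays off if the flipped form is kept around between lookups. The uniqueness requirement is also why this suits ID and path lookups but not arbitrary columns.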
Here’s the branch I’m using. I'd be interested to hear if anyone else sees an improvement with these changes: https://github.com/jacksleight/statamic-cms/tree/stache-tweaks (comparison, composer patch)
@jacksleight Thanks so much. Will report back soon with our results.
@jacksleight We're seeing at least 20% shorter load times on each page in our tests.
Seeing a good improvement with the stache-tweaks branch.
I also have this problem. Over 100,000 files in the storage bucket. Tried everything. Production is completely dependent on a response cache; otherwise it won't load pages. Edit: After analysis we realized that 99% of the files were not associated with Statamic content, so we split the single bucket into separate buckets, which has worked out. It was a massive number of files too, and took days to move.
It may not suit everyone, but there's a PR now over on the Eloquent driver for an asset query builder, which should resolve a lot of these problems and defer to the database for performance gains: https://github.com/statamic/eloquent-driver/pull/218