Performance problems with many albums
Lychee has serious performance problems for galleries with many albums. Here are some sample timings I took; these were all automatically generated albums without any photos in them.
1000 albums all at the same level:
- Album::get: 46s
- Albums::tree: 47s
1100 albums at 2 levels (100 albums at one level with 10 subalbums in each of them):
- Album::get (of the 100 first-level albums): 4.6s
- Albums::tree: 52s
1110 albums at 3 levels (10 albums at one level with 10 subalbums each with 10 subsubalbums each):
- Album::get (of the 10 first-level albums): 0.5s
- Albums::tree: 52s
I'm not sure if there's much we'll be able to do about this on the server side. The result of Albums::tree (which is used to specify a destination for Copy/Merge/Move operations, etc.) could conceivably be cached in the frontend though. Perhaps the calculation of thumbs (which is I believe what takes a lot of time) could be made optional in that case as well.
Either way, I think this is something we should test as part of the #1101 merge. I don't think that any optimizations were made there to make this process faster, so chances are that, with the more complex model hierarchy, it is (even) slower; the question is: by how much?
I was also shocked by the report in #1123 and had exactly the same thought. I will try to sample some figures this weekend.
I have some timings here comparing current master and the new, more complex model hierarchy. Good news first: it becomes slower, but not extraordinarily so, basically only by the amount expected.
My setup:
- 1000 albums at top-level
- 1 single photo in each album (so that there is actually a thumbnail to generate)
Result for new model hierarchy:
- 15 sec total runtime between HTTP POST request and response
- 3033 DB queries
- SELECT * FROM albums took 162ms
- SELECT * FROM base_albums WHERE id IN (<1000 ids here>) took 5.72ms
- 1000 times the following queries are executed (one for each photo cover):
  - SELECT "id", "type" FROM photos WHERE (<best rated photo in all recursive child albums of the target album>) took ~600µs
  - SELECT * FROM size_variants WHERE (<match photo_id from query before and type equals thumb>) took ~270µs
  - SELECT * FROM sym_links WHERE (<match size_variant from query before>) took 250µs
Note: If you look at the pure DB figures, the net DB time is only around 1.5s. Most of the time is spent in the Eloquent framework. In particular, each of the queries which return only a single result creates an object of Eloquent's Collection with only one entry. Most of the time is actually spent on object creation, copying and destruction.
Result for current master:
- 11 sec total runtime between HTTP POST request and response
- 1017 DB queries
- SELECT * FROM albums took 707ms (yes, it is actually slower than the refactored one)
- (No query for base_albums, obviously)
- 1000 times the following query is executed (one for each photo cover):
  - SELECT "*" FROM photos WHERE (<best rated photo in all recursive child albums of the target album>) took 1.5ms (it is slower because we select * and not only "id" and "type")
- (No query for size_variants, obviously)
- No query for sym_link, but actually there should be. Do we have a hidden bug here at current master?
Comparison
The new complex model hierarchy turns out to be 1.5 times slower than the old one. That is not optimal, but also not as bad as I feared initially. Also, the numbers might actually get closer if there were queries for the symbolic links at current master, too. Most of the time is lost due to 1000 (or 3000) single DB queries for the thumbnails.
Proposed solution
I don't know how to solve the problem at current master, but I have an idea for the new architecture. Here it helps that the new architecture already uses proper relations nearly everywhere. We only need to fetch the relation for the thumbs eagerly. Instead of executing 3000 independent queries, the number of queries would drop to 3, each returning 1000 results (1000 photos, 1000 size_variants, 1000 symbolic links). That should be much faster.
However, the solution requires more than just using Album::query()->with(['thumb'])->get(). If it were that easy, I would have done it already.
The thumbnail of an album is chosen from all browsable child photos of that album. This uses the relation HasManyPhotosRecursively under the hood. As you can see in HasManyPhotosRecursively::addEagerConstraints and HasManyPhotosRecursively::match(), the code currently assumes that the number of "parent" models (i.e. the number of albums) equals one and that only the photos of a single album need to be fetched.
I was lazy and did not want to think about how the query must be changed such that it is able to fetch all photos recursively for more than one album. It was not needed before and so I did not improve it.
Basically, HasManyPhotosRecursively must be improved to also support eagerly fetching all recursive child photos for more than one album and then match the resulting photos properly to their albums.
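To make the idea concrete, here is a rough sketch of what "fetch recursively for several albums at once, then match" could look like. It is written in Python against an in-memory SQLite database rather than in Eloquent, and the schema (albums with a parent_id column, photos with a rating column) and the function name are assumptions made up for illustration; this is not how HasManyPhotosRecursively is implemented. A recursive CTE carries the id of the requested album along with each descendant, so a single query returns the photos of all subtrees, and the matching step happens in memory.

```python
import sqlite3

# Hypothetical miniature schema; Lychee's real tables look different.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE albums (id INTEGER PRIMARY KEY, parent_id INTEGER);
    CREATE TABLE photos (id INTEGER PRIMARY KEY, album_id INTEGER, rating INTEGER);
    -- Album 1 contains album 3, which contains album 4; album 2 is a second root.
    INSERT INTO albums VALUES (1, NULL), (2, NULL), (3, 1), (4, 3);
    INSERT INTO photos VALUES (1, 1, 1), (2, 4, 5), (3, 2, 2), (4, 3, 3);
""")


def best_photos_recursively(album_ids):
    """Return {album_id: id of the best rated photo in its whole subtree}."""
    placeholders = ",".join("?" * len(album_ids))
    # One query for all requested albums: every row of "subtree" remembers
    # which requested album (root_id) a descendant belongs to.
    rows = conn.execute(f"""
        WITH RECURSIVE subtree(root_id, album_id) AS (
            SELECT id, id FROM albums WHERE id IN ({placeholders})
            UNION ALL
            SELECT s.root_id, a.id
            FROM albums a JOIN subtree s ON a.parent_id = s.album_id
        )
        SELECT s.root_id, p.id, p.rating
        FROM subtree s JOIN photos p ON p.album_id = s.album_id
    """, album_ids).fetchall()

    # The "match" step: assign each photo back to the album it was fetched for.
    best = {}
    for root_id, photo_id, rating in rows:
        if root_id not in best or rating > best[root_id][1]:
            best[root_id] = (photo_id, rating)
    return {a: (best[a][0] if a in best else None) for a in album_ids}


covers = best_photos_recursively([1, 2])
```

Here album 1 gets photo 2 as its cover although that photo lives two levels down in album 4. The point is only that the number of queries no longer depends on the number of albums.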
Note: If you look at the pure DB figures, the net DB time is only around 1.5s. Most of the time is spent in the Eloquent framework. In particular, each of the queries which return only a single result creates an object of Eloquent's Collection with only one entry. Most of the time is actually spent on object creation, copying and destruction.
Interesting. I saw those 1000 queries for thumbs and I thought: here's your problem. Which is why I was wondering if we should make thumb calculation optional for Albums::tree. I mean, it's nice to have them displayed alongside album titles, but it doesn't strike me as critical, especially if it comes at such a huge cost. I'm curious how much difference disabling them would make. It wouldn't solve the Album::get problem, though.
The new complex model hierarchy turns out to be 1.5 times slower than the old one. That is not optimal, but also not as bad as I feared initially.
Yeah, more like 36% actually. Not bad. I feared it would be by an integer factor.
Also, the numbers might actually get closer if there were queries for the symbolic links at current master, too.
I see them on master if I enable SL_enable (and SL_for_admin). What API call did you use?
I don't know how to solve the problem at current master, but I have an idea for the new architecture. Here it helps that the new architecture already uses proper relations nearly everywhere. We only need to fetch the relation for the thumbs eagerly. Instead of executing 3000 independent queries, the number of queries would drop to 3, each returning 1000 results (1000 photos, 1000 size_variants, 1000 symbolic links). That should be much faster.
Right.
The more I think about it though, the more I'm convinced that the whole Albums::tree call is kind of crazy. Why are we even fetching the whole tree? That will always be a performance bottleneck on large installations. It's fine if one has 10-20 albums, but any more than that and the result becomes unwieldy to work with in the GUI, as Jacquelin pointed out. The front end really should be displaying that whole album list folded, one hierarchy level at a time, and for that we don't need Albums::tree; Album[s]::get would do just fine and the performance would be much better in most cases.
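A minimal sketch of the "one hierarchy level at a time" idea, in Python with SQLite for the sake of a runnable example. The table layout and the children_of helper are hypothetical and only stand in for what an Album[s]::get style call would return when the user unfolds one entry in the list.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE albums (id INTEGER PRIMARY KEY, parent_id INTEGER, title TEXT);
    CREATE INDEX albums_parent_idx ON albums (parent_id);
""")

# 10 top-level albums (ids 1..10), each with 10 subalbums (ids 11..110).
conn.executemany(
    "INSERT INTO albums VALUES (?, NULL, ?)",
    [(t, f"album {t}") for t in range(1, 11)],
)
conn.executemany(
    "INSERT INTO albums VALUES (?, ?, ?)",
    [(10 + (t - 1) * 10 + s, t, f"album {t}.{s}")
     for t in range(1, 11) for s in range(1, 11)],
)


def children_of(parent_id):
    """Fetch only the direct children of one album (None means top level)."""
    if parent_id is None:
        sql = "SELECT id, title FROM albums WHERE parent_id IS NULL ORDER BY id"
        params = ()
    else:
        sql = "SELECT id, title FROM albums WHERE parent_id = ? ORDER BY id"
        params = (parent_id,)
    return conn.execute(sql, params).fetchall()


top_level = children_of(None)   # shown when the dialog opens
unfolded = children_of(3)       # fetched only when album 3 is unfolded
```

Each call touches one level only, so the cost depends on the width of that level and not on the size of the whole tree. As noted, this does not help when a single level itself holds 1000 albums.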
This wouldn't help with a case of 1000 (or even 100) albums at a single level, which still performs unsatisfactorily and if we can improve it, we should. But I think we can safely delay it until after #1101 is merged, as it's not a new problem.
Just a quick update. I did a fast hack to support eager loading of thumbnails (for real albums only at the moment).
With 1000 albums on the top level the number of queries goes down to 15 as expected and the runtime decreases to ~5 seconds :smiley: This is still not good, but 5 seconds are way better than 15 seconds and even faster than the master branch.
I hope to find a way to further improve this, because Laravel hydrates approx. 5,500 models :see_no_evil:

App\Models\User 1
App\Models\BaseAlbumImpl 1007
App\Models\SizeVariant 2396
App\Models\Photo 1073
App\Models\Album 1007
The number of albums is correct, I have 999 "dummy" albums to test the performance and some other test albums. I haven't yet figured out why 1073 photos are hydrated, IMHO it should be 1008 (incl. the thumbnail for the "Recent" album).
The size variants are problematic. In order to create a cover for an album, only size variant no. 6 (aka "thumb") needs to be fetched from the DB. But at the moment, the size variants of a photo are organized as a collection. So Photos::query()->with(['size_variants']) eagerly loads all available size variants of a photo, of which there are at most 7 (from 0 aka "original" to 6 aka "thumb"). It is not possible to fetch only a particular type of size variant.
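For comparison, this is roughly what the cover lookup needs at the SQL level: one batched query that is restricted to type 6. The sketch is in Python with SQLite; the column names (photo_id, type, short_path) and the thumbs_for helper are assumptions for illustration and say nothing about how the relation would have to be modelled in Eloquent.

```python
import sqlite3

THUMB = 6  # size variant type for "thumb" (0 is "original"), as described above

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE size_variants ("
    "id INTEGER PRIMARY KEY, photo_id INTEGER, type INTEGER, short_path TEXT)"
)
# 5 photos with all 7 size variants each.
conn.executemany(
    "INSERT INTO size_variants (photo_id, type, short_path) VALUES (?, ?, ?)",
    [(p, t, f"variant{t}/photo{p}.jpg") for p in range(1, 6) for t in range(7)],
)


def thumbs_for(photo_ids):
    """Fetch only the thumb variant for a batch of photos in one query."""
    placeholders = ",".join("?" * len(photo_ids))
    rows = conn.execute(
        f"SELECT photo_id, short_path FROM size_variants "
        f"WHERE photo_id IN ({placeholders}) AND type = ?",
        [*photo_ids, THUMB],
    ).fetchall()
    return dict(rows)


thumbs = thumbs_for([1, 2, 3, 4, 5])
all_variants = conn.execute("SELECT COUNT(*) FROM size_variants").fetchone()[0]
```

With this data, 5 rows are fetched instead of 35, which is the same ratio one would hope for with the hydrated SizeVariant models.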
@nagmat84 I think this one is fixed. No?
Yes, it should be. That was the problem with the missing indices.
Sorry to bring this back to life, but I recently noticed that the index page has a substantial delay for the /api/Albums::get POST request when I'm not logged in, compared to how fast all the galleries load when I'm logged in.
Has there been a regression?
In theory, no. This is mostly due to the indexes of our database being out of date. I noticed it on my own installation as well, but after a DB optimization the problem was gone. The underlying cause is queries falling back to a filesort instead of using indexes.
Fortunately for you I added /Optimize as an end-point which will be taking care of this re-indexing. It is already available on master or will be in the next release.
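To illustrate the effect of a missing index on a sorted lookup, here is a small Python sketch using SQLite's EXPLAIN QUERY PLAN. It is only an analogy: SQLite reports a temporary B-tree for the ORDER BY where MySQL would report a filesort, the table and index names are made up, and this is not what /Optimize executes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE photos (id INTEGER PRIMARY KEY, album_id INTEGER, rating INTEGER)"
)

QUERY = "SELECT id FROM photos WHERE album_id = ? ORDER BY rating DESC LIMIT 1"


def plan(sql):
    """Return the human-readable detail column of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, (1,)).fetchall()
    return [row[3] for row in rows]


# Without a suitable index: full table scan plus a separate sorting step.
plan_before = plan(QUERY)

# Hypothetical composite index covering both the filter and the sort column.
conn.execute("CREATE INDEX photos_album_rating_idx ON photos (album_id, rating)")
plan_after = plan(QUERY)

for line in plan_before + ["--"] + plan_after:
    print(line)
```

After the index is created, the plan should no longer contain the separate sorting step, because the rows can be read from the index in the required order.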