Reduce number of queries to read shares in a folder
While working on https://github.com/nextcloud/android/issues/10783 I have found that the API call to retrieve shares for files in a folder (/ocs/v2.php/apps/files_sharing/api/v1/shares?path=xyz&reshares=true&subfiles=true) executes 6 database queries for each file.
I have modified implementation of the ShareAPIController::getSharesInDir to use OC\Share20\Manager::getSharesInDir rather than retrieving shares separately for each file in the folder.
This significantly reduces the number of database queries and improves performance. In my tests (331 files in a folder) I got the following results (most of the time is now spent in the initialization code):
| _ | Execution time [ms] | # of DB queries |
|---|---|---|
| Before | 980 | 2021 |
| After | 280 | 40 |
After implementing this change, I have noticed some changes in data returned from the API.
- Permissions for room shares returned from RoomShareProvider::getSharesInFolder are incorrect. I have submitted a separate PR in nextcloud/spreed to fix the issue.
- ~Original code handling this API call also returned information about incoming group shares. I do not know if this was an intended behaviour or an error in the original implementation.~ Update - after removing the uid check for re-shares mode in https://github.com/nextcloud/server/pull/34918/commits/d0c52e56864e466f7a07cd4499d4d4831f7ba286, the new code returns the same information in this case as the original one.
I have been able to fix failing unit tests in https://github.com/nextcloud/server/pull/34918/commits/23d06eef97b9f6f6937f9c39553397a019249f27, but I am not satisfied with this approach. :disappointed: I am afraid that I do not understand the approach to the unit tests well enough to do it better. Any suggestions are welcome.
On the other hand the failing integration tests seem to point to a real issue with my update. I will look into it.
I have managed to repeat locally the first failing scenario of the integration tests.
At the moment I do not know how to fix the code. :disappointed:
The problem is that:
- For the second share expected in the result the user1 is neither the owner of the share nor the owner of the file.
- Previous code worked as expected, since in DefaultShareProvider::getSharesBy the code retrieved all the shares for the specified node (in this case a subfolder of a folder shared originally with user1).
- There is no similar mechanism in DefaultShareProvider::getSharesInFolder.
A couple of questions:
- Should this be handled in
ShareAPIController::getSharesInDiror rather inDefaultShareProvider::getSharesInFolder(and probably in other share providers as well)? - Any suggestions how to efficiently retrieve required information from the database? Would it make sense to construct the database query like below? Starting from the
oc_filecachetable, column list simplified for clarity.
SELECT
f.parent, f.fileid, f.path,
s.id, s.uid_owner, s.uid_initiator, s.file_target
FROM
oc_filecache f
INNER JOIN oc_share s ON f.fileid = s.item_source
WHERE f.parent = folder-node-id-here;
Please advise.
Update When I looked at the code in DefaultShareProvider::getSharesInFolder right after posting the original version of this comment, I realized that the method already retrieves the data in a proper way. It just seems to filter a bit too much in "reshares" mode. I have committed experimental code in https://github.com/nextcloud/server/pull/34918/commits/d0c52e56864e466f7a07cd4499d4d4831f7ba286 to verify this. It seems however that the CI queue is quite long at the moment. Will need to wait... :sleeping:
It seems that the quick-and-dirty change in https://github.com/nextcloud/server/pull/34918/commits/d0c52e56864e466f7a07cd4499d4d4831f7ba286 fixed the failing integration tests. :open_mouth:
Do you think this approach is correct?
If yes, I will clean it up and make updates for the other review comments.
I have committed small changes already requested by reviewers in https://github.com/nextcloud/server/pull/34918/commits/c51e756ab2c46b4f0ba23a73eb9c00cb7558f2f4
After removing the uid check for reshares mode in DefaultShareProvider::getSharesInFolder, the failing tests run OK. In the last run there is a failure in acceptance-app-files-sharing, but this one was OK just before clean-up, so I assume that this is a random error that I have seen already in the previous PR (#34471).
I do not plan any more changes unless there are additional suggestions from review.
I have committed small changes already requested by reviewers in c51e756
After removing the uid check for reshares mode in DefaultShareProvider::getSharesInFolder, the failing tests run OK. In the last run there is a failure in acceptance-app-files-sharing, but this one was OK just before clean-up, so I assume that this is a random error that I have seen already in the previous PR (#34471).
I do not plan any more changes unless there are additional suggestions from review.
How is the removal of uid check affecting the performance benchmark shown in your first post?
And how is this affecting the other places calling getSharesInFolder ?
@come-nc
How is the removal of uid check affecting the performance benchmark shown in your first post?
Same results as previously - within error margin. The timings vary between 260 and 290 ms. Even the query count is not always the same, I see a slightly varying number of UPDATE oc_authtoken queries during the initialization sequence.
Average results from a few runs are:
| _ | Execution time [ms] | # of DB queries |
|---|---|---|
| Before | 980 | 2021 |
| After | 275 | 40 |
And how is this affecting the other places calling
getSharesInFolder?
This is a very good question. I have removed the uid check to fix the following integration tests:
--- Failed scenarios:
/drone/src/build/integration/sharing_features/sharing-v1-part2.feature:306
/drone/src/build/integration/sharing_features/sharing-v1-part2.feature:381
/drone/src/build/integration/sharing_features/sharing-v1-part2.feature:406
This also fixed the issue that I have noticed at the beginning: original code handling this API call also returned information about incoming group shares. After removal of the uid check the new code behaves the same.
I suspect that there are some other scenarios when getSharesInFolder is going to return more data (shares in which specified user is neither owner, nor initiator). Unfortunately I am not able to assess when/how this is going to show.
Seems like the tests pass now...
Any chances to move this PR forward? It has been submitted some time ago already... Is there anything I should/could do?
To add a bit more context - the API call I am trying to optimize (retrieve shares for files in a folder) is executed frequently by the Android app - every time the user changes a folder or returns from the file view to the file list view.
I have rebased, so merging should be straightforward. Most of the tests pass, I do not see how the failing ones might be related to my changes. Or am I mistaken?
/drone/src/build/integration/features/contacts-menu.feature:91 /drone/src/build/integration/features/contacts-menu.feature:146
Hi @nickvergessen and @icewind1991,
I see a new PR #42638 touching the same area as this one - i.e. the DefaultShareProvider::getSharesInFolder method.
Could we use this as an opportunity to get back to this PR? This is pending for a long time and is quite important to improve performance of the Android app.
I will be glad to provide additional explanation and make further changes when suggested. At the moment I think that I had already responded to all questions/concerns and I have no idea what else I could do, to move this one forward.
I will bring it to the performance group and check for feedback the first week of february
@AndyScherzinger @icewind1991 - Is there anything I should/could do regarding this PR?
I see that one of the failing tests is 'block unconventional commits'. I can squash all the commits into one if this is going to help.