server icon indicating copy to clipboard operation
server copied to clipboard

[Bug]: Extremely bad performance with WebDAV

Open jose1711 opened this issue 2 years ago • 15 comments

⚠️ This issue respects the following points: ⚠️

  • [X] This is a bug, not a question or a configuration/webserver/proxy issue.
  • [X] This issue is not already reported on Github (I've searched it).
  • [X] Nextcloud Server is up to date. See Maintenance and Release Schedule for supported versions.
  • [X] I agree to follow Nextcloud's Code of Conduct.

Bug description

Mounting Nextcloud as a virtual WebDAV filesystem via fuse results in an extremely slow file access.

Steps to reproduce

  1. Run sudo mount -t davfs https://nextcloudhost/remote.php/dav/files/username mountpoint
  2. Authenticate
  3. Receive the following:
/usr/bin/mount.davfs: warning: the server does not support locks
  1. Run time ls mountpoint
  2. Get the output
________________________________________________________
Executed in   55.94 secs      fish           external
   usr time    1.89 millis  437.00 micros    1.45 millis
   sys time    0.07 millis   65.00 micros    0.00 millis

Expected behavior

Even with very modest hardware one should not really wait almost a full minute for a single directory listing (only a few items there).

Installation method

Manual installation

Operating system

Other

PHP engine version

PHP 8.0

Web server

Nginx

Database engine version

MariaDB

Is this bug present after an update or on a fresh install?

No response

Are you using the Nextcloud Server Encryption module?

Encryption is Disabled

What user-backends are you using?

  • [X] Default user-backend (database)
  • [ ] LDAP/ Active Directory
  • [ ] SSO - SAML
  • [ ] Other

Configuration report

{
    "system": {
        "datadirectory": "***REMOVED SENSITIVE VALUE***",
        "debug": false,
        "logfile": "\/var\/log\/nextcloud\/nextcloud.log",
        "log_type": "nextcloud",
        "apps_paths": [
            {
                "path": "\/usr\/share\/webapps\/nextcloud\/apps",
                "url": "\/apps",
                "writable": false
            },
            {
                "path": "\/var\/lib\/nextcloud\/apps",
                "url": "\/apps-appstore",
                "writable": true
            }
        ],
        "updatechecker": false,
        "check_for_working_htaccess": false,
        "installed": true,
        "instanceid": "***REMOVED SENSITIVE VALUE***",
        "passwordsalt": "***REMOVED SENSITIVE VALUE***",
        "secret": "***REMOVED SENSITIVE VALUE***",
        "trusted_domains": [
            "something.somedomain.com",
            "nextcloud",
            "192.168.0.119"
        ],
        "trusted_proxies": "***REMOVED SENSITIVE VALUE***",
        "overwrite.cli.url": "https:\/\/something.somedomain.com",
        "overwritewebroot": "\/",
        "overwritecondaddr": "^192\\.168\\.0\\.119$",
        "overwritehost": "something.somedomain.com",
        "overwriteprotocol": "https",
        "dbtype": "mysql",
        "version": "24.0.0.12",
        "dbname": "***REMOVED SENSITIVE VALUE***",
        "dbhost": "***REMOVED SENSITIVE VALUE***",
        "dbport": "",
        "dbtableprefix": "oc_",
        "dbuser": "***REMOVED SENSITIVE VALUE***",
        "dbpassword": "***REMOVED SENSITIVE VALUE***",
        "maintenance": false,
        "theme": "",
        "mail_smtpmode": "smtp",
        "mail_smtpauthtype": "LOGIN",
        "mail_sendmailmode": "smtp",
        "mail_from_address": "***REMOVED SENSITIVE VALUE***",
        "mail_domain": "***REMOVED SENSITIVE VALUE***",
        "mail_smtphost": "***REMOVED SENSITIVE VALUE***",
        "mail_smtpport": "465",
        "mail_smtpsecure": "ssl",
        "mail_smtpauth": 1,
        "mail_smtpname": "***REMOVED SENSITIVE VALUE***",
        "mail_smtppassword": "***REMOVED SENSITIVE VALUE***",
        "app_install_overwrite": [
            "geoblocker",
            "calendar"
        ],
        "loglevel": 2,
        "profiler": false
    }
}

List of activated Apps

Enabled:
  - accessibility: 1.10.0
  - activity: 2.16.0
  - cloud_federation_api: 1.7.0
  - comments: 1.14.0
  - contactsinteraction: 1.5.0
  - dav: 1.22.0
  - federatedfilesharing: 1.14.0
  - federation: 1.14.0
  - files: 1.19.0
  - files_sharing: 1.16.2
  - files_trashbin: 1.14.0
  - files_versions: 1.17.0
  - files_videoplayer: 1.13.0
  - geoblocker: 0.5.5
  - logreader: 2.9.0
  - lookup_server_connector: 1.12.0
  - notifications: 2.12.0
  - oauth2: 1.12.0
  - photos: 1.6.0
  - provisioning_api: 1.14.0
  - settings: 1.6.0
  - sharebymail: 1.14.0
  - support: 1.7.0
  - systemtags: 1.14.0
  - tables: 0.1.2
  - text: 3.5.1
  - twofactor_backupcodes: 1.13.0
  - viewer: 1.8.0
  - workflowengine: 2.6.0

Nextcloud Signing status

Integrity checker has been disabled. Integrity cannot be verified.

Nextcloud Logs

Nothing appeared in the log during mount and `ls` invocation.

Additional info

Most of the time ls is active php-fpm process is taking 100 % CPU. Further invocations are noticeably faster (e. g. 3 seconds) thanks to caching.

Response in browser session is OK. The only noticeable performance issue is with WebDAV mounts.

OS nextcloud (nginx+PHP): Alpine Linux inside LXC container OS database (mariadb): Debian Buster inside LXC container (same host as nextcloud host)

jose1711 avatar Jun 05 '22 22:06 jose1711

please check access log and the timings of each request to see if davfs is doing too many requests or if it's Nextcloud answering too slowly

PVince81 avatar Jun 17 '22 14:06 PVince81

access.log from initial ls /mountpoint run just after mounting:

192.168.0.105 - jose [17/Jun/2022:21:16:15 +0000] "PROPFIND /remote.php/dav/files/jose/ HTTP/1.1" 207 13363 "-" "davfs2/1.6.1 neon/0.32.2" "-"
192.168.0.105 - jose [17/Jun/2022:21:16:18 +0000] "PROPFIND /remote.php/dav/files/jose/directory1/ HTTP/1.1" 207 1152 "-" "davfs2/1.6.1 neon/0.32.2" "-"
192.168.0.105 - jose [17/Jun/2022:21:16:21 +0000] "PROPFIND /remote.php/dav/files/jose/directory2/ HTTP/1.1" 207 3191 "-" "davfs2/1.6.1 neon/0.32.2" "-"
192.168.0.105 - jose [17/Jun/2022:21:16:24 +0000] "PROPFIND /remote.php/dav/files/jose/directory3/ HTTP/1.1" 207 1179 "-" "davfs2/1.6.1 neon/0.32.2" "-"
192.168.0.119 - jose [17/Jun/2022:21:16:25 +0000] "PROPFIND /remote.php/dav/files/jose/ HTTP/1.0" 207 249 "-" "Mozilla/5.0 (Linux) mirall/3.5.1git (Nextcloud, arch-5.18.3-arch1-1 ClientArchitecture: x86_64 OsArchitecture: x86_64)" "192.168.0.1"
192.168.0.105 - jose [17/Jun/2022:21:16:27 +0000] "PROPFIND /remote.php/dav/files/jose/directory4/ HTTP/1.1" 207 4203 "-" "davfs2/1.6.1 neon/0.32.2" "-"
192.168.0.105 - jose [17/Jun/2022:21:16:30 +0000] "PROPFIND /remote.php/dav/files/jose/directory5/ HTTP/1.1" 207 2650 "-" "davfs2/1.6.1 neon/0.32.2" "-"
192.168.0.105 - jose [17/Jun/2022:21:16:33 +0000] "PROPFIND /remote.php/dav/files/jose/directory6/ HTTP/1.1" 207 1154 "-" "davfs2/1.6.1 neon/0.32.2" "-"
192.168.0.105 - jose [17/Jun/2022:21:16:36 +0000] "PROPFIND /remote.php/dav/files/jose/directory7/ HTTP/1.1" 207 10048 "-" "davfs2/1.6.1 neon/0.32.2" "-"
192.168.0.105 - jose [17/Jun/2022:21:16:39 +0000] "PROPFIND /remote.php/dav/files/jose/directory8/ HTTP/1.1" 207 1170 "-" "davfs2/1.6.1 neon/0.32.2" "-"
192.168.0.105 - jose [17/Jun/2022:21:16:42 +0000] "PROPFIND /remote.php/dav/files/jose/directory9/ HTTP/1.1" 207 22976 "-" "davfs2/1.6.1 neon/0.32.2" "-"
192.168.0.105 - jose [17/Jun/2022:21:16:45 +0000] "PROPFIND /remote.php/dav/files/jose/directory10/ HTTP/1.1" 207 74821 "-" "davfs2/1.6.1 neon/0.32.2" "-"
192.168.0.105 - jose [17/Jun/2022:21:16:48 +0000] "PROPFIND /remote.php/dav/files/jose/directory11/ HTTP/1.1" 207 1660 "-" "davfs2/1.6.1 neon/0.32.2" "-"
192.168.0.105 - jose [17/Jun/2022:21:16:51 +0000] "PROPFIND /remote.php/dav/files/jose/directory12/ HTTP/1.1" 207 661 "-" "davfs2/1.6.1 neon/0.32.2" "-"
192.168.0.105 - jose [17/Jun/2022:21:16:54 +0000] "PROPFIND /remote.php/dav/files/jose/directory13/ HTTP/1.1" 207 5709 "-" "davfs2/1.6.1 neon/0.32.2" "-"
192.168.0.119 - jose [17/Jun/2022:21:16:55 +0000] "GET /ocs/v2.php/apps/notifications/api/v2/notifications?format=json HTTP/1.0" 304 0 "-" "Mozilla/5.0 (Linux) mirall/3.5.1git (Nextcloud, arch-5.18.3-arch1-1 ClientArchitecture: x86_64 OsArchitecture: x86_64)" "192.168.0.1"
192.168.0.119 - jose [17/Jun/2022:21:16:55 +0000] "PROPFIND /remote.php/dav/files/jose/ HTTP/1.0" 207 249 "-" "Mozilla/5.0 (Linux) mirall/3.5.1git (Nextcloud, arch-5.18.3-arch1-1 ClientArchitecture: x86_64 OsArchitecture: x86_64)" "192.168.0.1"
192.168.0.105 - jose [17/Jun/2022:21:16:57 +0000] "PROPFIND /remote.php/dav/files/jose/directory14/ HTTP/1.1" 207 11247 "-" "davfs2/1.6.1 neon/0.32.2" "-"
192.168.0.105 - jose [17/Jun/2022:21:17:00 +0000] "PROPFIND /remote.php/dav/files/jose/directory15/ HTTP/1.1" 207 3798 "-" "davfs2/1.6.1 neon/0.32.2" "-"
192.168.0.105 - jose [17/Jun/2022:21:17:03 +0000] "PROPFIND /remote.php/dav/files/jose/directory16/ HTTP/1.1" 207 22772 "-" "davfs2/1.6.1 neon/0.32.2" "-"
192.168.0.105 - jose [17/Jun/2022:21:17:07 +0000] "PROPFIND /remote.php/dav/files/jose/directory17/ HTTP/1.1" 207 72535 "-" "davfs2/1.6.1 neon/0.32.2" "-"
192.168.0.119 - jose [17/Jun/2022:21:17:25 +0000] "PROPFIND /remote.php/dav/files/jose/ HTTP/1.0" 207 249 "-" "Mozilla/5.0 (Linux) mirall/3.5.1git (Nextcloud, arch-5.18.3-arch1-1 ClientArchitecture: x86_64 OsArchitecture: x86_64)" "192.168.0.1"

Looks like each and every PROPFIND takes 3 seconds to run and this is run on all 17 directories in basedir. That explains almost a minute delay needed to show the initial contents. To answer the original question: it looks like NC's answers are slow.

jose1711 avatar Jun 17 '22 21:06 jose1711

I have the exact same problem with a similar setup.

If there is something you would like someone to test, I would be happy to help to narrow down the problem further.

da-h avatar Aug 14 '22 11:08 da-h

does that user have a lot of incoming shares / mount points ? those can sometimes slow down PROPFINDs because it would mount each and every one

I see you're already on 24 where this was optimized already, please try with 24.0.4 because 24.0.0 had some remaining issues related to FS setup

PVince81 avatar Aug 15 '22 08:08 PVince81

I am not sure what exactly is meant. In my case:

  • no external storages are defined
  • the main folder contains 1 shared folder/file (with ownership on my side)
  • the main folder contains 1 shared folder/file (with ownership on someone elses side)

Regarding the version I update daily, thus I am running version: 24.0.4.1, where the problem still persists.

da-h avatar Aug 20 '22 11:08 da-h

@da-h then everything should be fine, it can't be related to the shares.

please disable all third party apps in case some are messing with the regular code flow. I see you have geoblocker enabled, maybe start with this one.

otherwise the problem might be related to your setup specifically, maybe your infrastructure, as you're using nginx you might want to double check your setup: https://docs.nextcloud.com/server/latest/admin_manual/installation/nginx.html

PVince81 avatar Aug 22 '22 08:08 PVince81

Hey there (and everyone who comes across this),

thanks a lot for your suggestions!

Indeed, one of the featured apps has had an considerable impact on the performance. If I am not mistaken it was the featured "Privacy"-App in my case. (Reenabling results in a message that says it makes the server unstable).

Yet, while davfs' access is considerably faster now, the PROPFIND calls are still the limiting factor. For instance, a simple ls still takes about 3 sec.
Is there anything one can do to reduce that time?

I am also happy to look into the code if you help me finding the right place to look.

My minimal app-list to test this is:

Enabled:
  - calendar: 3.4.2
  - cloud_federation_api: 1.7.0
  - contacts: 4.2.0
  - dav: 1.22.0
  - federatedfilesharing: 1.14.0
  - files: 1.19.0
  - files_sharing: 1.16.2
  - files_trashbin: 1.14.0
  - lookup_server_connector: 1.12.0
  - notes: 4.5.0
  - notifications: 2.12.0
  - oauth2: 1.12.0
  - photos: 1.6.0
  - provisioning_api: 1.14.0
  - serverinfo: 1.14.0
  - settings: 1.6.0
  - twofactor_backupcodes: 1.13.0
  - viewer: 1.8.0
  - workflowengine: 2.6.0

More info on my setup:

> $ ./occ status
  - installed: true
  - version: 24.0.4.1
  - versionstring: 24.0.4
  - edition:
  - maintenance: false
  - needsDbUpgrade: false
  - productname: Nextcloud
  - extendedSupport: false

da-h avatar Aug 22 '22 10:08 da-h

I was able to fix the problem after finding the related issue #33453.

As mentioned there, the authtoken table was full and slowed down authentication. Clearing old and unused entries increased DAV performance significantly in my case.

Just a side note: While I also found the MR #33375 that should fix the problem, nextcloud 24.0.5 which includes that change did not solve that problem on its own, even after triggering cron manually.

da-h avatar Sep 15 '22 07:09 da-h

In my case there were only 14 matching entries in oc_authtoken table, deleting them seems to have no effect whatsoever.

jose1711 avatar Sep 15 '22 11:09 jose1711

I'm finding that the first directory listing is fast but any subsequent listing for some period of time is very slow to respond.

wallzero avatar Dec 06 '22 18:12 wallzero

+1 on this

We've tried to use Nextcloud as a storage backend for an application we're developing for an Enterprise customer. WebDAV access is a necessary integration in such solution, and we faced this terrible slowness of the WebDAV implementation in Nextcloud. I can assure you that even if we frequently empty the oc_authtoken table, the general slowness is still there. Emptying the table helps a bit, but we still get many seconds durations for even storing a single file inside Nextcloud.

Unfortunately, in the end we needed to switch to the community edition of OwnCloud, which is tremendously faster as far as WebDAV is concerned.

I strongly encourage the Nextcloud team to work on this problem, which have been there for many major releases now, and it is still currently there on the 25.0.2 version.

mristuccia avatar Dec 22 '22 16:12 mristuccia

@mristuccia we'd need more information on your setup first to find out what exactly the root cause is

as seen in this ticket the root cause can be very different, like third party apps interfering with webdav

in some case we sometimes also saw that having a Galera Cluster improperly configured could cause bottlenecks on tables like oc_authtoken and oc_filecache, depending on setups.

there are some huge setups with millions of files out there which don't suffer from Webdav slowness

PVince81 avatar Dec 23 '22 09:12 PVince81

@PVince81 it is a plain and simple installation as single docker container from the official image in Docker Hub. Last version tried is the 25.0.2, connected to a MySQL database v8.0.4, also as single container from the Docker Hub official repo. No customizations at all. No clusterized installations. WebDAV API used as per official Nextcloud documentation. Already slow after the startup from scratch, without any additional file. Tried also on Kubernetes, same plain installation, same terrible slowness. OwnCloud installed the same way, as single container, plain and simple, connected to the same DB. WebDAV is deadly fast.

mristuccia avatar Dec 23 '22 10:12 mristuccia

@J0WI can you confirm whether the docker from https://github.com/nextcloud/docker with 25.0.2 has slower Webdav or if it could be a config issue of sorts ? I didn't find a related issue in that repo.

PVince81 avatar Dec 23 '22 11:12 PVince81

slower compared to what?

J0WI avatar Dec 23 '22 21:12 J0WI

We're facing the same problem also without much oc_authtokens (300+ for 100+ users, worst times are for a user with just 6 authtokens, others with 26 are quicker).

Just a hint ... we saw that a folder shared with a group with 100+ users could take up to 10s to create a folder inside of it, which a folder shared with a group of just 6 users took 1s or so.

Could this be somehow related?

In our case, the slowest user by far (just sometimes, not always) is the admin user, used to share many folders with specific groups + everyone group.

We have all the APCu, opcache, redis locking and redis distributed memory settings, running on php 8.1, NC 25.0.3 and mariadb:10.5.15-debian-10-r30 single-noded in another box.

NC is running with apache, http2 + php-fpm on a virtual machine.

kpeiruza avatar Feb 12 '23 01:02 kpeiruza

WebDAV API used as per official Nextcloud documentation.

What WebDAV client are you using?

joshtrichards avatar Apr 23 '23 22:04 joshtrichards

Looks like each and every PROPFIND takes 3 seconds to run and this is run on all 17 directories in basedir. That explains almost a minute delay needed to show the initial contents. To answer the original question: it looks like NC's answers are slow.

@jose1711 - One thing to try to help isolate this and also an observation:

You can use cURL to perform webdav requests. Please run something like this to see how long PROPFIND's take when cURL executes them against your server (at least with the default properties -- some clients may also request additional properties from PROPFIND requiring even more data to be looked up and sent but let's stick with the basics for now):

time curl -u USERNAME:PASSWORD "http://nc-test.local:3680/remote.php/dav/files/USERNAME/SOME_FOLDER/" -X PROPFIND

This will help tell you how much of the 3s delay is on the server-side versus on the client side (e.g. a slow XML processing implementation on the client side or some other reason to delay). The number reported here will also be a tad slower than a well optimized client that has session state, but that's another story.

As for all those other PROPFIND's --- that looks like davfs2 is recursively requesting properties for all the sub-folders even when you're just trying to look at the parent folder. And, worse, it's doing those requests serially rather than in parallel. That will certainly slow down the user experience!

joshtrichards avatar Apr 24 '23 16:04 joshtrichards

Just a hint ... we saw that a folder shared with a group with 100+ users could take up to 10s to create a folder inside of it, which a folder shared with a group of just 6 users took 1s or so.

Could this be somehow related?

@kpeiruza: Is this with a davfs2 mount or something else? If so, which version?

joshtrichards avatar Apr 25 '23 19:04 joshtrichards

I spent quite some time investigating a similar issue, where a simple ls in my remote folder would take about 0.5 sec per http request by davfs2. I also got this log for each request:

Sabre\DAV\Exception\NotAuthenticated: No  public access to this resource., No 'Authorization: Basic' header  found. Either the client didn't send one, or the server is  misconfigured, No 'Authorization: Bearer' header found. Either the  client didn't send one, or the server is mis-configured, No  'Authorization: Basic' header found. Either the client didn't send one,  or the server is misconfigured

I found out I had set my password instead of an app password in ~/.davfs2/secrets. Now the log is gone and the time for my ls is very reasonable.

theof avatar May 15 '23 16:05 theof

Performance with basic auth is always very bad since it needs to do hashing. App password can be directly compared and don't have this performance penalty.

provokateurin avatar May 15 '23 16:05 provokateurin

I do not know whether it is the same issue for others in this thread. I imagine it is a common misconfiguration for people skimming through the docs. So maybe we could emphasize on the importance of using an app password in the docs (making the blue note red) or if easily feasible notifying users using basic auth together with non web-browser clients ?

theof avatar May 15 '23 16:05 theof

I found out I had set my password instead of an app password in ~/.davfs2/secrets. Now the log is gone and the time for my ls is very reasonable.

Thank you @theof for sharing this observation! Here's a comparison of execution duration (seconds) between user vs application password on my system.

Command User password Application password
mount 9.3 3.28
ls (1st) 86.23 7.58
ls (2nd - cached) 29.4 1.94
ls (3rd - cached) 32.89 1.99
ls (4th - cached) 33.51 2

It's unbelievable what difference this tiny change makes.

I do not know whether it is the same issue for others in this thread. I imagine it is a common misconfiguration for people skimming through the docs.

This may be an issue only if the NC is running on a hardware with weak CPU as I have been using other NC instance w/o application password just fine.

jose1711 avatar May 22 '23 07:05 jose1711

I wonder if the performance is so bad because the hashing happens many times because of recursive listing? @jose1711 maybe you could also compare listing folders with different levels of depth?

provokateurin avatar May 22 '23 07:05 provokateurin

I wonder if the performance is so bad because the hashing happens many times because of recursive listing? @jose1711 maybe you could also compare listing folders with different levels of depth?

Note that ls listing was done on the top level (mountpoint). I can imagine that a recursive listing would take a lot more time to process.

jose1711 avatar May 22 '23 11:05 jose1711