[Bug]: Server is really slow and unresponsive after upgrade from 29 to 30.0.2, No error in the logs

Open le-patenteux opened this issue 1 year ago • 10 comments

Bug description

After updating to NC 30.0.2, from NC 29, the server has abysmal performance! This server has always been lightning fast before this upgrade. Logs are not reporting any issue.

Clients get "504 Gateway Time out" errors after uploading 3-4 files, every time I try to sync them. The web interface is slow and sometimes hangs for up to 30 seconds.

There is no indication of issues anywhere on the server.

This issue has been reported on reddit by multiple users

Hosted on an Oracle ARM instance running Ubuntu 22.04. The server is a Docker container behind an NGINX proxy, with 4 CPUs and 24GB of RAM.

This server was super fast this morning, and has terrible performance now that I have upgraded. No way to revert the "upgrade"...

Steps to reproduce

  1. Have Nextcloud 29.x.x and be happy with performance
  2. Upgrade to NC 30.0.2
  3. Feel the pain!

Expected behavior

Server should run as fast as before... Clients should be able to sync without getting timeout errors

Nextcloud Server version

30

Operating system

Debian/Ubuntu

PHP engine version

PHP 8.3

Web server

Nginx

Database engine version

PostgreSQL

Is this bug present after an update or on a fresh install?

Upgraded to a MAJOR version (ex. 28 to 29)

Are you using the Nextcloud Server Encryption module?

None

What user-backends are you using?

  • [x] Default user-backend (database)
  • [ ] LDAP/ Active Directory
  • [ ] SSO - SAML
  • [ ] Other

Configuration report

occ config:list system
{
    "system": {
        "htaccess.RewriteBase": "\/",
        "memcache.local": "\\OC\\Memcache\\APCu",
        "datadirectory": "***REMOVED SENSITIVE VALUE***",
        "passwordsalt": "***REMOVED SENSITIVE VALUE***",
        "secret": "***REMOVED SENSITIVE VALUE***",
        "trusted_proxies": "***REMOVED SENSITIVE VALUE***",
        "trusted_domains": [
            "cloud.glmconseil.com",
            "onlyoffice.glmconseil.com"
        ],
        "enabledPreviewProviders": [
            "OC\\Preview\\Image",
            "OC\\Preview\\HEIC",
            "OC\\Preview\\TIFF",
            "OC\\Preview\\Movie"
        ],
        "trashbin_retention_obligation": "auto, 90",
        "versions_retention_obligation": "auto, 90",
        "dbtype": "pgsql",
        "version": "30.0.2.2",
        "overwrite.cli.url": "http:\/\/localhost",
        "dbname": "***REMOVED SENSITIVE VALUE***",
        "dbhost": "***REMOVED SENSITIVE VALUE***",
        "dbport": "5432",
        "dbtableprefix": "oc_",
        "dbuser": "***REMOVED SENSITIVE VALUE***",
        "dbpassword": "***REMOVED SENSITIVE VALUE***",
        "installed": true,
        "maintenance": false,
        "mail_smtpmode": "smtp",
        "mail_sendmailmode": "smtp",
        "mail_smtpport": "587",
        "mail_smtptimeout": "30",
        "loglevel": 0,
        "default_locale": "fr_CA",
        "theme": "",
        "mail_smtphost": "***REMOVED SENSITIVE VALUE***",
        "mail_from_address": "***REMOVED SENSITIVE VALUE***",
        "mail_domain": "***REMOVED SENSITIVE VALUE***",
        "mail_smtpauth": 1,
        "mail_smtpname": "***REMOVED SENSITIVE VALUE***",
        "mail_smtppassword": "***REMOVED SENSITIVE VALUE***",
        "mail_smtpauthtype": "LOGIN",
        "app_install_overwrite": [
            "occweb",
            "files_rightclick",
            "music",
            "mindmap_app",
            "files_mindmap",
            "video_converter"
        ],
        "updater.release.channel": "stable",
        "instanceid": "***REMOVED SENSITIVE VALUE***",
        "filelocking.enabled": "true",
        "memcache.locking": "\\OC\\Memcache\\APCu",
        "upgrade.disable-web": true,
        "twofactor_enforced": "true",
        "twofactor_enforced_groups": [],
        "twofactor_enforced_excluded_groups": [],
        "ldapProviderFactory": "OCA\\User_LDAP\\LDAPProviderFactory",
        "maintenance_window_start": 1
    }
}

List of activated Apps

Enabled:
  - activity: 3.0.0
  - admin_audit: 1.20.0
  - analytics: 5.1.0
  - app_api: 4.0.0
  - bruteforcesettings: 3.0.0
  - calendar: 5.0.6
  - circles: 30.0.0
  - cloud_federation_api: 1.13.0
  - collectives: 2.15.1
  - comments: 1.20.1
  - contacts: 6.1.1
  - contactsinteraction: 1.11.0
  - dashboard: 7.10.0
  - dav: 1.31.1
  - deck: 1.14.2
  - drawio: 3.0.3
  - external: 5.5.2
  - federatedfilesharing: 1.20.0
  - federation: 1.20.0
  - files: 2.2.0
  - files_accesscontrol: 1.20.1
  - files_automatedtagging: 1.20.0
  - files_downloadlimit: 3.0.0
  - files_external: 1.22.0
  - files_mindmap: 0.0.30
  - files_pdfviewer: 3.0.0
  - files_reminders: 1.3.0
  - files_sharing: 1.22.0
  - files_trashbin: 1.20.1
  - files_versions: 1.23.0
  - firstrunwizard: 3.0.0
  - flow_notifications: 1.10.0
  - forms: 4.3.4
  - gpoddersync: 3.10.0
  - groupfolders: 18.0.6
  - integration_excalidraw: 2.2.0
  - logreader: 3.0.0
  - lookup_server_connector: 1.18.0
  - mail: 4.0.5
  - music: 2.0.1
  - nextcloud_announcements: 2.0.0
  - notifications: 3.0.0
  - oauth2: 1.18.1
  - onlyoffice: 9.5.0
  - password_policy: 2.0.0
  - photos: 3.0.2
  - privacy: 2.0.0
  - provisioning_api: 1.20.0
  - recommendations: 3.0.0
  - related_resources: 1.5.0
  - serverinfo: 2.0.0
  - settings: 1.13.0
  - sharebymail: 1.20.0
  - sharepoint: 1.18.0
  - sociallogin: 5.7.0
  - spreed: 20.0.2
  - support: 2.0.0
  - survey_client: 2.0.0
  - suspicious_login: 8.0.0
  - systemtags: 1.20.0
  - text: 4.1.0
  - theming: 2.5.0
  - theming_customcss: 1.17.0
  - twofactor_backupcodes: 1.19.0
  - twofactor_nextcloud_notification: 4.0.0
  - twofactor_totp: 12.0.0-dev
  - updatenotification: 1.20.0
  - user_ldap: 1.21.0
  - user_status: 1.10.0
  - viewer: 3.0.0
  - weather_status: 1.10.0
  - webhook_listeners: 1.1.0-dev
  - workflow_pdf_converter: 1.15.0
  - workflow_script: 1.15.0
  - workflowengine: 2.12.0
Disabled:
  - cms_pico: 1.0.21 (installed 1.0.21)
  - encryption: 2.18.0 (installed 2.17.0)
  - notes: 4.11.0 (installed 4.11.0)

Nextcloud Signing status

Technical information
=====================
The following list covers which files have failed the integrity check. Please read
the previous linked documentation to learn more about the errors and how to fix
them.

Results
=======
- mail
	- EXCEPTION
		- OC\IntegrityCheck\Exceptions\InvalidSignatureException
		- Certificate is not valid.

Raw output
==========
Array
(
    [mail] => Array
        (
            [EXCEPTION] => Array
                (
                    [class] => OC\IntegrityCheck\Exceptions\InvalidSignatureException
                    [message] => Certificate is not valid.
                )

        )

)

Nextcloud Logs

Will post later, can't access

Additional info

No response

le-patenteux avatar Nov 28 '24 05:11 le-patenteux

Clients get "504 Gateway Time out" errors after uploading 3-4 files, every time I try to sync them. The web interface is slow and sometimes hangs for up to 30 seconds.

There is no indication of issues anywhere on the server.

Please check your reverse proxy error log + web server error log for clues.

Nextcloud Logs

Will post later, can't access

Because of the reported problem? If so, just check your data/nextcloud.log directly.
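
Since the posted config has loglevel set to 0 (debug), the file will be noisy. A minimal sketch for pulling out only warnings and above, assuming the default format of one JSON object per line with a numeric "level" field (0 = debug through 4 = fatal); the function names and the path in the usage note are illustrative, not part of Nextcloud:

```python
import json

# Numeric levels as written to nextcloud.log.
LEVEL_NAMES = {0: "DEBUG", 1: "INFO", 2: "WARN", 3: "ERROR", 4: "FATAL"}


def filter_log(lines, min_level=2):
    """Yield (time, level name, app, message) for entries at or above min_level."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or non-JSON lines
        if not isinstance(entry, dict):
            continue
        level = entry.get("level", 0)
        if isinstance(level, int) and level >= min_level:
            yield (
                entry.get("time", ""),
                LEVEL_NAMES.get(level, str(level)),
                entry.get("app", ""),
                entry.get("message", ""),
            )


def print_problems(path, min_level=2):
    """Print the filtered entries of the log file at path (path is supplied by the caller)."""
    with open(path, encoding="utf-8", errors="replace") as fh:
        for when, level, app, message in filter_log(fh, min_level):
            print(f"{when} [{level}] {app}: {message}")
```

Calling print_problems("data/nextcloud.log") from the Nextcloud directory would list warnings, errors and fatal entries; adjust the path to wherever your data directory actually lives.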

joshtrichards avatar Dec 01 '24 20:12 joshtrichards

I have similar problems here after upgrading to 30.0.2. The setup is similar: Nextcloud in Docker behind an nginx reverse proxy. There are no errors in the nginx logs. File uploads slow Nextcloud down significantly, also for other users. "Bigger" files of a couple of MB get completely stuck during upload. Canceling the upload produces an "Unknown error during upload" message, with no corresponding error in Nextcloud. The files still become visible to the user in Nextcloud afterwards.

Opening/viewing a 12MB PDF file in Nextcloud now takes around 90 seconds. During upload and download there is almost zero server load, and memory use is low as well (2GB of 32GB RAM used).

The network connection on the server is fast: Download: 296.05 Mbit/s, Upload: 360.14 Mbit/s

I already ran occ files:scan-app-data and also a cleanup. I do not see issues anywhere that could lead to these symptoms. In 30.0.1 there were no performance issues.

tjareson avatar Dec 02 '24 08:12 tjareson

Quick update: when I connect to the Nextcloud server directly on the local network, without the nginx reverse proxy in between, the problems are gone. Does 30.0.2 have any changed requirements for the reverse proxy configuration that I'm not aware of? I haven't touched the nginx proxy config in ages. In this setup the reverse proxy sits on a different machine and Nextcloud is connected through a reverse SSH tunnel, which all worked without issues until Nextcloud 30.0.2.

fyi this is my current nginx config

map $remote_addr $log_ip {
    
    "xxxxx" 0; # Placeholder for real IP address
    default 1;

}

server {
    server_name xxxxx; # Placeholder for actual domain
    client_max_body_size 512M;
    location / {

      proxy_set_header        Host $host;
      proxy_buffering off;
      proxy_http_version 1.1;
      proxy_set_header        X-Real-IP $remote_addr;
      proxy_set_header        X-Forwarded-For $proxy_add_x_forwarded_for;
      proxy_set_header        X-Forwarded-Proto $scheme;
      proxy_set_header Upgrade $http_upgrade;
      proxy_set_header Connection $http_connection;
      proxy_read_timeout 3600;
      add_header Front-End-Https on;
      client_max_body_size 512M;
      add_header Strict-Transport-Security "max-age=15768000; includeSubDomains; preload;";
      proxy_pass http://127.0.0.1:8889/; # Backend server address
      access_log /var/log/nginx/access.log combined if=$log_ip;
    }

    listen 443 ssl; # managed by Certbot
    ssl_certificate /etc/letsencrypt/live/xxxxx/fullchain.pem; # Placeholder for certificate path
    ssl_certificate_key /etc/letsencrypt/live/xxxxx/privkey.pem; # Placeholder for key path
    include /etc/letsencrypt/options-ssl-nginx.conf; # managed by Certbot
    ssl_dhparam /etc/letsencrypt/ssl-dhparams.pem; # managed by Certbot
    add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;    
    add_header X-Content-Type-Options "nosniff" always;
    add_header X-Frame-Options "SAMEORIGIN" always;
    add_header X-XSS-Protection "1; mode=block" always;
}

server {
    client_max_body_size 512M;
    if ($host = xxxxx) { # Placeholder for domain name
        return 301 https://$host$request_uri;
    } # managed by Certbot

    server_name xxxxx; # Placeholder for domain name
    listen 80;
    return 404; # managed by Certbot
}

tjareson avatar Dec 02 '24 10:12 tjareson

Changing to http2 on the nginx reverse proxy solved the performance issues for uploads and downloads after upgrading to Nextcloud 30.0.2 for me:

listen 443 ssl http2;

tjareson avatar Dec 02 '24 11:12 tjareson

Changing to http2 on the nginx reverse proxy solved the performance issues for uploads and downloads after upgrading to Nextcloud 30.0.2 for me:

listen 443 ssl http2;

Are you sure that is what fixed it? I have http2 enabled already and it changed nothing... By the way, the http2 parameter of the listen directive is deprecated since nginx 1.25.1. This is how it should be written now:

listen 443 ssl;
http2 on;

le-patenteux avatar Dec 02 '24 12:12 le-patenteux

You are sure that is what fixed it? I have http2 enabled already and it changed nothing.

Well, the performance issues I saw were these: if I uploaded a file bigger than, let's say, 3-4MB, Nextcloud was almost suspended for other users. Likewise, opening such files as PDFs took 1-2 minutes. The Nextcloud app showed the same effect. All of that without any specific error message anywhere.

So I tested local access to check where the performance bottleneck is, as neither the nginx machine nor the machine where Nextcloud runs showed any processor load or memory shortage. The local test was super fast, so something in between Nextcloud and the browser caused the issue. The only change I applied to my nginx config was to enable http2. After that everything was fast again.

I'm a bit baffled myself that a protocol update to http2 causes such an effect. I guess in the end it is not the root cause, but a workaround. That problem is a bit annoying, as it renders Nextcloud almost useless. Second time I had such surprises after an update.
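
The local-versus-proxy comparison described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the exact test that was run: the function names are made up, the URLs are supplied by the caller, and Python's urllib only speaks HTTP/1.1, so it measures the proxy path as such and cannot exercise HTTP/2.

```python
import statistics
import time
import urllib.request


def fetch(url, timeout=60):
    """Download url completely and return the number of bytes received."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return len(resp.read())


def median_seconds(url, runs=5, fetcher=fetch):
    """Median wall-clock time of `runs` complete downloads of url."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fetcher(url)
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def compare(direct_url, proxied_url, runs=5, fetcher=fetch):
    """Return (direct, proxied, ratio); a ratio well above 1 points at the proxy path."""
    direct = median_seconds(direct_url, runs, fetcher)
    proxied = median_seconds(proxied_url, runs, fetcher)
    ratio = proxied / direct if direct > 0 else float("inf")
    return direct, proxied, ratio
```

For example, compare("http://nextcloud.lan:8889/status.php", "https://cloud.example.com/status.php") would time the status endpoint both ways; both hostnames are placeholders. Because the slowdown reported here showed up with files of several MB, pointing both URLs at the same larger download is likely to be more telling than a tiny status response.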

tjareson avatar Dec 02 '24 13:12 tjareson

I have a very slow instance and can't find anything in any logs, but after watching MySQL during page loads I found long-running queries.

SELECT `filecache`.`fileid`, `storage`, `path`, `path_hash`, `filecache`.`parent`, `filecache`.`name`,
       `mimetype`, `mimepart`, `size`, `mtime`, `storage_mtime`, `encrypted`, `filecache`.`etag`,
       `filecache`.`permissions`, `checksum`, `unencrypted_size`, `metadata_etag`, `creation_time`,
       `upload_time`, `meta`.`json` AS `meta_json`, `meta`.`sync_token` AS `meta_sync_token`
FROM `oc_filecache` `filecache`
LEFT JOIN `oc_filecache_extended` `fe` ON `filecache`.`fileid` = `fe`.`fileid`
LEFT JOIN `oc_files_metadata` `meta` ON `filecache`.`fileid` = `meta`.`file_id`
WHERE (`path_hash` = 'f0d8cd98237580338f8e032147d2e04b') AND (`storage` = 59);

This kind of query shows up with every page load and gets slower as the oc_filecache table grows. I did a rescan and also a cleanup.

I then deleted storage 59 from oc_storages and ran a files:scan, which cleared a load of stuff out, and page loads looked normal. But over the course of the day, things went back to being slow.

mhzawadi avatar Mar 21 '25 22:03 mhzawadi

Could be related to #47901

Adding this index has helped a lot:

ALTER TABLE `oc_filecache` ADD INDEX `fs_storage_path_hash` (`path_hash`) USING BTREE;
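
To illustrate why an index on path_hash can matter for the query shown earlier, here is a small self-contained sketch. It uses SQLite from Python's standard library as a stand-in, not MySQL, and a simplified three-column table rather than the real oc_filecache schema, so it only demonstrates the general effect on the query plan. On MySQL, prefixing the real query with EXPLAIN gives the equivalent information.

```python
import sqlite3

QUERY = "SELECT fileid FROM filecache WHERE path_hash = ? AND storage = ?"


def query_plan(conn, sql, params):
    """Return SQLite's plan for sql as one string (detail column of each plan row)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return "; ".join(row[3] for row in rows)


conn = sqlite3.connect(":memory:")
# Simplified stand-in for oc_filecache: only the columns the WHERE clause touches.
conn.execute(
    "CREATE TABLE filecache (fileid INTEGER PRIMARY KEY, storage INTEGER, path_hash TEXT)"
)
conn.executemany(
    "INSERT INTO filecache (storage, path_hash) VALUES (?, ?)",
    [(i % 100, f"{i:032x}") for i in range(10_000)],
)

params = ("f0d8cd98237580338f8e032147d2e04b", 59)

# Without an index on path_hash the planner has to read the whole table.
plan_before = query_plan(conn, QUERY, params)

# Same index name and column as in the ALTER TABLE statement above.
conn.execute("CREATE INDEX fs_storage_path_hash ON filecache (path_hash)")

# With the index the planner can look rows up by path_hash directly.
plan_after = query_plan(conn, QUERY, params)

print("before:", plan_before)
print("after: ", plan_after)
```

The first plan reports a scan of the table and the second a search using the new index, which is the difference that keeps lookup time from growing with the table.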

mhzawadi avatar Mar 30 '25 09:03 mhzawadi

@le-patenteux This report is fairly old at this point. What's your take on its continued relevance, or should we close this out?

joshtrichards avatar Dec 07 '25 16:12 joshtrichards

I don't even remember where I found the solution or what it was, because at that point I was trying many communication channels, but in the end, it got resolved. I do remember it was an issue in that version specifically.

You can mark as resolved, and change it to a conversation, and if I can find what I did back then, I will publish it for the good of others. Sorry for that, I normally keep people up-to-date when I find the solution to an issue.

le-patenteux avatar Dec 09 '25 14:12 le-patenteux