docker Very slow upgrade process

Issue

When upgrading Nextcloud, I saw the upgrade hang for ~30 minutes before it continued. (This had already happened on earlier upgrades, but it only hung for about 5 minutes.)

I currently have no idea what is causing this. If someone has already hit the same issue, I can try changing my config. Otherwise, if someone could suggest commands to run during the next upgrade to find the cause, that would be nice.

Files / Logs

Here are the init logs (note the time gap between lines 6 and 7):

2025-06-15T11:36:13.887214965Z Conf remoteip disabled.
2025-06-15T11:36:13.887277700Z To activate the new configuration, you need to run:
2025-06-15T11:36:13.887292103Z   service apache2 reload
2025-06-15T11:36:13.897353629Z Configuring Redis as session handler
2025-06-15T11:36:14.313096226Z Initializing nextcloud 31.0.6.2 ...
2025-06-15T11:36:14.313114123Z Upgrading nextcloud from 31.0.5.1 ...
2025-06-15T12:01:09.655694259Z => Searching for hook scripts (*.sh) to run, located in the folder "/docker-entrypoint-hooks.d/pre-upgrade"
2025-06-15T12:01:09.659895462Z ==> Skipped: the "pre-upgrade" folder is empty (or does not exist)
2025-06-15T12:01:33.358814697Z Nextcloud or one of the apps require upgrade - only a limited number of commands are available
2025-06-15T12:01:33.358851309Z You may use your browser or the occ upgrade command to do the upgrade
2025-06-15T12:01:33.767487087Z Setting log level to debug
2025-06-15T12:01:34.520048300Z Turned on maintenance mode
2025-06-15T12:01:38.139143538Z Updating database schema
2025-06-15T12:01:38.247622306Z Updated database
2025-06-15T12:01:55.975508497Z Starting code integrity check...
2025-06-15T12:04:32.869696894Z Finished code integrity check
2025-06-15T12:04:33.119839658Z Update successful
2025-06-15T12:04:33.261717198Z Turned off maintenance mode
2025-06-15T12:04:33.262340213Z Resetting log level
2025-06-15T12:04:35.705705010Z The following apps have been disabled:
2025-06-15T12:04:35.707650208Z => Searching for hook scripts (*.sh) to run, located in the folder "/docker-entrypoint-hooks.d/post-upgrade"
2025-06-15T12:04:35.708894322Z ==> Skipped: the "post-upgrade" folder is empty (or does not exist)
2025-06-15T12:04:35.708922769Z Initializing finished
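
To find where the time goes without reading timestamps by eye, the gap between consecutive log lines can be computed. This is a minimal sketch, assuming each line starts with a timestamp in the format shown above (as produced by, for example, `docker service logs --timestamps`). The function names `parse_ts` and `largest_gap` are made up for this illustration.

```python
from datetime import datetime, timezone


def parse_ts(line):
    """Return the leading timestamp of a log line as seconds since the epoch."""
    stamp = line.split(" ", 1)[0]  # e.g. 2025-06-15T11:36:14.313114123Z
    base, _, frac = stamp.rstrip("Z").partition(".")
    dt = datetime.strptime(base, "%Y-%m-%dT%H:%M:%S").replace(tzinfo=timezone.utc)
    # The fraction has nanosecond precision, so it is added separately as a float.
    return dt.timestamp() + (float("0." + frac) if frac else 0.0)


def largest_gap(lines):
    """Return (gap_seconds, line_before, line_after) for the longest pause."""
    best = (0.0, None, None)
    for prev, cur in zip(lines, lines[1:]):
        gap = parse_ts(cur) - parse_ts(prev)
        if gap > best[0]:
            best = (gap, prev, cur)
    return best


# Three lines copied from the log above.
sample = [
    "2025-06-15T11:36:14.313096226Z Initializing nextcloud 31.0.6.2 ...",
    "2025-06-15T11:36:14.313114123Z Upgrading nextcloud from 31.0.5.1 ...",
    '2025-06-15T12:01:09.655694259Z => Searching for hook scripts (*.sh) to run, located in the folder "/docker-entrypoint-hooks.d/pre-upgrade"',
]
gap, before, after = largest_gap(sample)
print(f"{gap / 60:.1f} min pause after: {before}")
```

Run against the full log, this points at the step right after "Upgrading nextcloud from ...", which is where the entrypoint sits before the pre-upgrade hooks are searched.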

Compose/stack file (I'm using Docker Swarm):

services:
  db:
    image: mariadb:10.6
    restart: always
    command: --transaction-isolation=READ-COMMITTED --log-bin=binlog --binlog-format=ROW
    volumes:
      - db1:/var/lib/mysql
    environment:
      - MARIADB_RANDOM_ROOT_PASSWORD=1
      - MYSQL_PASSWORD=xx
      - MYSQL_DATABASE=xx
      - MYSQL_USER=xx
    deploy:
      replicas: 1
      placement:
        constraints: [node.labels.worker == true]
      resources:
        limits:
          cpus: '0.50'
          memory: 250M
        reservations:
          memory: 100M

  redis:
    image: redis
    restart: always
    volumes:
      - redis1:/var/lib/redis/data
    command: redis-server --requirepass xx
    deploy:
      replicas: 1
      placement:
        constraints: [node.labels.worker == true]
      resources:
        limits:
          cpus: '0.50'
          memory: 250M
        reservations:
          memory: 100M

  app:
    image: nextcloud
    restart: always
    networks:
      - default
      - traefik
      - ldap
    links:
      - db
      - redis
    volumes:
      - nextcloud1:/var/www/html
    environment:
      - MYSQL_PASSWORD=xx
      - MYSQL_DATABASE=xx
      - MYSQL_USER=xx
      - MYSQL_HOST=db
      - REDIS_HOST=redis
      - REDIS_HOST_PASSWORD=xx
      - APACHE_DISABLE_REWRITE_IP=1
      - PHP_MEMORY_LIMIT=2048M
    deploy:
      replicas: 1
      placement:
        constraints: [node.labels.worker == true]
      labels:
        - traefik.enable=true
        - .... others traefik labels
      
  cron:
    image: nextcloud
    restart: always
    networks:
      - default
      - ldap
    links:
      - db
      - redis
    volumes:
      - nextcloud1:/var/www/html
    entrypoint: /cron.sh
    environment:
      - MYSQL_PASSWORD=xx
      - MYSQL_DATABASE=xx
      - MYSQL_USER=xx
      - MYSQL_HOST=db
      - REDIS_HOST=redis
      - REDIS_HOST_PASSWORD=xx
    deploy:
      replicas: 1
      placement:
        constraints: [node.labels.worker == true]

networks:
  default:
  traefik:
    external: true
  ldap:
    external: true

# volumes omitted; not really relevant, as they are connected and working

Server config:

{
    "system": {
        "default_phone_region": "FR",
        "htaccess.RewriteBase": "\/",
        "memcache.local": "\\OC\\Memcache\\APCu",
        "apps_paths": [
            {
                "path": "\/var\/www\/html\/apps",
                "url": "\/apps",
                "writable": false
            },
            {
                "path": "\/var\/www\/html\/custom_apps",
                "url": "\/custom_apps",
                "writable": true
            }
        ],
        "upgrade.disable-web": true,
        "instanceid": "***REMOVED SENSITIVE VALUE***",
        "passwordsalt": "***REMOVED SENSITIVE VALUE***",
        "secret": "***REMOVED SENSITIVE VALUE***",
        "trusted_domains": [
            "cloud.<domain>",
            "cloud.<server1>.<domain>",
            "cloud.<server2>.<domain>",
            "cloud.<server3>.<domain>"
        ],
        "datadirectory": "***REMOVED SENSITIVE VALUE***",
        "dbtype": "mysql",
        "version": "31.0.6.2",
        "overwrite.cli.url": "https:\/\/cloud.<domain>",
        "trusted_proxies": "***REMOVED SENSITIVE VALUE***",
        "forwarded_for_headers": [
            "HTTP_X_FORWARDED_FOR"
        ],
        "overwriteprotocol": "https",
        "dbname": "***REMOVED SENSITIVE VALUE***",
        "dbhost": "***REMOVED SENSITIVE VALUE***",
        "dbport": "",
        "dbtableprefix": "oc_",
        "mysql.utf8mb4": true,
        "dbuser": "***REMOVED SENSITIVE VALUE***",
        "dbpassword": "***REMOVED SENSITIVE VALUE***",
        "installed": true,
        "ldapProviderFactory": "OCA\\User_LDAP\\LDAPProviderFactory",
        "defaultapp": "files,calendar",
        "maintenance": false,
        "filelocking.enabled": true,
        "memcache.locking": "\\OC\\Memcache\\Redis",
        "redis": {
            "host": "***REMOVED SENSITIVE VALUE***",
            "password": "***REMOVED SENSITIVE VALUE***",
            "port": 6379
        },
        "memcache.distributed": "\\OC\\Memcache\\Redis",
        "mail_from_address": "***REMOVED SENSITIVE VALUE***",
        "mail_smtpmode": "smtp",
        "mail_sendmailmode": "smtp",
        "mail_domain": "***REMOVED SENSITIVE VALUE***",
        "mail_smtphost": "***REMOVED SENSITIVE VALUE***",
        "mail_smtpsecure": "ssl",
        "mail_smtpport": "465",
        "mail_smtpauth": 1,
        "mail_smtpname": "***REMOVED SENSITIVE VALUE***",
        "mail_smtppassword": "***REMOVED SENSITIVE VALUE***",
        "maintenance_window_start": 2,
        "app_install_overwrite": [
            "google_synchronization"
        ],
        "loglevel": 1
    }
}

Jun 15 '25 12:06 ibaraki-douji

Hello. Can you test outside docker-swarm? Never had this problem using compose standalone.

Jul 08 '25 17:07 henmohr

@henmohr Swarm isn't the issue; I also had it with plain Compose, though not for a full 30 minutes.

There are two possible causes (and I think it's both):

I'm using distributed storage (CephFS), which is slower than local NVMe (network overhead, and not all hosts have NVMe; some only have SSDs).
During another upgrade, I saw rsync processes running while it was stuck, so it might be related to #1904.
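
To check the storage theory, one option is to time the creation of many small files on the volume, since an rsync of an application tree is mostly small-file writes. This is a rough sketch, not a proper benchmark: the function name `time_small_file_writes` and the defaults for `count` and `size` are arbitrary choices for this illustration.

```python
import os
import tempfile
import time


def time_small_file_writes(directory, count=500, size=4096):
    """Write `count` files of `size` bytes under `directory`; return elapsed seconds.

    The files go into a scratch subdirectory that is removed afterwards.
    """
    payload = b"\0" * size
    with tempfile.TemporaryDirectory(dir=directory) as scratch:
        start = time.perf_counter()
        for i in range(count):
            with open(os.path.join(scratch, f"f{i:05d}"), "wb") as fh:
                fh.write(payload)
        elapsed = time.perf_counter() - start
    return elapsed


# Demo on the system temp directory; replace it with a path on the volume to test.
target = tempfile.gettempdir()
print(f"{target}: {time_small_file_writes(target):.3f} s for 500 small files")
```

Running it once against a directory on the CephFS-backed volume and once against a local disk gives a rough ratio. If the CephFS figure is much larger, the file copy during the upgrade is the likely bottleneck.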

Jul 09 '25 15:07 ibaraki-douji

@henmohr and @ibaraki-douji

Some suggested changes that might help:

-> Pre-seed key folders only via Docker volumes:

Mount only critical paths and skip syncing major blocks of files:

volumes:
  - nextcloud_config:/var/www/html/config
  - nextcloud_data:/var/www/html/data
  - nextcloud_custom_apps:/var/www/html/custom_apps
  - nextcloud_themes:/var/www/html/themes

-> Temporarily switch /var/www/html to local SSD/NVMe during upgrades:

    - Use a fast local volume for /var/www/html, which should speed up rsync considerably, then revert to CephFS afterward.

-> Optionally, build a custom Nextcloud image that avoids rsync entirely:

    - Embed the application files in the image and mount only the dynamic directories. This avoids the expensive sync on each restart. Inspired by discussions in the Docker repo issue

Aug 19 '25 05:08 Abhicodeitout