helm
Nextcloud initializing takes too much time
Hi,
I'm trying to set up Nextcloud through the Helm chart, backed by some PVCs on a shared GlusterFS. GlusterFS works, as I see all of the paths (nextcloud, mariadb, redis master/replica) being written to. The problem is that the nextcloud pod's logs stay stuck at the "initializing nextcloud" stage for more than 7 minutes, with the probes failing and restarting the pod.
Also, if I enable the cronjob, it fails with something like a "wrong image" message.
Here's my values.yaml file:
## Official nextcloud image version
## ref: https://hub.docker.com/r/library/nextcloud/tags/
##
image:
repository: nextcloud
flavor: apache
# If unset, uses flavor + .Chart.AppVersion to create tag
# tag:
pullPolicy: Always
# pullSecrets:
# - myRegistryKeySecretName
nameOverride: ""
fullnameOverride: ""
podAnnotations: {}
deploymentAnnotations: {}
# Number of replicas to be deployed
replicaCount: 1
## Allowing use of ingress controllers
## ref: https://kubernetes.io/docs/concepts/services-networking/ingress/
##
ingress:
enabled: false
# className: nginx
annotations: {}
# nginx.ingress.kubernetes.io/proxy-body-size: 4G
# kubernetes.io/tls-acme: "true"
# cert-manager.io/cluster-issuer: letsencrypt-prod
# nginx.ingress.kubernetes.io/server-snippet: |-
# server_tokens off;
# proxy_hide_header X-Powered-By;
# rewrite ^/.well-known/webfinger /public.php?service=webfinger last;
# rewrite ^/.well-known/host-meta /public.php?service=host-meta last;
# rewrite ^/.well-known/host-meta.json /public.php?service=host-meta-json;
# location = /.well-known/carddav {
# return 301 $scheme://$host/remote.php/dav;
# }
# location = /.well-known/caldav {
# return 301 $scheme://$host/remote.php/dav;
# }
# location = /robots.txt {
# allow all;
# log_not_found off;
# access_log off;
# }
# location ~ ^/(?:build|tests|config|lib|3rdparty|templates|data)/ {
# deny all;
# }
# location ~ ^/(?:autotest|occ|issue|indie|db_|console) {
# deny all;
# }
# tls:
# - secretName: nextcloud-tls
# hosts:
# - nextcloud.kube.home
labels: {}
path: /
pathType: Prefix
# Allow configuration of lifecycle hooks
# ref: https://kubernetes.io/docs/tasks/configure-pod-container/attach-handler-lifecycle-event/
lifecycle: {}
# postStartCommand: []
# preStopCommand: []
phpClientHttpsFix:
enabled: false
protocol: https
nextcloud:
host: nextcloud.domain.com
username: admin
password: password
## Use an existing secret
existingSecret:
enabled: false
# secretName: nameofsecret
# usernameKey: username
# passwordKey: password
# tokenKey: serverinfo_token
# smtpUsernameKey: smtp_username
# smtpPasswordKey: smtp_password
update: 0
# If web server is not binding default port, you can define it
# containerPort: 8080
datadir: /var/www/html/data
persistence:
subPath:
mail:
enabled: true
fromAddress: nextcloud
domain: domain.com
smtp:
host: smtp.domain.com
secure: ssl
port: 587
authtype: LOGIN
name: [email protected]
password: password
# PHP Configuration files
# Will be injected in /usr/local/etc/php/conf.d for apache image and in /usr/local/etc/php-fpm.d when nginx.enabled: true
phpConfigs: {}
# Default config files
# IMPORTANT: Will be used only if you put extra configs, otherwise default will come from nextcloud itself
# Default configurations can be found here: https://github.com/nextcloud/docker/tree/master/16.0/apache/config
defaultConfigs:
# To protect /var/www/html/config
.htaccess: true
# Redis default configuration
redis.config.php: false
# Apache configuration for rewrite urls
apache-pretty-urls.config.php: true
# Define APCu as local cache
apcu.config.php: true
# Apps directory configs
apps.config.php: true
# Used for auto configure database
autoconfig.php: false
# SMTP default configuration
smtp.config.php: true
# Extra config files created in /var/www/html/config/
# ref: https://docs.nextcloud.com/server/15/admin_manual/configuration_server/config_sample_php_parameters.html#multiple-config-php-file
configs:
redis.config.php: |-
<?php
$CONFIG = array (
'memcache.local' => '\\OC\\Memcache\\Redis',
'memcache.distributed' => '\OC\Memcache\Redis',
'memcache.locking' => '\OC\Memcache\Redis',
'redis' => array(
'host' => getenv('REDIS_HOST'),
'port' => getenv('REDIS_HOST_PORT') ?: 6379,
'password' => getenv('REDIS_HOST_PASSWORD')
)
);
custom.config.php: |-
<?php
$CONFIG = array (
'overwriteprotocol' => 'https',
'overwrite.cli.url' => 'https://drive.example.com',
'filelocking.enabled' => 'true',
'loglevel' => '2',
'enable_previews' => true,
'trusted_domains' =>
[
'nextcloud',
'drive.example.com'
]
);
# For example, to use S3 as primary storage
# ref: https://docs.nextcloud.com/server/13/admin_manual/configuration_files/primary_storage.html#simple-storage-service-s3
#
# configs:
# s3.config.php: |-
# <?php
# $CONFIG = array (
# 'objectstore' => array(
# 'class' => '\\OC\\Files\\ObjectStore\\S3',
# 'arguments' => array(
# 'bucket' => 'my-bucket',
# 'autocreate' => true,
# 'key' => 'xxx',
# 'secret' => 'xxx',
# 'region' => 'us-east-1',
# 'use_ssl' => true
# )
# )
# );
## Strategy used to replace old pods
## IMPORTANT: use with care, it is suggested to leave as that for upgrade purposes
## ref: https://kubernetes.io/docs/concepts/workloads/controllers/deployment/#strategy
strategy:
type: Recreate
# type: RollingUpdate
# rollingUpdate:
# maxSurge: 1
# maxUnavailable: 0
##
## Extra environment variables
extraEnv:
# - name: SOME_SECRET_ENV
# valueFrom:
# secretKeyRef:
# name: nextcloud
# key: secret_key
# Extra init containers that runs before pods start.
extraInitContainers: []
# - name: do-something
# image: busybox
# command: ['do', 'something']
# Extra mounts for the pods. Example shown is for connecting a legacy NFS volume
# to NextCloud pods in Kubernetes. This can then be configured in External Storage
extraVolumes:
# - name: nfs
# nfs:
# server: "10.0.0.1"
# path: "/nextcloud_data"
# readOnly: false
extraVolumeMounts:
# - name: nfs
# mountPath: "/legacy_data"
# Extra securityContext parameters. For example, you may need to define the runAsNonRoot directive
# extraSecurityContext:
# runAsUser: "33"
# runAsGroup: "33"
# runAsNonRoot: true
# readOnlyRootFilesystem: true
nginx:
## You need to set an fpm version of the image for nextcloud if you want to use nginx!
enabled: false
image:
repository: nginx
tag: alpine
pullPolicy: Always
config:
# This generates the default nginx config as per the nextcloud documentation
default: true
# custom: |-
# worker_processes 1;..
resources: {}
internalDatabase:
enabled: false
name: nextcloud
##
## External database configuration
##
externalDatabase:
enabled: false
## Supported database engines: mysql or postgresql
type: mysql
## Database host
host:
## Database user
user: nextcloud
## Database password
password:
## Database name
database: nextcloud
## Use an existing secret
existingSecret:
enabled: false
# secretName: nameofsecret
# usernameKey: username
# passwordKey: password
##
## MariaDB chart configuration
##
mariadb:
## Whether to deploy a mariadb server to satisfy the applications database requirements. To use an external database set this to false and configure the externalDatabase parameters
enabled: true
auth:
database: nextcloud
username: admin
password: password
architecture: standalone
## Enable persistence using Persistent Volume Claims
## ref: http://kubernetes.io/docs/user-guide/persistent-volumes/
##
primary:
persistence:
enabled: true
existingClaim: nextcloud-mariadb-pvc
accessMode: ReadWriteOnce
size: 8Gi
##
## PostgreSQL chart configuration
## for more options see https://github.com/bitnami/charts/tree/master/bitnami/postgresql
##
postgresql:
enabled: false
global:
postgresql:
auth:
username: nextcloud
password: changeme
database: nextcloud
primary:
persistence:
enabled: false
# storageClass: ""
##
## Redis chart configuration
## for more options see https://github.com/bitnami/charts/tree/master/bitnami/redis
##
redis:
enabled: true
auth:
enabled: true
password: password
master:
persistence:
existingClaim: redis-master-pvc
replica:
persistence:
existingClaim: redis-replica-pvc
## Cronjob to execute Nextcloud background tasks
## ref: https://docs.nextcloud.com/server/latest/admin_manual/configuration_server/background_jobs_configuration.html#webcron
##
cronjob:
enabled: false
# Nextcloud image is used as default but only curl is needed
image: {}
#repository: nextcloud
#tag: apache
#pullPolicy: Always
# pullSecrets:
# - myRegistryKeySecretName
# Every 5 minutes
# Note: Setting this to any value other than 5 minutes might
# cause issues with how Nextcloud background jobs are executed
schedule: "*/5 * * * *"
annotations: {}
# Set curl's insecure option if you use e.g. self-signed certificates
curlInsecure: false
failedJobsHistoryLimit: 5
successfulJobsHistoryLimit: 2
# If not set, nextcloud deployment one will be set
# resources:
# We usually recommend not to specify default resources and to leave this as a conscious
# choice for the user. This also increases chances charts run on environments with little
# resources, such as Minikube. If you do want to specify resources, uncomment the following
# lines, adjust them as necessary, and remove the curly braces after 'resources:'.
# limits:
# cpu: 100m
# memory: 128Mi
# requests:
# cpu: 100m
# memory: 128Mi
# If not set, nextcloud deployment one will be set
# nodeSelector: {}
# If not set, nextcloud deployment one will be set
# tolerations: []
# If not set, nextcloud deployment one will be set
# affinity: {}
service:
type: ClusterIP
port: 8080
loadBalancerIP: nil
nodePort: nil
## Enable persistence using Persistent Volume Claims
## ref: http://kubernetes.io/docs/user-guide/persistent-volumes/
##
persistence:
# Nextcloud Data (/var/www/html)
enabled: true
annotations: {}
## nextcloud data Persistent Volume Storage Class
## If defined, storageClassName: <storageClass>
## If set to "-", storageClassName: "", which disables dynamic provisioning
## If undefined (the default) or set to null, no storageClassName spec is
## set, choosing the default provisioner. (gp2 on AWS, standard on
## GKE, AWS & OpenStack)
##
# storageClass: "-"
## A manually managed Persistent Volume and Claim
## Requires persistence.enabled: true
## If defined, PVC must be created manually before volume will be bound
# existingClaim:
existingClaim: nextcloud-pvc
accessMode: ReadWriteMany
size: 8Gi
## Use an additional pvc for the data directory rather than a subpath of the default PVC
## Useful to store data on a different storageClass (e.g. on slower disks)
nextcloudData:
enabled: true
subPath:
annotations: {}
# storageClass: "-"
existingClaim: nextcloud-data-pvc
accessMode: ReadWriteMany
size: 5Ti
resources: {}
# We usually recommend not to specify default resources and to leave this as a conscious
# choice for the user. This also increases chances charts run on environments with little
# resources, such as Minikube. If you do want to specify resources, uncomment the following
# lines, adjust them as necessary, and remove the curly braces after 'resources:'.
# limits:
# cpu: 100m
# memory: 128Mi
# requests:
# cpu: 100m
# memory: 128Mi
## Liveness and readiness probe values
## Ref: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#container-probes
##
livenessProbe:
enabled: true
initialDelaySeconds: 500
periodSeconds: 10
timeoutSeconds: 5
failureThreshold: 3
successThreshold: 1
readinessProbe:
enabled: true
initialDelaySeconds: 500
periodSeconds: 10
timeoutSeconds: 5
failureThreshold: 3
successThreshold: 1
startupProbe:
enabled: false
initialDelaySeconds: 500
periodSeconds: 10
timeoutSeconds: 5
failureThreshold: 30
successThreshold: 1
## Enable pod autoscaling using HorizontalPodAutoscaler
## ref: https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/
##
hpa:
enabled: true
cputhreshold: 60
minPods: 1
maxPods: 10
nodeSelector: {}
tolerations: []
affinity: {}
## Prometheus Exporter / Metrics
##
metrics:
enabled: true
replicaCount: 1
# The metrics exporter needs to know how you serve Nextcloud either http or https
https: false
# Use API token if set, otherwise fall back to password authentication
# https://github.com/xperimental/nextcloud-exporter#token-authentication
# Currently you still need to set the token manually in your nextcloud install
token: ""
timeout: 5s
image:
repository: xperimental/nextcloud-exporter
tag: 0.5.1
pullPolicy: IfNotPresent
## Metrics exporter resource requests and limits
## ref: http://kubernetes.io/docs/user-guide/compute-resources/
##
# resources: {}
## Metrics exporter pod Annotation and Labels
# podAnnotations: {}
# podLabels: {}
service:
type: ClusterIP
## Use serviceLoadBalancerIP to request a specific static IP,
## otherwise leave blank
# loadBalancerIP:
annotations:
prometheus.io/scrape: "true"
prometheus.io/port: "9205"
labels:
app: nextcloud
release: kube-prometheus-stack
## Prometheus Operator ServiceMonitor configuration
##
serviceMonitor:
## @param metrics.serviceMonitor.enabled Create ServiceMonitor Resource for scraping metrics using PrometheusOperator
##
enabled: false
## @param metrics.serviceMonitor.namespace Namespace in which Prometheus is running
##
namespace: monitoring
## @param metrics.serviceMonitor.jobLabel The name of the label on the target service to use as the job name in prometheus.
##
jobLabel: nextcloud
## @param metrics.serviceMonitor.interval Interval at which metrics should be scraped
## ref: https://github.com/coreos/prometheus-operator/blob/master/Documentation/api.md#endpoint
##
interval: 30s
## @param metrics.serviceMonitor.scrapeTimeout Specify the timeout after which the scrape is ended
## ref: https://github.com/coreos/prometheus-operator/blob/master/Documentation/api.md#endpoint
##
scrapeTimeout: ""
## @param metrics.serviceMonitor.labels Extra labels for the ServiceMonitor
##
labels:
release: kube-prometheus-stack
app: nextcloud
rbac:
enabled: false
serviceaccount:
create: true
name: nextcloud-serviceaccount
annotations: {}
Finally working, with proper DNS added. I had to add the DNS entry and a Traefik IngressRoute to use it.
I'm opening a new issue because Nextcloud now asks me if I'm reinstalling, and I'm actually not, right?
Hey urbaman!
Initialization takes a long time and I had similar issues.
If you still have problems, try the first init with the cronjob disabled and the liveness, readiness, and startup probes set to 'false'.
Watch the logs of the nextcloud container until init and install have finished.
Then adjust your values.yaml:
- Set the pull policy of nextcloud to IfNotPresent
- Enable the cronjob again
- Enable the probes again
Then try to helm upgrade your installation. If that doesn't work, helm uninstall and then install again, of course without deleting the persistence.
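As a sketch, a temporary first-install override along those lines could look like this (the key names match the chart's values.yaml above; the filename is just an example, applied with something like `helm upgrade --install nextcloud nextcloud/nextcloud -f first-install-values.yaml`):

```yaml
# first-install-values.yaml -- temporary override for the very first install:
# cronjob and probes off so Kubernetes doesn't restart the pod mid-initialization
image:
  pullPolicy: IfNotPresent
cronjob:
  enabled: false
livenessProbe:
  enabled: false
readinessProbe:
  enabled: false
startupProbe:
  enabled: false
```

Once the container logs show that initialization has finished, re-run the upgrade with your normal values (probes and cronjob enabled again).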
Hope that helps.
Greetz
In my case, startupProbe.enabled is set to false by default. This causes the liveness and readiness probes to run too soon, which restarts the container before it can actually do the first-time setup. The logs don't show any errors because the container is being restarted externally, not because the container itself had a problem. That's why you just see the initializing message endlessly.
The fix for me was to set startupProbe.enabled to true, which lets the container initialize before the readiness and liveness probes kick in and start rebooting it. If your cluster is slower, you may need to adjust the values under the startupProbe configuration to give it more time to start up.
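A minimal values override for that fix might look like this (the specific numbers here are an assumption to tune for your cluster; while a startup probe is failing, Kubernetes holds off the liveness and readiness probes, so the container gets up to roughly periodSeconds × failureThreshold to finish initializing):

```yaml
# Give the first-time setup plenty of time before the other probes take over.
startupProbe:
  enabled: true
  initialDelaySeconds: 30
  periodSeconds: 20
  timeoutSeconds: 5
  failureThreshold: 90   # ~30 minutes max before the pod is restarted
  successThreshold: 1
```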
The startup probe needs to be enabled by default and changed from an httpGet to running a command that:
- checks whether the initialization step is complete, such as checking if 'rsync' is running, and waits if it hasn't finished (the first run takes a long time)
- after it has verified it isn't initializing (which should be a quick check after the first run), runs some test that makes sense for every startup (or maybe the startup probe is just that: it waits for the initialization to complete if there is one)
- maybe when Nextcloud first starts up it could set a flag somewhere that it is initializing, set it to initialized after the initialization completes, and the startupProbe could just check whether that flag shows initialized
For me the rsync step took 3 minutes 30 seconds.
The alternative is just to keep restarting the rsync over and over again until it finally has a chance to complete. Setting the startup probe with an initial delay of 4 minutes works ... but we don't really want to change that for the first run and then change it back every time afterwards.
Note: 4 minutes worked for the rsync to complete, but it wasn't enough to finish the initial installation; the pod crashed and restarted 10 more times before the readiness and liveness probes were happy. I'm betting the initialization wasn't complete.
A note in the log that it's performing its first-time initialization and that it may take a long time, up to 5 minutes, would be good to have, since admins check the logs to see why it's taking so long.
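A command-based startup probe along those lines could look like this in the container spec (a sketch, not the chart's current template; the `version.php` check is an assumption about how to detect that the image's entrypoint has finished copying the application code):

```yaml
# Sketch of an exec startup probe for the nextcloud container.
# Assumption: once the entrypoint's install/rsync has finished,
# /var/www/html/version.php exists and no rsync process is running.
startupProbe:
  exec:
    command:
      - sh
      - -c
      - test -f /var/www/html/version.php && ! pgrep -x rsync
  periodSeconds: 10
  failureThreshold: 60   # up to ~10 minutes for first-time initialization
```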
It's not apparent to me why Nextcloud copies application code into the persistent storage location:
/mnt/nfs/services/nextcloud-nextcloud/html/3rdparty/aws/aws-sdk-php
In my eyes, this is not a valid approach for containerized applications.
My current configuration:
persistence:
enabled: true
storageClass: nfs
accessMode: ReadWriteMany
I'm actually not sure if that's something we're doing in the helm chart or if this is being done by something in the docker container, which would be a nextcloud/docker issue 🤔 I haven't had a moment to check though.
That's how the docker image works. I don't like it either, but AFAIK there is no way around it due to the architecture of Nextcloud (which is not very container friendly).
The way the chart currently works, thankfully, means that using a properly immutable image is possible. I did fork the nextcloud image and build a POC image which can run and perform basic upgrades without any rsyncing: https://github.com/thefirstofthe300/nextcloud-docker
The image isn't fully tested so there's likely some pieces that are broken, but that's mostly a matter of adding things back that I may have not reworked properly after gutting the install/upgrade logic. I welcome testing of the image, but please DON'T use it on production data yet.
In the name of testing, I do have a PR open to add removing the /var/www and /var/www/html mounts: #496
The lack of rsyncing almost certainly breaks folks using Docker Compose, so submitting the image upstream may not be the easiest.