feat: improve deployment modes and persistence validation
This PR improves the validation and documentation of deployment modes in the n8n Helm chart.
Key changes:
- Add validation to prevent invalid configurations:
  - Deployment with PVC cannot have multiple replicas
  - Webhook with PVC cannot have multiple replicas
- Update documentation with detailed deployment modes:
  - Main component: Deployment vs StatefulSet
  - Worker component: Both modes supported
  - Webhook component: Deployment with persistence limitations
- Add examples for different deployment scenarios
- Document Enterprise license requirements
- Document persistence limitations for all components
Note: This PR fixes potential issues with data loss and improves user experience by providing clear error messages and documentation.
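For illustration, the kind of check this introduces could look roughly like the sketch below. The helper name `n8n.validateValues` comes from the chart changes in this PR, but the exact conditions, value keys, and error wording here are assumptions rather than the actual implementation:

```yaml
{{/* _helpers.tpl (sketch): fail fast on unsupported persistence/replica combinations */}}
{{- define "n8n.validateValues" -}}
{{- if and .Values.main.persistence.enabled (not .Values.main.statefulSet.enabled) (gt (int .Values.main.replicaCount) 1) }}
{{- fail "main: a Deployment with a PVC cannot run multiple replicas; enable main.statefulSet.enabled or set replicaCount to 1" }}
{{- end }}
{{- if and .Values.webhook.persistence.enabled (gt (int .Values.webhook.replicaCount) 1) }}
{{- fail "webhook: a Deployment with a PVC cannot run multiple replicas; set webhook.replicaCount to 1 or disable webhook persistence" }}
{{- end }}
{{- end -}}
```

Because `fail` aborts template rendering, misconfigurations surface as a clear error at install/upgrade time instead of a broken rollout.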
Summary by CodeRabbit

- **New Features**
  - Support for deploying main and worker as StatefulSets and independent scaling of webhook/worker components; per-component persistence options.
- **Documentation**
  - Expanded README with deployment modes, HA guidance, migration steps, supplemental manifests, and Valkey config.
- **Bug Fixes**
  - Validation added to prevent unsupported persistence/replica combinations.
- **Tests**
  - Chart lint/test updated to validate StatefulSet rendering.
Walkthrough
Introduces component-specific StatefulSet support and PVCs for main/worker/webhook, adds validation for replica/persistence combinations, refactors Helm values into main/worker/webhook, adds README deployment/migration guidance, and updates chart version and CI linting to validate StatefulSet rendering.
Changes
| Cohort / File(s) | Change Summary |
|---|---|
| **Documentation**<br>`README.md` | Expanded README with Deployment Modes, High Availability, migration guide, extra manifests, Valkey notes, licensing and YAML examples; added breaking changes banner. |
| **Chart metadata**<br>`charts/n8n/Chart.yaml` | Bumped chart version to 1.1.0 and appVersion update. |
| **Helpers & validation**<br>`charts/n8n/templates/_helpers.tpl` | Added `n8n.webhook.pvc`, `n8n.worker.pvc`, `n8n.validateValues`; updated `n8n.pvc` name to append `-main`; removed old Valkey validation. |
| **PVCs**<br>`charts/n8n/templates/pvc.yaml` | Added top-level `n8n.validateValues`; main PVC renamed `-main`; added `-webhook` and `-worker` PVCs with per-component persistence logic. |
| **StatefulSet templates**<br>`charts/n8n/templates/statefulset.yaml`, `charts/n8n/templates/statefulset.worker.yaml` | New StatefulSet manifests for main and worker, gated by `.Values.*.statefulSet.enabled`, with volumeClaimTemplates and full Helm value support. |
| **Deployments (conditional)**<br>`charts/n8n/templates/deployment.yaml`, `charts/n8n/templates/deployment.worker.yaml`, `charts/n8n/templates/deployment.webhook.yaml` | Deployments now conditional when corresponding `statefulSet.enabled` is false; added `n8n.validateValues` calls; moved/adjusted securityContext and conditional volume mounts/PVC usage to component-specific helpers; pod annotation hashing scoped to component values. |
| **HPA**<br>`charts/n8n/templates/hpa.yaml` | Added `n8n.validateValues` inclusion before HPA resource rendering. |
| **Values**<br>`charts/n8n/values.yaml` | Added `statefulSet.enabled` flags under `main` and `worker`; minor commented worker env additions. |
| **CI workflow**<br>`.github/workflows/lint-test.yaml` | Downgraded chart-testing action version and added conditional lint step that renders chart with statefulset mode and verifies `kind: StatefulSet`. |
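As a rough usage sketch of the resulting values layout (key names follow the component structure described above; sizes and the storage class are placeholders, and defaults may differ in the actual chart):

```yaml
main:
  statefulSet:
    enabled: true        # render the main component as a StatefulSet with volumeClaimTemplates
  persistence:
    enabled: true
    size: 1Gi
worker:
  statefulSet:
    enabled: true        # workers may also run as a StatefulSet
  persistence:
    enabled: true
    size: 1Gi
webhook:
  persistence:
    enabled: false       # webhook stays a Deployment; with a PVC it is limited to one replica
```

When `statefulSet.enabled` is left false, the corresponding component keeps rendering as a Deployment, as noted in the table above.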
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant Helm
    participant Validator as n8n.validateValues
    participant Kubernetes

    User->>Helm: helm install/upgrade (values.yaml)
    Helm->>Validator: run validation checks
    alt validation fails
        Validator-->>Helm: error -> abort render
    else validation succeeds
        Helm->>Kubernetes: render templates (Deployment/StatefulSet, PVCs, HPA)
        alt main.statefulSet.enabled
            Kubernetes->>Kubernetes: create StatefulSet (main) + PVC (-main)
        else
            Kubernetes->>Kubernetes: create Deployment (main) + PVC/emptyDir
        end
        alt worker.statefulSet.enabled
            Kubernetes->>Kubernetes: create StatefulSet (worker) + PVC (-worker)
        else
            Kubernetes->>Kubernetes: create Deployment (worker) + PVC/emptyDir
        end
        Kubernetes->>Kubernetes: create Webhook Deployment (+/- webhook PVC)
    end
```
Estimated code review effort
🎯 3 (Moderate) | ⏱️ ~20–30 minutes
Possibly related PRs
- 8gears/n8n-helm-chart#132 — Restructures chart around main/worker/webhook, adds component PVCs and validation helpers (strong overlap).
- 8gears/n8n-helm-chart#103 — Related CI/workflow changes for chart testing and linting; builds on workflow steps used here.
- 8gears/n8n-helm-chart#130 — README and migration/branding edits that overlap documentation changes.
Suggested reviewers
- Vad1mo
Pre-merge checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title Check | ✅ Passed | The title succinctly describes the primary focus of the changeset by highlighting improvements to deployment modes and the addition of persistence validation, providing clear and concise context for reviewers at a glance. |
| Description Check | ✅ Passed | The description accurately summarizes the validation logic added to prevent invalid configurations and the documentation enhancements for deployment modes, reflecting the detailed changes in the pull request without being off-topic or overly vague. |
| Docstring Coverage | ✅ Passed | No functions found in the changes. Docstring coverage check skipped. |
Very nice thank you
@Vad1mo what do you think about this PR?
I need some more time to review this..
Isn't this an anti-pattern for an n8n high-availability setup? The official n8n documentation on scaling suggests using Redis instead. Are there any benefits to using persistent storage instead of Redis for queuing? Persistence could be useful if you are using one pod, IMO.
@dominykasn This is just to persist community nodes between pod restarts. Redis is still used as the queue backend in queue mode — nothing changes there. Or maybe I didn’t fully get your concern?
@Vad1mo up
@Vad1mo up
sorry, a bit delayed, can you rebase and I'll have a look at this tonight.
@Vad1mo done, please check this out
@Vad1mo up 😊
You have broken the README.md file; can you take a closer look at it again?
@Vad1mo is this PR still going anywhere? It would address some of my concerns around the statefulset vs persistency things I mentioned in #204 and also #189.
I would propose a much smaller PR that removes the volumes from workers? But I don't want to derail this effort.
Do workers need volumes at all?
Do workers need volumes at all?
Workers can benefit from persistent volumes for community nodes installation. This allows custom nodes to persist between worker restarts, avoiding the need to reinstall them each time. However, users have full control - they can simply set worker.persistence.enabled: false if they don't need persistent storage for workers. The implementation provides flexibility without forcing persistence on users who don't need it.
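For example, opting out would just be the following values.yaml fragment (a minimal sketch; the key name is as used in this PR):

```yaml
worker:
  persistence:
    enabled: false   # workers get no PVC; community nodes would need to be reinstalled after a restart
```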
@aleksandrovpa Thanks for the context!
@Vad1mo The current implementation in main doesn't support that at all (see #204) and leads to a bunch of errors reported in #189 and #204, but I see it is actually addressed by this PR.
@aleksandrovpa can you rebase this once more? Maybe this gets merged then? :)
@aleksandrovpa I just deployed your updated branch, works! :)
@aleksandrovpa I just deployed your updated branch, works! :)
It's awesome! I know that it works well, because I've been using it for more than 3 months without any problems 😆 Hope it'll help others too :)
@Vad1mo do you know who we can ask to merge this?
Hi @aleksandrovpa and @till , Could you please share how you typically install n8n community nodes on both the main and worker instances?
I attempted to do this using a custom n8n Docker image based on n8nio/n8n:1.103.0. Here's what I tried:
```dockerfile
FROM n8nio/n8n:1.103.0

USER root
RUN apk update && apk add mysql-client

# Install needed libraries globally
RUN npm install -g jwks-rsa jsonwebtoken

# Set correct permissions
USER node

# Install Community Nodes
WORKDIR /home/node/.n8n/nodes
RUN npm install @mbakgun/n8n-nodes-slack-socket-mode

WORKDIR /home/node
EXPOSE 5678
```
However, it seems that the n8n-helm-chart overrides the /home/node/.n8n directory during deployment, so I’m unable to find the @mbakgun/n8n-nodes-slack-socket-mode package there afterward.
n8n-helm-chart config
```yaml
# This is the n8n helm chart
n8n-helm-chart:
  enabled: true
  imagePullSecrets:
    - name: provisioner.registry-secret
  image:
    repository: docker-custom.com/n8n-oasis/n8n-oasis-img
    tag: *imageTag
  main:
    service:
      port: 5678
    replicaCount: 1
    podLabels:
      use-default-egress-policy: "true"
    deploymentLabels:
      use-default-egress-policy: "true"
    deploymentStrategy:
      type: RollingUpdate
      maxSurge: 25%
      maxUnavailable: 0%
    resources:
      requests:
        cpu: 1000m
        memory: 2048Mi
      limits:
        cpu: 2000m
        memory: 4096Mi
    persistence:
      enabled: false
      type: emptyDir
      accessModes:
        - ReadWriteOnce
    extraEnv: &extraEnv
      N8N_ENCRYPTION_KEY:
        valueFrom:
          secretKeyRef:
            name: n8n-secret
            key: N8N_ENCRYPTION_KEY
      N8N_RUNNERS_ENABLED:
        value: "true"
      N8N_RUNNER:
        value: "true"
      NODE_FUNCTION_ALLOW_EXTERNAL:
        value: "jsonwebtoken,jwks-rsa"
      N8N_CUSTOM_EXTENSIONS:
        value: "true"
      N8N_ENFORCE_SETTINGS_FILE_PERMISSIONS:
        value: "true"
      N8N_PROTOCOL:
        value: "https"
      QUEUE_BULL_REDIS_HOST:
        value: "app-redis-master"
      QUEUE_BULL_REDIS_PORT:
        value: "6379"
      QUEUE_HEALTH_CHECK_ACTIVE:
        value: "true"
      DB_TYPE:
        value: "postgresdb"
      DB_POSTGRESDB_DATABASE:
        value: "n8n"
      DB_POSTGRESDB_HOST:
        value: "app-postgresql-ha-pgpool"
      DB_POSTGRESDB_PORT:
        value: "5432"
      DB_POSTGRESDB_USER:
        value: "postgres"
      DB_POSTGRESDB_PASSWORD:
        valueFrom:
          secretKeyRef:
            name: postgresql-ha-postgresql
            key: password
  worker:
    concurrency: 5
    replicaCount: 5
    extraEnv:
      <<: *extraEnv
      N8N_RUNNER:
        value: "false"
    deploymentStrategy:
      type: RollingUpdate
      maxSurge: 25%
      maxUnavailable: 0%
    resources:
      requests:
        cpu: 1000m
        memory: 2048Mi
      limits:
        cpu: 2000m
        memory: 4096Mi
    persistence:
      enabled: false
      type: emptyDir
      accessModes:
        - ReadWriteOnce
  redis:
    # Change to `false` to disable in-cluster redis deployment.
    enabled: true
    # n8n uses redis as a message broker.
    # Not Cluster-Aware: Pub/Sub does not work seamlessly across cluster nodes. Subscribers must be connected to the same node where the message is published.
    architecture: standalone
    ## Redis(R) Authentication parameters
    ## ref: https://github.com/bitnami/containers/tree/main/bitnami/redis#setting-the-server-password-on-first-run
    ##
    auth:
      enabled: false
    networkPolicy:
      enabled: false
    usePassword: false
    metrics:
      enabled: false

postgre-ha:
  enabled: true
  postgresql-ha:
    global:
      postgresql:
        existingSecret: 'postgresql-ha-postgresql'
      pgpool:
        existingSecret: 'postgresql-ha-pgpool'
    postgresql:
      existingSecret: 'postgresql-ha-postgresql'
      resources:
        requests:
          cpu: 1000m
          memory: 2048Mi
        limits:
          cpu: 2000m
          memory: 4096Mi
    persistence:
      size: 50Gi
    pgpool:
      existingSecret: 'postgresql-ha-pgpool'
```
Hi @viktym,
I’ve tested this recently by adding a new community node via the Web UI, and I can confirm that it appears on both the main and worker instances when I check ls /home/node/.n8n/nodes/node_modules. Here’s my values.yaml configuration for reference:
```yaml
image:
  repository: example.registry.io/n8n
  tag: "1.95.2"
  pullPolicy: Always

ingress:
  enabled: true
  className: nginx
  annotations:
    cert-manager.io/cluster-issuer: "letsencrypt-prod"
  hosts:
    - host: "n8n.example.com"
      paths:
        - /
  tls:
    - hosts:
        - "n8n.example.com"
      secretName: n8n-tls

main:
  command: ["tini", "--", "/docker-entrypoint.sh"]
  config:
    QUEUE_BULL_REDIS_HOST: "n8n-valkey-primary"
    QUEUE_BULL_REDIS_PORT: "6379"
  persistence:
    enabled: true
    type: dynamic
    size: 1Gi
    storageClass: default
  replicaCount: 1
  podAnnotations: &podAnnotations
    vault.security.banzaicloud.io/vault-path: example-vault-path
    vault.security.banzaicloud.io/vault-role: example-vault-role
    vault.security.banzaicloud.io/vault-skip-verify: "true"
  extraEnv: &extraEnv
    NODE_TLS_REJECT_UNAUTHORIZED:
      value: "0"
    N8N_COMMUNITY_PACKAGES_ALLOW_TOOL_USAGE:
      value: "true"
    N8N_REINSTALL_MISSING_PACKAGES:
      value: "true"
    N8N_EXECUTIONS_PRUNE:
      value: "true"
    N8N_EXECUTIONS_PRUNE_MAX_AGE:
      value: "72"
    N8N_DISABLE_PRODUCTION_MAIN_PROCESS:
      value: "true"
    N8N_SMTP_SSL:
      value: "false"
    N8N_EMAIL_MODE:
      value: "smtp"
    N8N_SMTP_HOST:
      value: "REDACTED_SMTP_HOST"
    N8N_SMTP_PORT:
      value: "25"
    N8N_SMTP_SENDER:
      value: "[email protected]"
    AUTH_BASIC_ENABLED:
      value: "true"
    AUTH_BASIC_PASSWORD:
      value: "REDACTED_AUTH_BASIC_PASSWORD"
    AUTH_BASIC_USERNAME:
      value: "REDACTED_AUTH_BASIC_USERNAME"
    DB_POSTGRESDB_HOST:
      value: "REDACTED_DB_HOST"
    DB_POSTGRESDB_PASSWORD:
      value: "REDACTED_DB_PASSWORD"
    DB_POSTGRESDB_USER:
      value: "REDACTED_DB_USER"
    DB_POSTGRESDB_DATABASE:
      value: "REDACTED_DB_NAME"
    DB_POSTGRESDB_PORT:
      value: "5432"
    DB_TYPE:
      value: "postgresdb"
    N8N_ENCRYPTION_KEY:
      value: "REDACTED_ENCRYPTION_KEY"
    WEBHOOK_URL:
      value: "https://n8n.example.com"
    N8N_ENFORCE_SETTINGS_FILE_PERMISSIONS:
      value: "true"
    N8N_RUNNERS_ENABLED:
      value: "true"
    N8N_PORT:
      value: "5678"
    EXECUTIONS_MODE:
      value: "queue"
    QUEUE_BULL_REDIS_HOST:
      value: "n8n-valkey-primary"
    QUEUE_BULL_REDIS_PORT:
      value: "6379"
    OFFLOAD_MANUAL_EXECUTIONS_TO_WORKERS:
      value: "true"
    QUEUE_HEALTH_CHECK_ACTIVE:
      value: "true"
    QUEUE_HEALTH_CHECK_PORT:
      value: "5678"

worker:
  enabled: true
  count: 2
  statefulSet:
    enabled: true
  persistence:
    enabled: true
    size: 1Gi
    storageClass: default
  podAnnotations: *podAnnotations
  extraEnv: *extraEnv
  resources:
    requests:
      cpu: 500m
      memory: 512Mi
    limits:
      cpu: 2000m
      memory: 6Gi

webhook:
  enabled: true
  deploymentStrategy:
    type: "RollingUpdate"
    rollingUpdate:
      maxSurge: "25%"
      maxUnavailable: "25%"
  podAnnotations: *podAnnotations
  extraEnv: *extraEnv
  resources:
    requests:
      cpu: 100m
      memory: 128Mi
    limits:
      cpu: 500m
      memory: 1Gi

valkey:
  enabled: true
  architecture: standalone
  auth:
    enabled: false
  primary:
    persistence:
      enabled: true
      size: 1Gi
```
My setup is pretty straightforward without custom Docker images or init containers. Each pod has its own PVC mounted at /home/node/.n8n, which helps avoid data loss on restarts and keeps instances isolated. I’m not entirely sure why the nodes propagate to workers after UI installation (normally, this isn’t expected in queue mode with separate PVCs), but it works for me. You might want to test this with your setup—perhaps the N8N_REINSTALL_MISSING_PACKAGES: true env variable or the base image (n8n:1.95.2) plays a role. Let me know if you need further assistance!
@Vad1mo can we finally merge this PR? or you still have some concerns about that?
@Vad1mo @RoseSecurity can we nudge this along?
Gentle ping 🫣
I'll do my best to test this change this week. Just a crazy time of the year with vacations. Thanks for your patience and sorry for the delay!
@aleksandrovpa would you kindly rebase one more time? and thank you for this feature 🙌🏼
@Vad1mo & @RoseSecurity, I hope you can find time to review this PR, as it is a much needed feature for anyone who wants to scale n8n components in prod 🙏🏼
Up
I understand that people are busy. But is this ever happening?
Would be great to get this merged, since using a Statefulset is the only viable option for running multiple replicas with persistence without relying on NFS/EFS/etc.