🐛 Operator default listeners seem to be clashing with user-provided ports
What happened?
I used ports 30081 and 30082 in "external" listeners, and the operator complained that the port was already used:
│ manager {"level":"debug","ts":"2024-04-01T21:32:22.924Z","logger":"events","msg":"Helm upgrade failed for release red ││ panda/redpanda with chart [email protected]: failed to create resource: Service \"redpanda-external\" is invalid: spec. ││ ports[4].nodePort: Invalid value: 30082: provided port is already allocated\n\nLast Helm logs:\n\n2024-04-01T21:32:22 ││ .501865617Z: Created a new PodDisruptionBudget called \"redpanda\" in redpanda\n\n2024-04-01T21:32:22.522146873Z: Cre ││ ated a new ServiceAccount called \"id-rpcloud-9m4e2mr0ui3e8a215n4\" in redpanda\n\n2024-04-01T21:32:22.540660454Z: Cr ││ eated a new Secret called \"redpanda-sts-lifecycle\" in redpanda\n\n2024-04-01T21:32:22.557186641Z: Created a new Sec ││ ret called \"redpanda-config-watcher\" in redpanda\n\n2024-04-01T21:32:22.576038741Z: Created a new Secret called \"r ││ edpanda-configurator\" in redpanda\n\n2024-04-01T21:32:22.592036011Z: Created a new Secret called \"redpanda-fs-valid ││ ator\" in redpanda\n\n2024-04-01T21:32:22.610714711Z: Created a new ConfigMap called \"redpanda\" in redpanda\n\n2024 ││ -04-01T21:32:22.626412621Z: Created a new ConfigMap called \"redpanda-rpk\" in redpanda\n\n2024-04-01T21:32:22.648893 ││ 55Z: Created a new Service called \"redpanda\" in redpanda\n\n2024-04-01T21:32:22.778871216Z: warning: Upgrade \"redp ││ anda\" failed: failed to create resource: Service \"redpanda-external\" is invalid: spec.ports[4].nodePort: Invalid v │
│ alue: 30082: provided port is already allocated","type":"Warning","object":{"kind":"HelmRelease","namespace":"redpand │
│ a","name":"redpanda","uid":"9b7006ec-60b7-496b-b21c-0ee3064f8e6d","apiVersion":"helm.toolkit.fluxcd.io/v2beta2","reso │
│ urceVersion":"11709469"},"reason":"UpgradeFailed"} │
│
What did you expect to happen?
If I can provide external ports, I expect the operator to honor them. Any hidden magic is highly undesired.
How can we reproduce it (as minimally and precisely as possible)? Please include the values file.
$ helm get values <redpanda-release-name> -n <redpanda-release-namespace> --all
COMPUTED VALUES:
affinity: {}
auditLogging:
  clientMaxBufferSize: 16777216
  enabled: false
  enabledEventTypes: null
  excludedPrincipals: null
  excludedTopics: null
  listener: internal
  partitions: 12
  queueDrainIntervalMs: 500
  queueMaxBufferSizePerShard: 1048576
  replicationFactor: null
auth:
  sasl:
    enabled: false
    mechanism: SCRAM-SHA-512
    secretRef: redpanda/redpanda-superusers
    users: []
clusterDomain: cluster.local
commonLabels: {}
config:
  cluster:
    cloud_storage_azure_container: 9m4e2mr0ui3e8a215n4g
    cloud_storage_azure_storage_account: testcamilo9
    cloud_storage_credentials_source: azure_aks_oidc_federation
    cloud_storage_enable_remote_read: "true"
    cloud_storage_enable_remote_write: "true"
    cloud_storage_enabled: "false"
    default_topic_replications: "3"
    minimum_topic_replications: "3"
  node:
    crash_loop_limit: 5
  pandaproxy_client: {}
  rpk: {}
  schema_registry_client: {}
  tunable:
    compacted_log_segment_size: 67108864
    group_topic_partitions: 16
    kafka_batch_max_bytes: 1048576
    kafka_connection_rate_limit: 1000
    log_segment_size: 134217728
    log_segment_size_max: 268435456
    log_segment_size_min: 16777216
    max_compacted_log_segment_size: 536870912
    topic_partitions_per_shard: 1000
connectors:
  deployment:
    create: false
  enabled: false
  test:
    create: false
console:
  config: {}
  configmap:
    create: false
  deployment:
    create: false
  enabled: false
  secret:
    create: false
enterprise:
  license: ""
  licenseSecretRef:
    key: license
    name: redpanda-9m4e2mr0ui3e8a215n4g-license
external:
  addresses:
  - $PREFIX_TEMPLATE
  domain: camilo.panda.dev
  enabled: true
  externalDns:
    enabled: true
  prefixTemplate: rp${POD_ORDINAL}-$(echo -n $HOST_IP_ADDRESS | sha256sum | head -c 7)
  service:
    enabled: true
  type: NodePort
fullnameOverride: ""
image:
  pullPolicy: IfNotPresent
  repository: docker.redpanda.com/redpandadata/redpanda
  tag: v23.3.7
imagePullSecrets: []
license_key: ""
license_secret_ref: {}
listeners:
  admin:
    external:
      admin-api:
        advertisedPorts:
        - 30644
        authenticationMethod: sasl
        enabled: false
        port: 30644
        tls:
          cert: letsencrypt
          enabled: true
          requireClientAuth: false
      default:
        advertisedPorts:
        - 31644
        port: 9645
        tls:
          cert: external
    port: 9644
    tls:
      cert: letsencrypt
      enabled: true
      requireClientAuth: false
  http:
    authenticationMethod: http_basic
    enabled: true
    external:
      default:
        advertisedPorts:
        - 30082
        authenticationMethod: null
        port: 8083
        tls:
          cert: external
          requireClientAuth: false
      http-proxy:
        advertisedPorts:
        - 30082
        authenticationMethod: http_basic
        enabled: true
        port: 30082
        tls:
          cert: letsencrypt
          enabled: true
          requireClientAuth: false
    kafkaEndpoint: default
    port: 8082
    prefixTemplate: http-proxy$POD_ORDINAL
    tls:
      cert: letsencrypt
      enabled: true
      requireClientAuth: false
  kafka:
    authenticationMethod: sasl
    external:
      default:
        advertisedPorts:
        - 31092
        authenticationMethod: null
        port: 9094
        tls:
          cert: external
      kafka-api:
        advertisedPorts:
        - 30092
        authenticationMethod: sasl
        enabled: true
        port: 30092
        tls:
          cert: letsencrypt
          requireClientAuth: false
    port: 9092
    prefixTemplate: kafka-api$POD_ORDINAL
    tls:
      cert: letsencrypt
      requireClientAuth: false
  rpc:
    port: 33145
    tls:
      cert: letsencrypt
      requireClientAuth: false
  schemaRegistry:
    authenticationMethod: http_basic
    enabled: true
    external:
      default:
        advertisedPorts:
        - 30081
        authenticationMethod: null
        port: 8084
        tls:
          cert: external
          requireClientAuth: false
      schema-registry:
        advertisedPorts:
        - 30081
        authenticationMethod: http_basic
        enabled: true
        port: 30081
        tls:
          cert: letsencrypt
          requireClientAuth: false
    kafkaEndpoint: default
    port: 8081
    tls:
      cert: letsencrypt
      requireClientAuth: false
logging:
  logLevel: debug
  usageStats:
    clusterId: 9m4e2mr0ui3e8a215n4g
    enabled: true
monitoring:
  enabled: false
  labels: {}
  scrapeInterval: 30s
  tlsConfig: {}
nameOverride: ""
nodeSelector: {}
post_install_job:
  affinity: {}
  enabled: true
post_upgrade_job:
  affinity: {}
  enabled: true
rackAwareness:
  enabled: true
  nodeAnnotation: topology.kubernetes.io/zone
rbac:
  annotations: {}
  enabled: false
resources:
  cpu:
    cores: "8"
  memory:
    container:
      max: 2Gi
      min: 2Gi
serviceAccount:
  annotations:
    azure.workload.identity/client-id: c90db393-857d-41d0-ac0d-0e61271fcaa6
  create: true
  name: id-rpcloud-9m4e2mr0ui3e8a215n4
statefulset:
  additionalRedpandaCmdFlags:
  - --abort-on-seastar-bad-alloc
  - --dump-memory-diagnostics-on-alloc-failure-kind=all
  annotations: {}
  budget:
    maxUnavailable: 1
  extraVolumeMounts: ""
  extraVolumes: ""
  initContainerImage:
    repository: busybox
    tag: latest
  initContainers:
    configurator:
      extraVolumeMounts: ""
      resources: {}
    extraInitContainers: ""
    fsValidator:
      enabled: true
      expectedFS: xfs
      extraVolumeMounts: ""
      resources: {}
    setDataDirOwnership:
      enabled: true
      extraVolumeMounts: ""
      resources: {}
    setTieredStorageCacheDirOwnership:
      extraVolumeMounts: ""
      resources: {}
    tuning:
      extraVolumeMounts: ""
      resources: {}
  livenessProbe:
    failureThreshold: 3
    initialDelaySeconds: 10
    periodSeconds: 10
  nodeSelector:
    cloud.redpanda.com/role: redpanda
  podAffinity: {}
  podAntiAffinity:
    custom: {}
    topologyKey: kubernetes.io/hostname
    type: hard
    weight: 100
  priorityClassName: ""
  readinessProbe:
    failureThreshold: 3
    initialDelaySeconds: 1
    periodSeconds: 10
    successThreshold: 1
  replicas: 3
  securityContext:
    fsGroup: 101
    fsGroupChangePolicy: OnRootMismatch
    runAsUser: 101
  sideCars:
    configWatcher:
      enabled: true
      extraVolumeMounts: ""
      resources: {}
      securityContext: {}
    controllers:
      createRBAC: true
      enabled: false
      healthProbeAddress: :8085
      image:
        repository: docker.redpanda.com/redpandadata/redpanda-operator
        tag: v2.1.10-23.2.18
      metricsAddress: :9082
      resources: {}
      run:
      - all
      securityContext: {}
  startupProbe:
    failureThreshold: 120
    initialDelaySeconds: 1
    periodSeconds: 10
  terminationGracePeriodSeconds: 90
  tolerations:
  - effect: NoSchedule
    key: cloud.redpanda.com/role
    operator: Equal
    value: redpanda
  topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: topology.kubernetes.io/zone
    whenUnsatisfiable: ScheduleAnyway
  updateStrategy:
    type: RollingUpdate
storage:
  hostPath: ""
  persistentVolume:
    annotations: {}
    enabled: true
    labels: {}
    nameOverwrite: ""
    size: 4096Gi
    storageClass: local-path
  tiered:
    config:
      cloud_storage_access_key: ""
      cloud_storage_api_endpoint: ""
      cloud_storage_azure_container: null
      cloud_storage_azure_shared_key: null
      cloud_storage_azure_storage_account: null
      cloud_storage_bucket: ""
      cloud_storage_cache_size: 5368709120
      cloud_storage_credentials_source: config_file
      cloud_storage_enable_remote_read: true
      cloud_storage_enable_remote_write: true
      cloud_storage_enabled: false
      cloud_storage_region: ""
      cloud_storage_secret_key: ""
    credentialsSecretRef:
      accessKey:
        configurationKey: cloud_storage_access_key
      secretKey:
        configurationKey: cloud_storage_secret_key
    hostPath: ""
    mountType: persistentVolume
    persistentVolume:
      annotations: {}
      labels: {}
      storageClass: local-path
tests:
  enabled: true
tls:
  certs:
    default:
      caEnabled: true
    external:
      caEnabled: true
    letsencrypt:
      caEnabled: false
      duration: 43800h0m0s
      issuerRef:
        kind: ClusterIssuer
        name: letsencrypt-dns-prod
  enabled: true
tolerations: []
tuning:
  tune_aio_events: true
Anything else we need to know?
No response
Which are the affected charts?
Operator
Chart Version(s)
$ helm -n <redpanda-release-namespace> list
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
redpanda-operator redpanda 2 2024-04-01 16:29:38.92053 -0400 EDT deployed operator-0.4.20 v2.1.15-23.3.7
Cloud provider
JIRA Link: K8S-129
It's not the operator's nor the helm chart's responsibility to handle node port conflicts.
Please attach kubectl get svc -A -o yaml output to this issue. I wonder if a leftover Redpanda helm release is still in your cluster.
Please also share the Redpanda resource spec, to help solve this issue.
I've re-wrapped the error messages from Camilo:
{
  "level": "debug",
  "ts": "2024-04-01T21:32:22.924Z",
  "logger": "events",
  "msg": "Helm upgrade failed for release redpanda/redpanda with chart redpanda@5.7.37: failed to create resource: Service \"redpanda-external\" is invalid: spec.ports[4].nodePort: Invalid value: 30082: provided port is already allocated\n\nLast Helm logs:\n\n2024-04-01T21:32:22.501865617Z: Created a new PodDisruptionBudget called \"redpanda\" in redpanda\n\n2024-04-01T21:32:22.522146873Z: Created a new ServiceAccount called \"id-rpcloud-9m4e2mr0ui3e8a215n4\" in redpanda\n\n2024-04-01T21:32:22.540660454Z: Created a new Secret called \"redpanda-sts-lifecycle\" in redpanda\n\n2024-04-01T21:32:22.557186641Z: Created a new Secret called \"redpanda-config-watcher\" in redpanda\n\n2024-04-01T21:32:22.576038741Z: Created a new Secret called \"redpanda-configurator\" in redpanda\n\n2024-04-01T21:32:22.592036011Z: Created a new Secret called \"redpanda-fs-validator\" in redpanda\n\n2024-04-01T21:32:22.610714711Z: Created a new ConfigMap called \"redpanda\" in redpanda\n\n2024-04-01T21:32:22.626412621Z: Created a new ConfigMap called \"redpanda-rpk\" in redpanda\n\n2024-04-01T21:32:22.64889355Z: Created a new Service called \"redpanda\" in redpanda\n\n2024-04-01T21:32:22.778871216Z: warning: Upgrade \"redpanda\" failed: failed to create resource: Service \"redpanda-external\" is invalid: spec.ports[4].nodePort: Invalid value: 30082: provided port is already allocated",
  "type": "Warning",
  "object": {
    "kind": "HelmRelease",
    "namespace": "redpanda",
    "name": "redpanda",
    "uid": "9b7006ec-60b7-496b-b21c-0ee3064f8e6d",
    "apiVersion": "helm.toolkit.fluxcd.io/v2beta2",
    "resourceVersion": "11709469"
  },
  "reason": "UpgradeFailed"
}
Helm upgrade failed for release redpanda/redpanda with chart redpanda@5.7.37: failed to create resource: Service "redpanda-external" is invalid: spec.ports[4].nodePort: Invalid value: 30082: provided port is already allocated
Last Helm logs:
2024-04-01T21:32:22.501865617Z: Created a new PodDisruptionBudget called "redpanda" in redpanda
2024-04-01T21:32:22.522146873Z: Created a new ServiceAccount called "id-rpcloud-9m4e2mr0ui3e8a215n4" in redpanda
2024-04-01T21:32:22.540660454Z: Created a new Secret called "redpanda-sts-lifecycle" in redpanda
2024-04-01T21:32:22.557186641Z: Created a new Secret called "redpanda-config-watcher" in redpanda
2024-04-01T21:32:22.576038741Z: Created a new Secret called "redpanda-configurator" in redpanda
2024-04-01T21:32:22.592036011Z: Created a new Secret called "redpanda-fs-validator" in redpanda
2024-04-01T21:32:22.610714711Z: Created a new ConfigMap called "redpanda" in redpanda
2024-04-01T21:32:22.626412621Z: Created a new ConfigMap called "redpanda-rpk" in redpanda
2024-04-01T21:32:22.64889355Z: Created a new Service called "redpanda" in redpanda
2024-04-01T21:32:22.778871216Z: warning: Upgrade "redpanda" failed: failed to create resource: Service "redpanda-external" is invalid: spec.ports[4].nodePort: Invalid value: 30082: provided port is already allocated
Interesting, I am seeing the same behavior
helm upgrade --install redpanda charts/redpanda -n redpanda --create-namespace --values 1124.yaml
Release "redpanda" does not exist. Installing it now.
Error: 1 error occurred:
* Service "redpanda-external" is invalid: spec.ports[4].nodePort: Invalid value: 30082: provided port is already allocated
Templating this shows the following:
# Source: redpanda/templates/services.nodeport.yaml
apiVersion: v1
kind: Service
metadata:
  name: redpanda-external
  namespace: "redpanda"
  labels:
    app.kubernetes.io/component: redpanda
    app.kubernetes.io/instance: redpanda
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: redpanda
    helm.sh/chart: redpanda-5.7.37
spec:
  type: NodePort
  publishNotReadyAddresses: true
  externalTrafficPolicy: Local
  sessionAffinity: None
  ports:
    - name: admin-default
      protocol: TCP
      port: 9645
      nodePort: 31644
    - name: kafka-default
      protocol: TCP
      port: 9094
      nodePort: 31092
    - name: kafka-kafka-api
      protocol: TCP
      port: 30092
      nodePort: 30092
    - name: http-default
      protocol: TCP
      port: 8083
      nodePort: 30082
    - name: http-http-proxy
      protocol: TCP
      port: 30082
      nodePort: 30082
    - name: schema-default
      protocol: TCP
      port: 8084
      nodePort: 30081
    - name: schema-schema-registry
      protocol: TCP
      port: 30081
      nodePort: 30081
  selector:
    app.kubernetes.io/name: redpanda
    app.kubernetes.io/instance: "redpanda"
    app.kubernetes.io/component: redpanda-statefulset
I think the problem is that there are two entries with the same nodePort. When I apply just the above file to a clean installation:
k apply -f a.yaml
The Service "redpanda-external" is invalid: spec.ports[4].nodePort: Invalid value: 30082: provided port is already allocated
so I think this is the problem.
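As a quick sanity check, the rendered manifests can be scanned for duplicate node ports before applying anything; a minimal sketch, assuming yq v4 is installed and reusing the 1124.yaml values file from above:

# Print any nodePort value that appears more than once across rendered NodePort Services
helm template redpanda charts/redpanda --values 1124.yaml \
  | yq 'select(.kind == "Service" and .spec.type == "NodePort") | .spec.ports[].nodePort' \
  | sort | uniq -d

With the values above, this should print 30081 and 30082, the two clashing ports.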
To make this work I made the following changes to your values:
schemaRegistry:
  authenticationMethod: http_basic
  enabled: true
  external:
    default:
      advertisedPorts:
      - 30084
and
http:
  authenticationMethod: http_basic
  enabled: true
  external:
    default:
      advertisedPorts:
      - 30083
      authenticationMethod: null
      port: 8083
I believe this is just an input error and no "magic" on our end. Perhaps the port list is a bit confusing; it's something we have wanted to change for a while now.
@alejandroEsc could we add some validation in that case? This feels like a pretty sharp edge.
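For illustration, such a guard could walk the configured external listeners and abort rendering on a duplicate node port. A minimal sketch of a Helm template helper, assuming the value paths shown in the computed values above; the helper name and file are hypothetical, not the chart's actual code:

{{- /* _validate_nodeports.tpl (hypothetical): fail fast when two external listeners share a node port */ -}}
{{- define "redpanda.validateExternalNodePorts" -}}
{{- $seen := dict -}}
{{- range $group := list .Values.listeners.admin .Values.listeners.kafka .Values.listeners.http .Values.listeners.schemaRegistry -}}
  {{- range $name, $listener := ($group.external | default dict) -}}
    {{- range $port := ($listener.advertisedPorts | default list) -}}
      {{- $key := toString $port -}}
      {{- if hasKey $seen $key -}}
        {{- fail (printf "duplicate external node port %s: listener %q clashes with %q" $key $name (get $seen $key)) -}}
      {{- end -}}
      {{- $_ := set $seen $key $name -}}
    {{- end -}}
  {{- end -}}
{{- end -}}
{{- end -}}

Included from an always-rendered template, a check along these lines would fail on 30082 (http default vs. http-proxy) and 30081 (schemaRegistry default vs. schema-registry) with a readable message at template time, instead of letting the API server reject the Service.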
I'm not sure what we agreed to for this ticket. If the idea is just to shut off service creation, allowing this values file to be written out to the internal redpanda.yaml (even though the external configuration is not correct for k8s), then you can proceed with:
# -- Service allows you to manage the creation of an external kubernetes service object
service:
  # -- Enabled if set to false will not create the external service type
  # You can still set your cluster with external access but not create the supporting service (NodePort/LoadBalancer).
  # Set this to false if you'd rather manage your own service.
  enabled: false
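Concretely, that override in a user-supplied values file would look like this (the file name is hypothetical):

# values-no-external-svc.yaml: keep the external listener config, but skip creating the NodePort Service
external:
  service:
    enabled: false

applied with something like helm upgrade --install redpanda charts/redpanda -n redpanda --values 1124.yaml --values values-no-external-svc.yaml.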
If that is the case, then we can close this ticket; otherwise we can help with documentation. I am not convinced that additional validation would help this situation.
If you let me disable the operator's default listeners, I'll be on my way. I don't need them, but I need the ports to keep them aligned with AWS's and GCP's.
Let's talk and see if we can figure out what you require. I'm not sure we can disable listeners; I've never tried.
@c4milo if you have problems configuring node ports, please re-open this issue.