helm-charts
[BUG][OpenSearch Helm 2.0.1] FailedScheduling : N pod has unbound immediate PersistentVolumeClaims
Describe the bug
NAME                                                READY   STATUS            RESTARTS   AGE
pod/test-opensearch-helm-dashboards-7f498c4684-lld2g   1/1     Running           0          12m
pod/test-opensearch-helm-master-0                       0/1     PodInitializing   0          12m
pod/test-opensearch-helm-master-1                       0/1     PodInitializing   0          12m
pod/test-opensearch-helm-master-2                       0/1     PodInitializing   0          12m

NAME                                            TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)             AGE
service/test-opensearch-helm-dashboards         ClusterIP   172.31.193.248   <none>        5601/TCP            12m
service/test-opensearch-helm-master             ClusterIP   172.31.113.251   <none>        9200/TCP,9300/TCP   12m
service/test-opensearch-helm-master-headless    ClusterIP   None             <none>        9200/TCP,9300/TCP   12m

NAME                                              READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/test-opensearch-helm-dashboards   1/1     1            1           12m

NAME                                                         DESIRED   CURRENT   READY   AGE
replicaset.apps/test-opensearch-helm-dashboards-7f498c4684   1         1         1       12m

NAME                                           READY   AGE
statefulset.apps/test-opensearch-helm-master   0/3     12m
As you can see above, the master nodes of the OpenSearch cluster never come up as running pods.
I am suspicious of the number 27 in the scheduler message, because the Kubernetes cluster has exactly 27 nodes.
Each k8s worker node still has enough available CPU and memory.
The events for each master pod (e.g. pod/test-opensearch-helm-master-0) look like this:
Events:
  Type     Reason            Age   From               Message
  ----     ------            ----  ----               -------
  Warning  FailedScheduling  12m   default-scheduler  0/27 nodes are available: 27 pod has unbound immediate PersistentVolumeClaims.
  Warning  FailedScheduling  12m   default-scheduler  0/27 nodes are available: 27 pod has unbound immediate PersistentVolumeClaims.
  Normal   Scheduled         12m   default-scheduler  Successfully assigned test-opensearch-helm/test-opensearch-helm-master-2 to ick8ssrep01w003
  Normal   Pulling           12m   kubelet            Pulling image "docker-repo.xxx.com/hcp-docker/busybox:latest"
  Normal   Pulled            12m   kubelet            Successfully pulled image "docker-repo.xxx.com/hcp-docker/busybox:latest" in 136.527586ms
  Normal   Created           12m   kubelet            Created container fsgroup-volume
  Normal   Started           12m   kubelet            Started container fsgroup-volume
  Normal   Pulled            12m   kubelet            Container image "docker-repo.xxx.com/hcp-docker/opensearchproject/opensearch:2.0.1" already present on machine
  Normal   Created           12m   kubelet            Created container opensearch
  Normal   Started           12m   kubelet            Started container opensearch
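Since the scheduler message points at unbound PVCs, the claims created by the StatefulSet are presumably stuck Pending. As a minimal sketch of how to confirm this (the PVC name follows the <claim-template>-<pod> pattern and may differ in your cluster):

# List the claims created by the StatefulSet's volumeClaimTemplate
kubectl -n test-opensearch-helm get pvc

# The claim's own events usually say why it is not bound (no matching PV, no provisioner, wrong storage class, ...)
kubectl -n test-opensearch-helm describe pvc test-opensearch-helm-master-test-opensearch-helm-master-0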
To Reproduce
Steps to reproduce the behavior: I tried to run OpenSearch and OpenSearch Dashboards using the Helm chart (v2.1.0).
/test-opensearch-helm/namespaces.yaml
apiVersion: v1
kind: Namespace
metadata:
name: test-opensearch-helm
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: xxx-anyuid-hostpath-clusterrole-rolebinding
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: xxx-anyuid-hostpath-psp-clusterrole
subjects:
- apiGroup: rbac.authorization.k8s.io
kind: Group
name: system:serviceaccounts:test-opensearch-helm
/test-opensearch-helm/kustomization.yaml
namespace: test-opensearch-helm
bases:
# - ../../../base/common
- ./opensearch/common
- ./opensearch/master
- ./opensearch-dashboards
resources:
- namespaces.yaml
/test-opensearch-helm/opensearch/master/kustomization.yaml
helmGlobals:
chartHome: ../../../../../base/opensearch/charts
helmCharts:
- name: opensearch-2.1.0
version: 2.1.0
releaseName: test-opensearch-helm
namespace: test-opensearch-helm
valuesFile: values.yaml
# includeCRDs: true
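For reference, overlays that use helmGlobals/helmCharts like this cannot be rendered by kubectl's built-in kustomize; they need the standalone kustomize binary with helm support enabled. A sketch, assuming the overlay directory layout shown above:

# Render the helm chart through kustomize and apply the result
kustomize build --enable-helm test-opensearch-helm | kubectl apply -f -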
/test-opensearch-helm/opensearch/master/values.yaml
---
clusterName: "test-opensearch-helm"
nodeGroup: "master"
# The service that non master groups will try to connect to when joining the cluster
# This should be set to clusterName + "-" + nodeGroup for your master group
masterService: "test-opensearch-helm-master"
# OpenSearch roles that will be applied to this nodeGroup
# These will be set as environment variable "node.roles". E.g. node.roles=master,ingest,data,remote_cluster_client
roles:
- master
- ingest
- data
- remote_cluster_client
# - ml
replicas: 3
majorVersion: "2"
global:
# Set if you want to change the default docker registry, e.g. a private one.
dockerRegistry: ""
opensearchHome: /usr/share/opensearch
# Allows you to add any config files in {{ .Values.opensearchHome }}/config
# such as opensearch.yml and log4j2.properties
config:
# Values must be YAML literal style scalar / YAML multiline string.
# <filename>: |
# <formatted-value(s)>
opensearch.yml: |
cluster.name: test-opensearch-helm
# Bind to all interfaces because we don't know what IP address Docker will assign to us.
network.host: 0.0.0.0
plugins:
security:
ssl:
transport:
pemcert_filepath: /usr/share/opensearch/config/certs/opens.pem
pemkey_filepath: /usr/share/opensearch/config/certs/opens-key.pem
pemtrustedcas_filepath: /usr/share/opensearch/config/certs/root-ca.pem
enforce_hostname_verification: false
http:
enabled: false
pemcert_filepath: /usr/share/opensearch/config/certs/opens.pem
pemkey_filepath: /usr/share/opensearch/config/certs/opens-key.pem
pemtrustedcas_filepath: /usr/share/opensearch/config/certs/root-ca.pem
allow_unsafe_democertificates: true
allow_default_init_securityindex: true
authcz:
admin_dn:
- CN=kirk,OU=client,O=client,L=test,C=de
audit.type: internal_opensearch
enable_snapshot_restore_privilege: true
check_snapshot_restore_write_privileges: true
restapi:
roles_enabled: ["all_access", "security_rest_api_access"]
system_indices:
enabled: true
indices:
[
".opendistro-alerting-config",
".opendistro-alerting-alert*",
".opendistro-anomaly-results*",
".opendistro-anomaly-detector*",
".opendistro-anomaly-checkpoints",
".opendistro-anomaly-detection-state",
".opendistro-reports-*",
".opendistro-notifications-*",
".opendistro-notebooks",
".opendistro-asynchronous-search-response*",
]
# Extra environment variables to append to this nodeGroup
# This will be appended to the current 'env:' key. You can use any of the kubernetes env
# syntax here
extraEnvs:
- name: OPENSEARCH_PASSWORD
valueFrom:
secretKeyRef:
name: opens-credentials
key: password
- name: OPENSEARCH_USERNAME
valueFrom:
secretKeyRef:
name: opens-credentials
key: username
- name: DISABLE_INSTALL_DEMO_CONFIG
value: "true"
# Allows you to load environment variables from kubernetes secret or config map
envFrom: []
# - secretRef:
# name: env-secret
# - configMapRef:
# name: config-map
# A list of secrets and their paths to mount inside the pod
# This is useful for mounting certificates for security and for mounting
# the X-Pack license
secretMounts:
- name: opensearch-cert
secretName: opensearch-cert
path: /usr/share/opensearch/config/certs
defaultMode: 0755
hostAliases: []
# - ip: "127.0.0.1"
# hostnames:
# - "foo.local"
# - "bar.local"
image:
repository: "docker-repo.xxx.com/hcp-docker/opensearchproject/opensearch"
# override image tag, which is .Chart.AppVersion by default
tag: "2.0.1"
pullPolicy: "IfNotPresent"
podAnnotations: {}
# iam.amazonaws.com/role: es-cluster
# additionals labels
labels: {}
opensearchJavaOpts: "-Djava.net.preferIPv4Stack=true -Xms8g -Xmx8g -XX:+UnlockDiagnosticVMOptions -Xlog:gc+heap+coops=info"
resources:
requests:
cpu: "0.1"
memory: "16Gi"
limits:
cpu: "4"
memory: "16Gi"
initResources:
limits:
cpu: "200m"
memory: "50Mi"
requests:
cpu: "200m"
memory: "50Mi"
sidecarResources: {}
networkHost: "0.0.0.0"
rbac:
create: true
serviceAccountAnnotations: {}
serviceAccountName: ""
podSecurityPolicy:
create: true
name: ""
spec:
privileged: true
fsGroup:
rule: RunAsAny
runAsUser:
rule: RunAsAny
seLinux:
rule: RunAsAny
supplementalGroups:
rule: RunAsAny
volumes:
- secret
- configMap
- persistentVolumeClaim
- emptyDir
persistence:
enabled: true
# Set to false to disable the `fsgroup-volume` initContainer that will update permissions on the persistent disk.
enableInitChown: true
# override image, which is busybox by default
image: "docker-repo.xxx.com/hcp-docker/busybox"
# override image tag, which is latest by default
# imageTag:
labels:
# Add default labels for the volumeClaimTemplate of the StatefulSet
enabled: false
# OpenSearch Persistent Volume Storage Class
# If defined, storageClassName: <storageClass>
# If set to "-", storageClassName: "", which disables dynamic provisioning
# If undefined (the default) or set to null, no storageClassName spec is
# set, choosing the default provisioner. (gp2 on AWS, standard on
# GKE, AWS & OpenStack)
#
storageClass: "sc-nfs-app-retain"
accessModes:
- ReadWriteOnce
size: 50Gi
annotations: {}
extraVolumes: []
# - name: extras
# emptyDir: {}
extraVolumeMounts: []
# - name: extras
# mountPath: /usr/share/extras
# readOnly: true
extraContainers: []
# - name: do-something
# image: busybox
# command: ['do', 'something']
extraInitContainers: []
# - name: do-somethings
# image: busybox
# command: ['do', 'something']
# This is the PriorityClass settings as defined in
# https://kubernetes.io/docs/concepts/configuration/pod-priority-preemption/#priorityclass
priorityClassName: ""
# By default this will make sure two pods don't end up on the same node
# Changing this to a region would allow you to spread pods across regions
antiAffinityTopologyKey: "kubernetes.io/hostname"
# Hard means that by default pods will only be scheduled if there are enough nodes for them
# and that they will never end up on the same node. Setting this to soft will do this "best effort"
antiAffinity: "soft"
# This is the node affinity settings as defined in
# https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#node-affinity-beta-feature
nodeAffinity: {}
# This is the pod topology spread constraints
# https://kubernetes.io/docs/concepts/workloads/pods/pod-topology-spread-constraints/
topologySpreadConstraints: []
# The default is to deploy all pods serially. By setting this to parallel all pods are started at
# the same time when bootstrapping the cluster
podManagementPolicy: "Parallel"
# The environment variables injected by service links are not used, but can lead to slow OpenSearch boot times when
# there are many services in the current namespace.
# If you experience slow pod startups you probably want to set this to `false`.
enableServiceLinks: true
protocol: http
httpPort: 9200
transportPort: 9300
service:
labels: {}
labelsHeadless: {}
headless:
annotations: {}
type: ClusterIP
nodePort: ""
annotations: {}
httpPortName: http
transportPortName: transport
loadBalancerIP: ""
loadBalancerSourceRanges: []
externalTrafficPolicy: ""
updateStrategy: RollingUpdate
# This is the max unavailable setting for the pod disruption budget
# The default value of 1 will make sure that kubernetes won't allow more than 1
# of your pods to be unavailable during maintenance
maxUnavailable: 1
podSecurityContext:
fsGroup: 1000
runAsUser: 1000
securityContext:
capabilities:
drop:
- ALL
# readOnlyRootFilesystem: true
runAsNonRoot: true
runAsUser: 1000
securityConfig:
enabled: true
path: "/usr/share/opensearch/plugins/opensearch-security/securityconfig"
actionGroupsSecret:
configSecret:
internalUsersSecret:
rolesSecret:
rolesMappingSecret:
tenantsSecret:
# The following option simplifies securityConfig by using a single secret and
# specifying the config files as keys in the secret instead of creating
# different secrets for each config file.
# Note that this is an alternative to the individual secret configuration
# above and shouldn't be used if the above secrets are used.
config:
# There are multiple ways to define the configuration here:
# * If you define anything under data, the chart will automatically create
# a secret and mount it.
# * If you define securityConfigSecret, the chart will assume this secret is
# created externally and mount it.
# * It is an error to define both data and securityConfigSecret.
securityConfigSecret: ""
dataComplete: true
data: {}
# config.yml: |-
# internal_users.yml: |-
# roles.yml: |-
# roles_mapping.yml: |-
# action_groups.yml: |-
# tenants.yml: |-
# How long to wait for opensearch to stop gracefully
terminationGracePeriod: 120
sysctlVmMaxMapCount: 262144
readinessProbe:
failureThreshold: 3
initialDelaySeconds: 10
periodSeconds: 60
successThreshold: 3
timeoutSeconds: 60
## Use an alternate scheduler.
## ref: https://kubernetes.io/docs/tasks/administer-cluster/configure-multiple-schedulers/
##
schedulerName: ""
imagePullSecrets: []
nodeSelector:
worker: "true"
tolerations: []
# Enabling this will publicly expose your OpenSearch instance.
# Only enable this if you have security enabled on your cluster
ingress:
enabled: true
# For Kubernetes >= 1.18 you should specify the ingress-controller via the field ingressClassName
# See https://kubernetes.io/blog/2020/04/02/improvements-to-the-ingress-api-in-kubernetes-1.18/#specifying-the-class-of-an-ingress
ingressClassName: nginx
annotations: {}
# kubernetes.io/ingress.class: nginx
# kubernetes.io/tls-acme: "true"
path: /
hosts:
- test-opensearch-helm.srep01.xxx.com
tls: []
# - secretName: chart-example-tls
# hosts:
# - chart-example.local
nameOverride: ""
fullnameOverride: ""
masterTerminationFix: false
lifecycle:
# preStop:
# exec:
# command: ["/bin/sh", "-c", "echo Hello from the postStart handler > /usr/share/message"]
# postStart:
# exec:
# command:
# - bash
# - -c
# - |
# #!/bin/bash
# # Add a template to adjust number of shards/replicas
# TEMPLATE_NAME=my_template
# INDEX_PATTERN="logstash-*"
# SHARD_COUNT=8
# REPLICA_COUNT=1
# ES_URL=http://localhost:9200
# while [[ "$(curl -s -o /dev/null -w '%{http_code}\n' $ES_URL)" != "200" ]]; do sleep 1; done
# curl -XPUT "$ES_URL/_template/$TEMPLATE_NAME" -H 'Content-Type: application/json' -d'{"index_patterns":['\""$INDEX_PATTERN"\"'],"settings":{"number_of_shards":'$SHARD_COUNT',"number_of_replicas":'$REPLICA_COUNT'}}'
postStart:
exec:
command:
- bash
- -c
- |
#!/bin/bash
# Add a template to adjust number of shards/replicas
ES_URL=http://admin:admin12~!@localhost:9200
while [[ "$(curl -s -o /dev/null -w '%{http_code}\n' $ES_URL)" != "200" ]]; do sleep 1; done
# _index_template logs-template-app
curl -XPUT "$ES_URL/_index_template/logs-template-app" -H 'Content-Type: application/json' \
-d '{
"index_patterns": [
"app_*",
"sys_*"
],
"data_stream": {
"timestamp_field": {
"name": "logTime"
}
},
"priority": 200,
"template": {
"settings": {
"number_of_shards": 1,
"number_of_replicas": 1
}
}
}'
# _index_policy logs-policy-app
curl -XDELETE "$ES_URL/_plugins/_ism/policies/logs-policy-app"
curl -XPUT "$ES_URL/_plugins/_ism/policies/logs-policy-app" -H 'Content-Type: application/json' \
-d '
{
"policy" : {
"description" : "A app log of the policy",
"default_state" : "hot",
"states" : [
{
"name" : "hot",
"actions" : [
{
"retry" : {
"count" : 3,
"backoff" : "exponential",
"delay" : "1m"
},
"rollover" : {
"min_index_age" : "3m"
}
}
],
"transitions" : [
{
"state_name" : "warm",
"conditions" : {
"min_index_age" : "3m"
}
}
]
},
{
"name" : "warm",
"actions" : [
{
"retry" : {
"count" : 3,
"backoff" : "exponential",
"delay" : "1m"
},
"read_only" : { }
}
],
"transitions" : [
{
"state_name" : "delete",
"conditions" : {
"min_rollover_age" : "3m"
}
}
]
},
{
"name" : "delete",
"actions" : [
{
"retry" : {
"count" : 3,
"backoff" : "exponential",
"delay" : "1m"
},
"delete" : { }
}
],
"transitions" : [ ]
}
],
"ism_template" : [
{
"index_patterns" : [
"app_*",
"sys_*"
],
"priority" : 0
}
]
}
}
'
keystore: []
# To add secrets to the keystore:
# - secretName: opensearch-encryption-key
networkPolicy:
create: false
## Enable creation of NetworkPolicy resources. Only Ingress traffic is filtered for now.
## In order for a Pod to access OpenSearch, it needs to have the following label:
## {{ template "uname" . }}-client: "true"
## Example for default configuration to access HTTP port:
## opensearch-master-http-client: "true"
## Example for default configuration to access transport port:
## opensearch-master-transport-client: "true"
http:
enabled: false
# Deprecated
# please use the above podSecurityContext.fsGroup instead
fsGroup: ""
## Set optimal sysctl's. This requires privilege. Can be disabled if
## the system has already been preconfigured. (Ex: https://www.elastic.co/guide/en/elasticsearch/reference/current/vm-max-map-count.html)
## Also see: https://kubernetes.io/docs/tasks/administer-cluster/sysctl-cluster/
sysctl:
enabled: false
## Enable to add 3rd Party / Custom plugins not offered in the default OpenSearch image.
plugins:
enabled: false
installList: []
# - example-fake-plugin
# -- Array of extra K8s manifests to deploy
extraObjects: []
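Note that the values above reference objects that must already exist in the namespace (the opens-credentials and opensearch-cert secrets) and a pre-existing storage class (sc-nfs-app-retain). A quick sanity check before installing, as a sketch:

# Secrets referenced by extraEnvs and secretMounts
kubectl -n test-opensearch-helm get secret opens-credentials opensearch-cert

# Storage class referenced by persistence.storageClass
kubectl get storageclass sc-nfs-app-retain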
Host/Environment (please complete the following information):
- Helm Version: 2.1.0
- Kubernetes Version: 1.20.7
Additional context
With the above error, are you able to start the cluster using OpenSearch 2.0.1?
@Divyaasm Hi, I deployed with the Helm chart for OpenSearch (version 2.1.0), using the OpenSearch image itself at version 2.0.1, as below:
image:
repository: "docker-repo.xxx.com/hcp-docker/opensearchproject/opensearch"
# override image tag, which is .Chart.AppVersion by default
tag: "2.0.1"
pullPolicy: "IfNotPresent"
When I first wrote this issue, the Kubernetes cluster had 27 nodes.
Warning FailedScheduling 12m default-scheduler 0/27 nodes are available: 27 pod has unbound immediate PersistentVolumeClaims.
Two days ago, a new worker node was added to the K8s cluster, and the scheduler message changed accordingly:
Warning FailedScheduling 9m default-scheduler 0/28 nodes are available: 28 pod has unbound immediate PersistentVolumeClaims.
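Since every node (27, then 28) reports the same reason, the scheduling failure is about the claim rather than the nodes: each node is rejected because the pod's PVC is still unbound. A sketch of how one might dig into the storage side (sc-nfs-app-retain is the class from values.yaml; the last command assumes the NFS provisioner pod has "nfs" in its name):

# Why is the claim unbound? The PVC events are usually explicit about it.
kubectl -n test-opensearch-helm describe pvc

# Does the storage class have a provisioner, and does it bind immediately or wait for a consumer?
kubectl get storageclass sc-nfs-app-retain -o jsonpath='{.provisioner}{"  "}{.volumeBindingMode}{"\n"}'

# Is the NFS provisioner itself running?
kubectl get pods -A | grep -i nfs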