compliantkubernetes-kubespray
Create release Compliant Kubernetes Kubespray v2.24.1-ck8s3
Overview
[!note] Whenever you need to change access from operator admin to [email protected], prefer to re-login by clearing the ~/.kube/cache/oidc-login cache instead of impersonation [email protected].
- Pre-QA steps
- Install QA steps
- Upgrade QA steps
- Post-QA steps
- Release steps
# Pre-QA steps
- [x] Complete the feature freeze step
- [x] Complete the staging step
- [x] Complete all pre-QA steps in the internal checklist
# Install QA steps
Kubespray install scenario
Infrastructure provider
- [ ] Elastx
- [ ] Safespring
- [x] UpCloud
Configuration
- [x] Flavor - Prod
- [ ] Dex IdP - Google
- [ ] Dex Static User - Enabled and [email protected] added as an application developer

  Commands

      # configure
      yq4 -i '.grafana.user.oidc.allowedDomains += ["example.com"]' "${CK8S_CONFIG_PATH}/sc-config.yaml"
      yq4 -i 'with(.opensearch.extraRoleMappings[]; with(select(.mapping_name != "all_access"); .definition.users += ["[email protected]"]))' "${CK8S_CONFIG_PATH}/sc-config.yaml"
      yq4 -i '.user.adminUsers += ["[email protected]"]' "${CK8S_CONFIG_PATH}/wc-config.yaml"
      yq4 -i '.dex.enableStaticLogin = true' "${CK8S_CONFIG_PATH}/sc-config.yaml"

      pushd ~/path/to/apps/
      # apply
      ./bin/ck8s apply sc
      ./bin/ck8s apply wc
      popd
- [ ] Set the environment variable NAMESPACE to an application developer namespace (this cannot be a subnamespace)
- [ ] Set the environment variable DOMAIN to the environment domain
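Most commands in this checklist read NAMESPACE and DOMAIN from the environment. A minimal sketch for setting and checking them before starting; the values `demo` and `example.com` and the helper `require_env` are hypothetical placeholders, not part of the release tooling:

```shell
# Hypothetical placeholder values; replace with the real namespace and domain.
export NAMESPACE="demo"
export DOMAIN="example.com"

# require_env (hypothetical helper): return 1 if any named variable is unset or empty.
require_env() {
  for name in "$@"; do
    # Look up the variable whose name is stored in $name.
    eval "value=\${${name}:-}"
    if [ -z "${value}" ]; then
      echo "missing required variable: ${name}" >&2
      return 1
    fi
  done
}

require_env NAMESPACE DOMAIN && echo "environment ok"
```

Running the check once per shell session helps catch an empty variable before it silently expands to an empty string inside a `kubectl` or `helm` command.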
Automated tests
[!note] As platform administrator
- [ ] Successful ./bin/ck8s test sc|wc
- [ ] From tests/ successful make build-main
- [ ] From tests/ successful make ctr-run-end-to-end
Kubernetes access
[!note] As platform administrator
- [ ] Can login as platform administrator via Dex with IdP
[!note] As application developer [email protected]
- [ ] Can login as application developer [email protected] via Dex with static user
- [ ] Can list access

      kubectl -n "${NAMESPACE}" auth can-i --list
- [ ] Can delegate admin access

      $ kubectl -n "${NAMESPACE}" edit rolebinding extra-workload-admins
      # Add some subject
      subjects:
      # You can specify more than one "subject"
      - kind: User
        name: jane # "name" is case sensitive
        apiGroup: rbac.authorization.k8s.io
- [ ] Can delegate view access

      $ kubectl edit clusterrolebinding extra-user-view
      # Add some subject
      subjects:
      # You can specify more than one "subject"
      - kind: User
        name: jane # "name" is case sensitive
        apiGroup: rbac.authorization.k8s.io
- [ ] Cannot run with root by default

      kubectl apply -n "${NAMESPACE}" -f - <<EOF
      ---
      apiVersion: networking.k8s.io/v1
      kind: NetworkPolicy
      metadata:
        name: allow-root-nginx
      spec:
        podSelector:
          matchLabels:
            app: root-nginx
        policyTypes:
          - Ingress
          - Egress
        ingress:
          - {}
        egress:
          - {}
      ---
      apiVersion: v1
      kind: Pod
      metadata:
        labels:
          app: root-nginx
        name: root-nginx
      spec:
        restartPolicy: Never
        containers:
          - name: nginx
            image: nginx:stable
            resources:
              requests:
                memory: 64Mi
                cpu: 250m
              limits:
                memory: 128Mi
                cpu: 500m
      EOF
Hierarchical Namespaces
[!note] As application developer [email protected]
- [ ] Can create a subnamespace by following the application developer docs

  Commands

      kubectl apply -n "${NAMESPACE}" -f - <<EOF
      apiVersion: hnc.x-k8s.io/v1alpha2
      kind: SubnamespaceAnchor
      metadata:
        name: ${NAMESPACE}-qa-test
      EOF

      kubectl get ns "${NAMESPACE}-qa-test"

      kubectl get subns -n "${NAMESPACE}" "${NAMESPACE}-qa-test" -o yaml
- [ ] Ensure the default roles, rolebindings, and networkpolicies propagated

  Commands

      kubectl get role,rolebinding,netpol -n "${NAMESPACE}"
      kubectl get role,rolebinding,netpol -n "${NAMESPACE}-qa-test"
Harbor
[!note] As application developer [email protected]
- [ ] Can login as application developer via Dex with static user

  Steps

  - Login to Harbor with [email protected]

        xdg-open "https://harbor.${DOMAIN}"
  - Login to Harbor with the admin user and promote [email protected] to admin
  - Re-login with [email protected]
- [ ] Can create projects and push images by following the application developer docs
- [ ] Can configure image pull secret by following the application developer docs
- [ ] Can scan image for vulnerabilities
- [ ] Configure project to disallow vulnerabilities
  - Try to pull image with vulnerabilities, should fail

        docker pull "harbor.${DOMAIN}/${REGISTRY_PROJECT}/ck8s-user-demo:${TAG}"
- [ ] Configure project to allow vulnerabilities
  - Try to pull image with vulnerabilities, should succeed

        docker pull "harbor.${DOMAIN}/${REGISTRY_PROJECT}/ck8s-user-demo:${TAG}"
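The Harbor vulnerability checks expect opposite outcomes from the same pull command. A small sketch of a wrapper that turns an expected failure into a passing check; `expect_fail` is a hypothetical helper name, not part of Harbor or Docker:

```shell
# expect_fail (hypothetical helper): succeed only if the wrapped command fails.
expect_fail() {
  if "$@"; then
    echo "unexpected success: $*" >&2
    return 1
  fi
  echo "failed as expected: $*"
}

# Usage with the pull command from the checklist (needs a prior Harbor login):
# expect_fail docker pull "harbor.${DOMAIN}/${REGISTRY_PROJECT}/ck8s-user-demo:${TAG}"
```

The wrapper only inspects the exit status, so it cannot tell a pull blocked by the vulnerability policy from one that failed for another reason such as a missing login; the command output still needs a quick read.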
Gatekeeper
[!note] As application developer [email protected]
- [ ] Can list OPA rules

      kubectl get constraints

[!note] Using the user demo helm chart
Set NAMESPACE to an application developer namespace.
Set PUBLIC_DOCS_PATH to the path of the public docs repo.
- [ ] With invalid image repository, try to deploy, should warn due to constraint

      helm -n "${NAMESPACE}" upgrade --atomic --install demo "${PUBLIC_DOCS_PATH}/user-demo/deploy/ck8s-user-demo" \
        --set image.repository="${REGISTRY_PROJECT}/ck8s-user-demo" \
        --set image.tag="${TAG}" \
        --set ingress.hostname="demoapp.${DOMAIN}"
- [ ] With invalid image tag, try to deploy, should fail due to constraint

      helm -n "${NAMESPACE}" upgrade --atomic --install demo "${PUBLIC_DOCS_PATH}/user-demo/deploy/ck8s-user-demo" \
        --set image.repository="harbor.${DOMAIN}/${REGISTRY_PROJECT}/ck8s-user-demo" \
        --set image.tag=latest \
        --set ingress.hostname="demoapp.${DOMAIN}"
- [ ] With unset networkpolicies, try to deploy, should warn due to constraint

      helm -n "${NAMESPACE}" upgrade --atomic --install demo "${PUBLIC_DOCS_PATH}/user-demo/deploy/ck8s-user-demo" \
        --set image.repository="harbor.${DOMAIN}/${REGISTRY_PROJECT}/ck8s-user-demo" \
        --set image.tag="${TAG}" \
        --set ingress.hostname="demoapp.${DOMAIN}" \
        --set networkPolicy.enabled=false
- [ ] With unset resources, try to deploy, should fail due to constraint

      helm -n "${NAMESPACE}" upgrade --atomic --install demo "${PUBLIC_DOCS_PATH}/user-demo/deploy/ck8s-user-demo" \
        --set image.repository="harbor.${DOMAIN}/${REGISTRY_PROJECT}/ck8s-user-demo" \
        --set image.tag="${TAG}" \
        --set ingress.hostname="demoapp.${DOMAIN}" \
        --set resources.requests=null
- [ ] With valid values, try to deploy, should succeed

      helm -n "${NAMESPACE}" upgrade --atomic --install demo "${PUBLIC_DOCS_PATH}/user-demo/deploy/ck8s-user-demo" \
        --set image.repository="harbor.${DOMAIN}/${REGISTRY_PROJECT}/ck8s-user-demo" \
        --set image.tag="${TAG}" \
        --set ingress.hostname="demoapp.${DOMAIN}"
cert-manager and ingress-nginx
[!note] As platform administrator
- [ ] All certificates ready including user demo
- [ ] All ingresses ready including user demo
- [ ] Endpoints are reachable
- [ ] Status includes correct IP addresses
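For the certificate check, one way to list only the certificates that are not ready is to filter the `kubectl` table output. This sketch assumes cert-manager's default table columns, where READY is the third column when listing across all namespaces; `not_ready` is a hypothetical helper:

```shell
# not_ready (hypothetical helper): print rows whose third column is not "True".
# Assumes input rows shaped like: NAMESPACE NAME READY SECRET AGE
not_ready() {
  awk '$3 != "True"'
}

# Usage against a live cluster:
# kubectl get certificates -A --no-headers | not_ready
```

An empty result suggests that every certificate reports ready; any printed row names a certificate worth inspecting with `kubectl describe`.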
Metrics
[!note] As platform administrator
- [ ] Can login to platform administrator Grafana via Dex with IdP
- [ ] Dashboards are available and viewable
- [ ] Metrics are available from all clusters
[!note] As application developer [email protected]
- [ ] Can login to application developer Grafana via Dex with static user

  Steps

  - Login to Grafana with [email protected]

        xdg-open "https://grafana.${DOMAIN}"
  - Login to Grafana with the admin user and promote [email protected] to admin
  - Re-login with [email protected]
- [ ] Welcome dashboard presented first
- [ ] Dashboards are available and viewable
- [ ] Metrics are available from all clusters
- [ ] Metrics are available from user demo application
Alerts
[!note] As platform administrator
- [ ] No alert open except Watchdog, CPUThrottlingHigh and FalcoAlert
  - Can be seen in the alert section in platform administrator Grafana
[!note] As application developer [email protected]
- [ ] Access Prometheus following the application developer docs
- [ ] Prometheus picked up user demo ServiceMonitor and PrometheusRule
- [ ] Access Alertmanager following the application developer docs
- [ ] Alertmanager Watchdog firing
Logs
[!note] As platform administrator
- [ ] Can login to OpenSearch Dashboards via Dex with IdP
- [ ] Indices created (authlog, kubeaudit, kubernetes, other)
- [ ] Indices managed (authlog, kubeaudit, kubernetes, other)
- [ ] Logs available (authlog, kubeaudit, kubernetes, other)
- [ ] Snapshots configured
[!note] As application developer [email protected]
- [ ] Can login to OpenSearch Dashboards via Dex with static user
- [ ] Welcome dashboard presented first
- [ ] Logs available (kubeaudit, kubernetes)
- [ ] CISO dashboards available and working
Falco
[!note] As platform administrator
- [ ] Deploy the falcosecurity/event-generator to generate events in wc

  Commands

      # Install
      kubectl create namespace event-generator
      kubectl label namespace event-generator owner=operator
      helm repo add falcosecurity https://falcosecurity.github.io/charts
      helm repo update

      helm -n event-generator install event-generator falcosecurity/event-generator \
        --set securityContext.runAsNonRoot=true \
        --set securityContext.runAsGroup=65534 \
        --set securityContext.runAsUser=65534 \
        --set podSecurityContext.fsGroup=65534 \
        --set config.actions=""

      # Uninstall
      helm -n event-generator uninstall event-generator
      kubectl delete namespace event-generator
- [ ] Logs are available in OpenSearch Dashboards
- [ ] Logs are relevant
Network policies
- [ ] No dropped packets in NetworkPolicy Grafana dashboard
Infrastructure tests
- [ ] Able to run terraform plan without changes
- [ ] Able to add nodes without issues
- [ ] Able to remove nodes without issues
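The "without changes" check can be read from the exit status when `terraform plan` is run with `-detailed-exitcode`: 0 means the plan succeeded with an empty diff, 2 means it succeeded with pending changes, and 1 means an error. A minimal sketch; `describe_plan_exit` is a hypothetical helper name:

```shell
# describe_plan_exit (hypothetical helper): translate the exit status of
# `terraform plan -detailed-exitcode` into a short message.
describe_plan_exit() {
  case "$1" in
    0) echo "no changes" ;;
    2) echo "changes pending" ;;
    *) echo "plan failed" ;;
  esac
}

# Usage from the Terraform working directory:
# terraform plan -detailed-exitcode >/dev/null; describe_plan_exit "$?"
```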
# Upgrade QA steps
Kubespray upgrade scenario
[!note] The upgrade is done as part of the checklist.
Infrastructure provider
- [ ] Elastx
- [ ] Safespring
- [x] UpCloud
Configuration
- [x] Flavor - Prod
- [ ] Dex IdP - Google
- [ ] Dex Static User - Enabled and [email protected] added as an application developer

  Commands

      # configure
      yq4 -i '.grafana.user.oidc.allowedDomains += ["example.com"]' "${CK8S_CONFIG_PATH}/sc-config.yaml"
      yq4 -i 'with(.opensearch.extraRoleMappings[]; with(select(.mapping_name != "all_access"); .definition.users += ["[email protected]"]))' "${CK8S_CONFIG_PATH}/sc-config.yaml"
      yq4 -i '.user.adminUsers += ["[email protected]"]' "${CK8S_CONFIG_PATH}/wc-config.yaml"
      yq4 -i '.dex.enableStaticLogin = true' "${CK8S_CONFIG_PATH}/sc-config.yaml"

      pushd ~/path/to/apps/
      # apply
      ./bin/ck8s apply sc
      ./bin/ck8s apply wc
      popd
- [ ] Set the environment variable NAMESPACE to an application developer namespace (this cannot be a subnamespace)
- [ ] Set the environment variable DOMAIN to the environment domain
Upgrade
- [ ] Can upgrade according to the migration docs for this version
Automated tests
[!note] As platform administrator
- [ ] Successful ./bin/ck8s test sc|wc
- [ ] From tests/ successful make build-main
- [ ] From tests/ successful make ctr-run-end-to-end
Kubernetes access
[!note] As platform administrator
- [ ] Can login as platform administrator via Dex with IdP
[!note] As application developer [email protected]
- [ ] Can login as application developer [email protected] via Dex with static user
- [ ] Can list access

      kubectl -n "${NAMESPACE}" auth can-i --list
- [ ] Can delegate admin access

      $ kubectl -n "${NAMESPACE}" edit rolebinding extra-workload-admins
      # Add some subject
      subjects:
      # You can specify more than one "subject"
      - kind: User
        name: jane # "name" is case sensitive
        apiGroup: rbac.authorization.k8s.io
- [ ] Can delegate view access

      $ kubectl edit clusterrolebinding extra-user-view
      # Add some subject
      subjects:
      # You can specify more than one "subject"
      - kind: User
        name: jane # "name" is case sensitive
        apiGroup: rbac.authorization.k8s.io
- [ ] Cannot run with root by default

      kubectl apply -n "${NAMESPACE}" -f - <<EOF
      ---
      apiVersion: networking.k8s.io/v1
      kind: NetworkPolicy
      metadata:
        name: allow-root-nginx
      spec:
        podSelector:
          matchLabels:
            app: root-nginx
        policyTypes:
          - Ingress
          - Egress
        ingress:
          - {}
        egress:
          - {}
      ---
      apiVersion: v1
      kind: Pod
      metadata:
        labels:
          app: root-nginx
        name: root-nginx
      spec:
        restartPolicy: Never
        containers:
          - name: nginx
            image: nginx:stable
            resources:
              requests:
                memory: 64Mi
                cpu: 250m
              limits:
                memory: 128Mi
                cpu: 500m
      EOF
Hierarchical Namespaces
[!note] As application developer [email protected]
- [ ] Can create a subnamespace by following the application developer docs

  Commands

      kubectl apply -n "${NAMESPACE}" -f - <<EOF
      apiVersion: hnc.x-k8s.io/v1alpha2
      kind: SubnamespaceAnchor
      metadata:
        name: ${NAMESPACE}-qa-test
      EOF

      kubectl get ns "${NAMESPACE}-qa-test"

      kubectl get subns -n "${NAMESPACE}" "${NAMESPACE}-qa-test" -o yaml
- [ ] Ensure the default roles, rolebindings, and networkpolicies propagated

  Commands

      kubectl get role,rolebinding,netpol -n "${NAMESPACE}"
      kubectl get role,rolebinding,netpol -n "${NAMESPACE}-qa-test"
Harbor
[!note] As application developer [email protected]
- [ ] Can login as application developer via Dex with static user

  Steps

  - Login to Harbor with [email protected]

        xdg-open "https://harbor.${DOMAIN}"
  - Login to Harbor with the admin user and promote [email protected] to admin
  - Re-login with [email protected]
- [ ] Can create projects and push images by following the application developer docs
- [ ] Can configure image pull secret by following the application developer docs
- [ ] Can scan image for vulnerabilities
- [ ] Configure project to disallow vulnerabilities
  - Try to pull image with vulnerabilities, should fail

        docker pull "harbor.${DOMAIN}/${REGISTRY_PROJECT}/ck8s-user-demo:${TAG}"
- [ ] Configure project to allow vulnerabilities
  - Try to pull image with vulnerabilities, should succeed

        docker pull "harbor.${DOMAIN}/${REGISTRY_PROJECT}/ck8s-user-demo:${TAG}"
Gatekeeper
[!note] As application developer [email protected]
- [ ] Can list OPA rules

      kubectl get constraints

[!note] Using the user demo helm chart
Set NAMESPACE to an application developer namespace.
Set PUBLIC_DOCS_PATH to the path of the public docs repo.
- [ ] With invalid image repository, try to deploy, should warn due to constraint

      helm -n "${NAMESPACE}" upgrade --atomic --install demo "${PUBLIC_DOCS_PATH}/user-demo/deploy/ck8s-user-demo" \
        --set image.repository="${REGISTRY_PROJECT}/ck8s-user-demo" \
        --set image.tag="${TAG}" \
        --set ingress.hostname="demoapp.${DOMAIN}"
- [ ] With invalid image tag, try to deploy, should fail due to constraint

      helm -n "${NAMESPACE}" upgrade --atomic --install demo "${PUBLIC_DOCS_PATH}/user-demo/deploy/ck8s-user-demo" \
        --set image.repository="harbor.${DOMAIN}/${REGISTRY_PROJECT}/ck8s-user-demo" \
        --set image.tag=latest \
        --set ingress.hostname="demoapp.${DOMAIN}"
- [ ] With unset networkpolicies, try to deploy, should warn due to constraint

      helm -n "${NAMESPACE}" upgrade --atomic --install demo "${PUBLIC_DOCS_PATH}/user-demo/deploy/ck8s-user-demo" \
        --set image.repository="harbor.${DOMAIN}/${REGISTRY_PROJECT}/ck8s-user-demo" \
        --set image.tag="${TAG}" \
        --set ingress.hostname="demoapp.${DOMAIN}" \
        --set networkPolicy.enabled=false
- [ ] With unset resources, try to deploy, should fail due to constraint

      helm -n "${NAMESPACE}" upgrade --atomic --install demo "${PUBLIC_DOCS_PATH}/user-demo/deploy/ck8s-user-demo" \
        --set image.repository="harbor.${DOMAIN}/${REGISTRY_PROJECT}/ck8s-user-demo" \
        --set image.tag="${TAG}" \
        --set ingress.hostname="demoapp.${DOMAIN}" \
        --set resources.requests=null
- [ ] With valid values, try to deploy, should succeed

      helm -n "${NAMESPACE}" upgrade --atomic --install demo "${PUBLIC_DOCS_PATH}/user-demo/deploy/ck8s-user-demo" \
        --set image.repository="harbor.${DOMAIN}/${REGISTRY_PROJECT}/ck8s-user-demo" \
        --set image.tag="${TAG}" \
        --set ingress.hostname="demoapp.${DOMAIN}"
cert-manager and ingress-nginx
[!note] As platform administrator
- [ ] All certificates ready including user demo
- [ ] All ingresses ready including user demo
- [ ] Endpoints are reachable
- [ ] Status includes correct IP addresses
Metrics
[!note] As platform administrator
- [ ] Can login to platform administrator Grafana via Dex with IdP
- [ ] Dashboards are available and viewable
- [ ] Metrics are available from all clusters
[!note] As application developer [email protected]
- [ ] Can login to application developer Grafana via Dex with static user

  Steps

  - Login to Grafana with [email protected]

        xdg-open "https://grafana.${DOMAIN}"
  - Login to Grafana with the admin user and promote [email protected] to admin
  - Re-login with [email protected]
- [ ] Welcome dashboard presented first
- [ ] Dashboards are available and viewable
- [ ] Metrics are available from all clusters
- [ ] Metrics are available from user demo application
Alerts
[!note] As platform administrator
- [ ] No alert open except Watchdog, CPUThrottlingHigh and FalcoAlert
  - Can be seen in the alert section in platform administrator Grafana
[!note] As application developer [email protected]
- [ ] Access Prometheus following the application developer docs
- [ ] Prometheus picked up user demo ServiceMonitor and PrometheusRule
- [ ] Access Alertmanager following the application developer docs
- [ ] Alertmanager Watchdog firing
Logs
[!note] As platform administrator
- [ ] Can login to OpenSearch Dashboards via Dex with IdP
- [ ] Indices created (authlog, kubeaudit, kubernetes, other)
- [ ] Indices managed (authlog, kubeaudit, kubernetes, other)
- [ ] Logs available (authlog, kubeaudit, kubernetes, other)
- [ ] Snapshots configured
[!note] As application developer [email protected]
- [ ] Can login to OpenSearch Dashboards via Dex with static user
- [ ] Welcome dashboard presented first
- [ ] Logs available (kubeaudit, kubernetes)
- [ ] CISO dashboards available and working
Falco
[!note] As platform administrator
- [ ] Deploy the falcosecurity/event-generator to generate events in wc

  Commands

      # Install
      kubectl create namespace event-generator
      kubectl label namespace event-generator owner=operator
      helm repo add falcosecurity https://falcosecurity.github.io/charts
      helm repo update

      helm -n event-generator install event-generator falcosecurity/event-generator \
        --set securityContext.runAsNonRoot=true \
        --set securityContext.runAsGroup=65534 \
        --set securityContext.runAsUser=65534 \
        --set podSecurityContext.fsGroup=65534 \
        --set config.actions=""

      # Uninstall
      helm -n event-generator uninstall event-generator
      kubectl delete namespace event-generator
- [ ] Logs are available in OpenSearch Dashboards
- [ ] Logs are relevant
Network policies
- [ ] No dropped packets in NetworkPolicy Grafana dashboard
Infrastructure tests
- [ ] Able to run terraform plan without changes
- [ ] Able to add nodes without issues
- [ ] Able to remove nodes without issues
# Post-QA steps
- [x] Complete the code freeze step
- [x] Complete all post-QA steps in the internal checklist
# Release steps
- [x] Complete the release step
- [ ] Complete the update public release notes step
- [ ] Complete the update the main branch step
- [ ] Complete all post release steps in the internal checklist