terraform-aws-eks-jx
Unable to deploy Jenkins-x 3 to EKS using Terraform modules
Summary
I am new to Jenkins-x and am having trouble installing it on EKS using the Terraform approach with AWS Secrets Manager. Any help is really appreciated. I made a dummy commit to our internal Bitbucket cluster repo (jx3-eks-asm) and still see the same errors, I believe because the boot pods are failing as well. The errors from a boot pod and from the health status command are below.
There were a bunch of other issues too, but I was able to resolve them by making local changes to the kuberhealthy and other Helm chart versions. I have verified that terraform, aws-cli, aws-iam-authenticator, and wget are installed, as listed in the prerequisites section. I noticed that none of the health commands work if kuberhealthy is disabled, and there appear to be a number of issues with that Helm chart and its CRD installation. It was using an old version of the chart, so I updated it to 0.0.90 or similar to fix another issue, but now I am seeing this.
Error from one of the boot pods in the jx-git-operator namespace:
error validating "config-root/namespaces/jx/jx-kh-check-health-checks-jx/jx-bot-token-kuberhealthycheck.yaml": error validating data: ValidationError(KuberhealthyCheck.spec.podSpec.containers[0]): unknown field "restartPolicy" in io.github.comcast.v1.KuberhealthyCheck.spec.podSpec.containers; if you choose to ignore these errors, turn validation off with --validate=false
error validating "config-root/namespaces/jx/jx-kh-check-health-checks-jx/jx-webhook-events-kuberhealthycheck.yaml": error validating data: ValidationError(KuberhealthyCheck.spec.podSpec.containers[0]): unknown field "restartPolicy" in io.github.comcast.v1.KuberhealthyCheck.spec.podSpec.containers; if you choose to ignore these errors, turn validation off with --validate=false
error validating "config-root/namespaces/jx/jx-kh-check-health-checks-jx/jx-webhook-kuberhealthycheck.yaml": error validating data: ValidationError(KuberhealthyCheck.spec.podSpec.containers[0]): unknown field "restartPolicy" in io.github.comcast.v1.KuberhealthyCheck.spec.podSpec.containers; if you choose to ignore these errors, turn validation off with --validate=false
make[1]: Leaving directory '/workspace/source'
make[1]: *** [versionStream/src/Makefile.mk:324: kubectl-apply] Error 1
error: failed to regenerate: failed to regenerate phase 1: failed to run 'make regen-phase-1 NEW_CLUSTER=false' command in directory '.', output: ''
make: *** [versionStream/src/Makefile.mk:269: regen-check] Error 1
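The validation errors above all point at a `restartPolicy` key under `containers[0]` in the generated KuberhealthyCheck manifests. A minimal sketch for locating those keys in a checkout of the cluster repo follows; it assumes the generated `config-root` directory is present in the working copy, and the function name `find_restart_policy` is made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: list generated KuberhealthyCheck manifests that contain a
# restartPolicy key, so each hit can be inspected for container-level
# placement. The default directory is taken from the error messages above.

find_restart_policy() {
  local dir="${1:-config-root/namespaces/jx/jx-kh-check-health-checks-jx}"
  # -r: recurse, -n: print line numbers,
  # --include: only look at the *-kuberhealthycheck.yaml manifests
  grep -rn --include='*-kuberhealthycheck.yaml' 'restartPolicy' "$dir"
}
```

Running `find_restart_policy` from the repo root prints `file:line:` for each match, which shows whether the key sits under a container entry or at the pod spec level.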
kubectl version
Client Version: version.Info{Major:"1", Minor:"23+", GitVersion:"v1.23.13-eks-fb459a0", GitCommit:"55bd5d5cb7d32bc35e4e050f536181196fb8c6f7", GitTreeState:"clean", BuildDate:"2022-10-24T20:38:50Z", GoVersion:"go1.17.13", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"24+", GitVersion:"v1.24.13-eks-0a21954", GitCommit:"6305d65c340554ad8b4d7a5f21391c9fa34932cb", GitTreeState:"clean", BuildDate:"2023-04-15T00:33:45Z", GoVersion:"go1.19.8", Compiler:"gc", Platform:"linux/amd64"}
jx health status -A
NAME NAMESPACE STATUS ERROR MESSAGE
daemonset kuberhealthy OK
deployment kuberhealthy OK
dns-status-internal kuberhealthy OK
jx-install jx-git-operator ERROR latest boot job jx-boot-1997a686-37fb-4704-af81-8505a7518877 has been running for more than 30m0s, it could be stuck
latest boot job jx-boot-1997a686-37fb-4704-af81-8505a7518877 has a failed run
jx-pod-status kuberhealthy ERROR pod: jenkins-x-chartmuseum-6cd76fd747-4hwrc in namespace: jx is in pod status phase Pending
pod: jx-build-controller-b6bdc5b6d-ch2xh in namespace: jx is in pod status phase Pending
pod: jx-pipelines-visualizer-7b685f9c79-bdx5k in namespace: jx is in pod status phase Pending
pod: lighthouse-foghorn-cff4fd4f5-r76kr in namespace: jx is in pod status phase Pending
pod: lighthouse-keeper-57dfddb5f9-plfjn in namespace: jx is in pod status phase Pending
pod: lighthouse-webhooks-85f985b664-9wdgx in namespace: jx is in pod status phase Pending
pod: nexus-nexus-776c96f565-vxp42 in namespace: jx is in pod status phase Pending
jx-secrets kuberhealthy ERROR ERROR, Secrets Manager can't find the specified secret.
ERROR, Secrets Manager can't find the specified secret.
ERROR, Secrets Manager can't find the specified secret.
ERROR, Secrets Manager can't find the specified secret.
ERROR, Secrets Manager can't find the specified secret.
ERROR, Secrets Manager can't find the specified secret.
ERROR, Secrets Manager can't find the specified secret.
network-connection-check kuberhealthy OK
pod-restarts kuberhealthy ERROR Found: 2919 `BackOff` events for pod: ingress-nginx-controller-5bc8458b4c-4g8ht in namespace: nginx
Found: 2901 `BackOff` events for pod: ingress-nginx-controller-5bc8458b4c-84pnz in namespace: nginx
Found: 2906 `BackOff` events for pod: ingress-nginx-controller-5bc8458b4c-qxjs2 in namespace: nginx
jx secret verify
SECRET STATUS
jx-production/tekton-container-registry-auth key tekton-container-registry-auth missing properties: .dockerconfigjson
jx-staging/tekton-container-registry-auth key tekton-container-registry-auth missing properties: .dockerconfigjson
jx/jenkins-maven-settings key jx-maven-settings missing properties: settingsXml, securityXml
jx/jenkins-x-chartmuseum valid: jx-admin-user/BASIC_AUTH_PASS, jx-admin-user/BASIC_AUTH_USER
jx/jx-basic-auth-htpasswd key jx-basic-auth-htpasswd missing properties: token
jx/jx-basic-auth-user-password valid: jx-basic-auth-user/password, jx-basic-auth-user/username
jx/lighthouse-oauth-token key lighthouse-oauth missing properties: token
jx/nexus valid: jx-admin-user/password
jx/tekton-container-registry-auth key tekton-container-registry-auth missing properties: .dockerconfigjson
jx/tekton-git key jx-pipeline-user missing properties: token, username
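To see at a glance which secrets still need values, the verify output can be filtered down to the incomplete entries. This is a small sketch that assumes the output format shown above (secret name in the first column, followed by `missing properties:`); the function name `list_incomplete_secrets` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: print the secrets that `jx secret verify` reports as incomplete.
# Reads the verify output on stdin and prints "<namespace/name>: <missing props>".

list_incomplete_secrets() {
  awk '/missing properties:/ {
    name = $1                                   # first column: namespace/secret
    sub(/^.*missing properties:[ ]*/, "")       # keep only the property list
    print name ": " $0
  }'
}
```

Usage: `jx secret verify | list_incomplete_secrets`. For the output above this would list, for example, `jx/tekton-git: token, username`.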
Steps to reproduce the behavior
1. Install Jenkins-x using the latest Terraform module on EKS.
2. Configure the Kubernetes cluster version to 1.24.
3. Use AWS Secrets Manager.
4. Clone jx3-gitops-repositories/jx3-terraform-eks to a local Bitbucket repo and push the changes to the internal remote Bitbucket Cloud repo.
5. Clone the https://github.com/jx3-gitops-repositories/jx3-eks-asm repo and update jx-requirements.yml to point to the Bitbucket Cloud URL and other information as per https://jenkins-x.io/v3/admin/setup/config/git/#bitbucket-cloud
Expected behavior
The jx boot pods and git operator pods start normally.
Actual behavior
The boot job fails, with errors on the health-check (KuberhealthyCheck) resources as shown in the Summary.
Terraform version
terraform version
Terraform v1.4.5
on linux_amd64
main.tf content:
module "eks-jx" {
  source = "jenkins-x/eks-jx/aws"
  version = "1.21.2"
  cluster_version = "1.24"
  cluster_name = "jx3-test"
  region = "us-east-2"
  vault_user = ""
  is_jx2 = false
  jx_git_url = "https://bitbucket.org/myorg/jx3-eks-asm.git"
  jx_bot_username = "bitbucket-repo-user"
  jx_bot_token = "bitbucket-repo-pass"
  force_destroy = true
  nginx_chart_version = "3.12.0"
  install_kuberhealthy = true
  create_eks = true
  create_vpc = true
  create_autoscaler_role = false
  create_cmcainjector_role = false
  enable_reports_storage = true
  enable_repository_storage = true
  use_vault = false
  create_cm_role = false
  create_exdns_role = true
  enable_logs_storage = true
  apex_domain = var.apex_domain
  subdomain = "jx3-test"
  create_and_configure_subdomain = true
  manage_apex_domain = false
  manage_subdomain = true
  enable_external_dns = true
  create_nginx_namespace = true
  create_nginx = true
  vpc_name = "jenkins-x-vpc"
  create_asm_role = true
  use_asm = true
}
Module version
https://registry.terraform.io/modules/jenkins-x/eks-jx/aws/latest
Operating system
Linux