
OLCNE Install Failing because of cri-o service not starting on master node

Open rikumabokss opened this issue 3 years ago • 3 comments

While setting up the OLCNE environment with Istio enabled, the setup fails because the crio service does not start on the master node.

master1: ===== Create the Oracle Linux Cloud Native Environment: olcne-env =====
master1: olcnectl --api-server 127.0.0.1:8091 environment create --environment-name olcne-env --secret-manager-type file --olcne-node-cert-path /etc/olcne/pki/production/node.cert --olcne-ca-path /etc/olcne/pki/production/ca.cert --olcne-node-key-path /etc/olcne/pki/production/node.key --update-config
master1: ===== Create the Kubernetes module for olcne-env =====
master1: olcnectl module create --environment-name olcne-env --selinux enforcing --module kubernetes --name olcne-cluster --container-registry 192.168.99.10:5000/olcne --nginx-image 192.168.99.10:5000/olcne/nginx:1.17.7 --pod-network-iface eth1 --master-nodes 192.168.99.101:8090 --worker-nodes 192.168.99.111:8090,192.168.99.112:8090 --restrict-service-externalip-ca-cert=/etc/olcne/pki-externalip-validation-webhook/production/ca.cert --restrict-service-externalip-tls-cert=/etc/olcne/pki-externalip-validation-webhook/production/node.cert --restrict-service-externalip-tls-key=/etc/olcne/pki-externalip-validation-webhook/production/node.key
master1: ===== Validate all required prerequisites are met for the Kubernetes module =====
master1: olcnectl module validate --environment-name olcne-env --name olcne-cluster
master1: ===== Deploy the Kubernetes module into olcne-env (Be patient!) =====
master1: olcnectl module install --environment-name olcne-env --name olcne-cluster
master1: Returned a non-zero code: 1
master1: Last output lines:
master1: Encountered errors with 192.168.99.101:8090
master1: Job for crio.service failed because the control process exited with error code.
master1: See "systemctl status crio.service" and "journalctl -xe" for details.
master1: See /var/tmp/cmd_IyzTX.log for details
The SSH command responded with a non-zero exit status. Vagrant assumes that this means the command failed. The output for this command should be in the log above. Please read the output to determine what went wrong.

===============================
On master node:
[root@master1 ~]# systemctl start crio.service
Job for crio.service failed because the control process exited with error code.
See "systemctl status crio.service" and "journalctl -xe" for details.
[root@master1 ~]# systemctl status crio.service
● crio.service - Container Runtime Interface for OCI (CRI-O)
   Loaded: loaded (/usr/lib/systemd/system/crio.service; disabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Mon 2021-11-15 05:00:34 UTC; 7s ago
     Docs: https://github.com/cri-o/cri-o
  Process: 14231 ExecStart=/usr/bin/crio $CRIO_CONFIG_OPTIONS $CRIO_RUNTIME_OPTIONS $CRIO_STORAGE_OPTIONS $CRIO_NETWORK_OPTIONS $CRIO_METRICS_OPTIONS>
 Main PID: 14231 (code=exited, status=1/FAILURE)

Nov 15 05:00:34 master1.vagrant.vm crio[14231]: time="2021-11-15 05:00:34.734356924Z" level=info msg="Not using native diff for overlay, this may cau>
Nov 15 05:00:34 master1.vagrant.vm crio[14231]: time="2021-11-15 05:00:34.734461475Z" level=info msg="Using default capabilities: CAP_CHOWN, CAP_DAC_>
Nov 15 05:00:34 master1.vagrant.vm crio[14231]: time="2021-11-15 05:00:34.772486524Z" level=info msg="Using conmon executable: /usr/libexec/crio/conm>
Nov 15 05:00:34 master1.vagrant.vm crio[14231]: time="2021-11-15 05:00:34.775136499Z" level=info msg="Conmon does support the --sync option"
Nov 15 05:00:34 master1.vagrant.vm crio[14231]: time="2021-11-15 05:00:34.775253626Z" level=info msg="No seccomp profile specified, using the interna>
Nov 15 05:00:34 master1.vagrant.vm crio[14231]: time="2021-11-15 05:00:34.775271567Z" level=info msg="AppArmor is disabled by the system or at CRI-O >
Nov 15 05:00:34 master1.vagrant.vm crio[14231]: time="2021-11-15 05:00:34.775300995Z" level=fatal msg="Validating runtime config: cgroupfs manager co>
Nov 15 05:00:34 master1.vagrant.vm systemd[1]: crio.service: Main process exited, code=exited, status=1/FAILURE
Nov 15 05:00:34 master1.vagrant.vm systemd[1]: crio.service: Failed with result 'exit-code'.
Nov 15 05:00:34 master1.vagrant.vm systemd[1]: Failed to start Container Runtime Interface for OCI (CRI-O).

[root@master1 ~]# journalctl -u crio
-- Logs begin at Mon 2021-11-15 04:25:41 UTC, end at Mon 2021-11-15 05:00:34 UTC. --
Nov 15 04:31:53 master1.vagrant.vm systemd[1]: Starting Container Runtime Interface for OCI (CRI-O)...
Nov 15 04:31:53 master1.vagrant.vm crio[13842]: time="2021-11-15T04:31:53Z" level=info msg="Starting CRI-O, version: 1.20.2, git: a03f99eb8ad5ded1294>
Nov 15 04:31:53 master1.vagrant.vm crio[13842]: time="2021-11-15T04:31:53Z" level=warning msg="The 'registries' option in crio.conf(5) (referenced in>
Nov 15 04:31:53 master1.vagrant.vm crio[13842]: time="2021-11-15T04:31:53Z" level=warning msg="Please refer to containers-registries.conf(5) for conf>
Nov 15 04:31:53 master1.vagrant.vm crio[13842]: time="2021-11-15 04:31:53.170508386Z" level=info msg="Node configuration value for hugetlb cgroup is >
Nov 15 04:31:53 master1.vagrant.vm crio[13842]: time="2021-11-15 04:31:53.170524553Z" level=info msg="Node configuration value for pid cgroup is true"
Nov 15 04:31:53 master1.vagrant.vm crio[13842]: time="2021-11-15 04:31:53.170598694Z" level=info msg="Node configuration value for memoryswap cgroup >
Nov 15 04:31:53 master1.vagrant.vm crio[13842]: time="2021-11-15 04:31:53.178414027Z" level=info msg="Node configuration value for systemd CollectMod>
Nov 15 04:31:53 master1.vagrant.vm crio[13842]: time="2021-11-15 04:31:53.179168684Z" level=info msg="Not using native diff for overlay, this may cau>
Nov 15 04:31:53 master1.vagrant.vm crio[13842]: time="2021-11-15 04:31:53.179279456Z" level=info msg="Using default capabilities: CAP_CHOWN, CAP_DAC_>
Nov 15 04:31:53 master1.vagrant.vm crio[13842]: time="2021-11-15 04:31:53.216102619Z" level=info msg="Using conmon executable: /usr/libexec/crio/conm>
Nov 15 04:31:53 master1.vagrant.vm crio[13842]: time="2021-11-15 04:31:53.254235293Z" level=info msg="Conmon does support the --sync option"
Nov 15 04:31:53 master1.vagrant.vm crio[13842]: time="2021-11-15 04:31:53.254447433Z" level=info msg="No seccomp profile specified, using the interna>
Nov 15 04:31:53 master1.vagrant.vm crio[13842]: time="2021-11-15 04:31:53.254462868Z" level=info msg="AppArmor is disabled by the system or at CRI-O >
Nov 15 04:31:53 master1.vagrant.vm crio[13842]: time="2021-11-15 04:31:53.254487341Z" level=fatal msg="Validating runtime config: cgroupfs manager co>
Nov 15 04:31:53 master1.vagrant.vm systemd[1]: crio.service: Main process exited, code=exited, status=1/FAILURE
Nov 15 04:31:53 master1.vagrant.vm systemd[1]: crio.service: Failed with result 'exit-code'.
Nov 15 04:31:53 master1.vagrant.vm systemd[1]: Failed to start Container Runtime Interface for OCI (CRI-O).
Nov 15 04:34:24 master1.vagrant.vm systemd[1]: Starting Container Runtime Interface for OCI (CRI-O)...
Nov 15 04:34:24 master1.vagrant.vm crio[13955]: time="2021-11-15T04:34:24Z" level=info msg="Starting CRI-O, version: 1.20.2, git: a03f99eb8ad5ded1294>
Nov 15 04:34:24 master1.vagrant.vm crio[13955]: time="2021-11-15T04:34:24Z" level=warning msg="The 'registries' option in crio.conf(5) (referenced in>
Nov 15 04:34:24 master1.vagrant.vm crio[13955]: time="2021-11-15T04:34:24Z" level=warning msg="Please refer to containers-registries.conf(5) for conf>
Nov 15 04:34:24 master1.vagrant.vm crio[13955]: time="2021-11-15 04:34:24.816122737Z" level=info msg="Node configuration value for hugetlb cgroup is >
Nov 15 04:34:24 master1.vagrant.vm crio[13955]: time="2021-11-15 04:34:24.816142798Z" level=info msg="Node configuration value for pid cgroup is true"
Nov 15 04:34:24 master1.vagrant.vm crio[13955]: time="2021-11-15 04:34:24.816258479Z" level=info msg="Node configuration value for memoryswap cgroup >
Nov 15 04:34:24 master1.vagrant.vm crio[13955]: time="2021-11-15 04:34:24.823616376Z" level=info msg="Node configuration value for systemd CollectMod>
Nov 15 04:34:24 master1.vagrant.vm crio[13955]: time="2021-11-15 04:34:24.824268655Z" level=info msg="Not using native diff for overlay, this may cau>
Nov 15 04:34:24 master1.vagrant.vm crio[13955]: time="2021-11-15 04:34:24.824499558Z" level=info msg="Using default capabilities: CAP_CHOWN, CAP_DAC_>
Nov 15 04:34:24 master1.vagrant.vm crio[13955]: time="2021-11-15 04:34:24.852709202Z" level=info msg="Using conmon executable: /usr/libexec/crio/conm>
Nov 15 04:34:24 master1.vagrant.vm crio[13955]: time="2021-11-15 04:34:24.855489426Z" level=info msg="Conmon does support the --sync option"
Nov 15 04:34:24 master1.vagrant.vm crio[13955]: time="2021-11-15 04:34:24.855653421Z" level=info msg="No seccomp profile specified, using the interna>
Nov 15 04:34:24 master1.vagrant.vm crio[13955]: time="2021-11-15 04:34:24.855666277Z" level=info msg="AppArmor is disabled by the system or at CRI-O >
Nov 15 04:34:24 master1.vagrant.vm crio[13955]: time="2021-11-15 04:34:24.855695926Z" level=fatal msg="Validating runtime config: cgroupfs manager co>
...skipping...
Nov 15 04:53:50 master1.vagrant.vm crio[14045]: time="2021-11-15T04:53:50Z" level=warning msg="The 'registries' option in crio.conf(5) (referenced in>
Nov 15 04:53:50 master1.vagrant.vm crio[14045]: time="2021-11-15T04:53:50Z" level=warning msg="Please refer to containers-registries.conf(5) for conf>
Nov 15 04:53:50 master1.vagrant.vm crio[14045]: time="2021-11-15 04:53:50.462058305Z" level=info msg="Node configuration value for hugetlb cgroup is >
Nov 15 04:53:50 master1.vagrant.vm crio[14045]: time="2021-11-15 04:53:50.462069407Z" level=info msg="Node configuration value for pid cgroup is true"
Nov 15 04:53:50 master1.vagrant.vm crio[14045]: time="2021-11-15 04:53:50.462102290Z" level=info msg="Node configuration value for memoryswap cgroup >
Nov 15 04:53:50 master1.vagrant.vm crio[14045]: time="2021-11-15 04:53:50.469112039Z" level=info msg="Node configuration value for systemd CollectMod>
Nov 15 04:53:50 master1.vagrant.vm crio[14045]: time="2021-11-15 04:53:50.469586626Z" level=info msg="Not using native diff for overlay, this may cau>
Nov 15 04:53:50 master1.vagrant.vm crio[14045]: time="2021-11-15 04:53:50.469699936Z" level=info msg="Using default capabilities: CAP_CHOWN, CAP_DAC_>
Nov 15 04:53:50 master1.vagrant.vm crio[14045]: time="2021-11-15 04:53:50.496585423Z" level=info msg="Using conmon executable: /usr/libexec/crio/conm>
Nov 15 04:53:50 master1.vagrant.vm crio[14045]: time="2021-11-15 04:53:50.499380977Z" level=info msg="Conmon does support the --sync option"
Nov 15 04:53:50 master1.vagrant.vm crio[14045]: time="2021-11-15 04:53:50.499441191Z" level=info msg="No seccomp profile specified, using the interna>
Nov 15 04:53:50 master1.vagrant.vm crio[14045]: time="2021-11-15 04:53:50.499448482Z" level=info msg="AppArmor is disabled by the system or at CRI-O >
Nov 15 04:53:50 master1.vagrant.vm crio[14045]: time="2021-11-15 04:53:50.499470462Z" level=fatal msg="Validating runtime config: cgroupfs manager co>
Nov 15 04:53:50 master1.vagrant.vm systemd[1]: crio.service: Main process exited, code=exited, status=1/FAILURE
Nov 15 04:53:50 master1.vagrant.vm systemd[1]: crio.service: Failed with result 'exit-code'.
Nov 15 04:53:50 master1.vagrant.vm systemd[1]: Failed to start Container Runtime Interface for OCI (CRI-O).
Nov 15 05:00:34 master1.vagrant.vm systemd[1]: Starting Container Runtime Interface for OCI (CRI-O)...
Nov 15 05:00:34 master1.vagrant.vm crio[14231]: time="2021-11-15T05:00:34Z" level=info msg="Starting CRI-O, version: 1.20.2, git: a03f99eb8ad5ded1294>
Nov 15 05:00:34 master1.vagrant.vm crio[14231]: time="2021-11-15T05:00:34Z" level=warning msg="The 'registries' option in crio.conf(5) (referenced in>
Nov 15 05:00:34 master1.vagrant.vm crio[14231]: time="2021-11-15T05:00:34Z" level=warning msg="Please refer to containers-registries.conf(5) for conf>
Nov 15 05:00:34 master1.vagrant.vm crio[14231]: time="2021-11-15 05:00:34.725954892Z" level=info msg="Node configuration value for hugetlb cgroup is >
Nov 15 05:00:34 master1.vagrant.vm crio[14231]: time="2021-11-15 05:00:34.725966846Z" level=info msg="Node configuration value for pid cgroup is true"
Nov 15 05:00:34 master1.vagrant.vm crio[14231]: time="2021-11-15 05:00:34.726031821Z" level=info msg="Node configuration value for memoryswap cgroup >
Nov 15 05:00:34 master1.vagrant.vm crio[14231]: time="2021-11-15 05:00:34.733591310Z" level=info msg="Node configuration value for systemd CollectMod>
Nov 15 05:00:34 master1.vagrant.vm crio[14231]: time="2021-11-15 05:00:34.734356924Z" level=info msg="Not using native diff for overlay, this may cau>
Nov 15 05:00:34 master1.vagrant.vm crio[14231]: time="2021-11-15 05:00:34.734461475Z" level=info msg="Using default capabilities: CAP_CHOWN, CAP_DAC_>
Nov 15 05:00:34 master1.vagrant.vm crio[14231]: time="2021-11-15 05:00:34.772486524Z" level=info msg="Using conmon executable: /usr/libexec/crio/conm>
Nov 15 05:00:34 master1.vagrant.vm crio[14231]: time="2021-11-15 05:00:34.775136499Z" level=info msg="Conmon does support the --sync option"
Nov 15 05:00:34 master1.vagrant.vm crio[14231]: time="2021-11-15 05:00:34.775253626Z" level=info msg="No seccomp profile specified, using the interna>
Nov 15 05:00:34 master1.vagrant.vm crio[14231]: time="2021-11-15 05:00:34.775271567Z" level=info msg="AppArmor is disabled by the system or at CRI-O >
Nov 15 05:00:34 master1.vagrant.vm crio[14231]: time="2021-11-15 05:00:34.775300995Z" level=fatal msg="Validating runtime config: cgroupfs manager co>
Nov 15 05:00:34 master1.vagrant.vm systemd[1]: crio.service: Main process exited, code=exited, status=1/FAILURE
Nov 15 05:00:34 master1.vagrant.vm systemd[1]: crio.service: Failed with result 'exit-code'.
Nov 15 05:00:34 master1.vagrant.vm systemd[1]: Failed to start Container Runtime Interface for OCI (CRI-O).
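
The trailing `>` on the journal lines appears to be the pager cutting output at the terminal width, so the fatal message is not fully visible here. A minimal sketch for pulling out just the fatal entries in full, assuming standard systemd tooling on the node; `crio_fatal_lines` is a hypothetical helper name, not part of CRI-O or OLCNE:

```shell
# Keep only CRI-O's fatal log entries from journal text read on stdin.
# crio_fatal_lines is a hypothetical helper, not part of any packaged tool.
crio_fatal_lines() {
  grep -F 'level=fatal'
}

# On the node, as root (shown as comments so the snippet has no side effects):
#   journalctl -u crio --no-pager | crio_fatal_lines
#   systemctl status crio.service --no-pager -l
```

With `--no-pager` the lines are written unwrapped and uncut, so the complete `msg="..."` text of the fatal entry should be readable.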

rikumabokss avatar Nov 15 '21 05:11 rikumabokss

I'm not running this under Vagrant, but my master node fails with the same error:
nov 23 17:42:07 kubectrlplane.kubelab.io crio[1388]: time="2021-11-23 17:42:07.036242372-06:00" level=info msg="AppArmor is disabled by the system or at CRI-O build-time" file="apparmor/apparmor.go:33"
nov 23 17:42:07 kubectrlplane.kubelab.io crio[1388]: time="2021-11-23 17:42:07.036289275-06:00" level=fatal msg="Validating runtime config: cgroupfs manager conmon cgroup should be 'pod' or empty" file="crio/main.go:334"
nov 23 17:42:07 kubectrlplane.kubelab.io systemd[1]: crio.service: main process exited, code=exited, status=1/FAILURE
nov 23 17:42:07 kubectrlplane.kubelab.io systemd[1]: Failed to start Container Runtime Interface for OCI (CRI-O).
nov 23 17:42:07 kubectrlplane.kubelab.io systemd[1]: Unit crio.service entered failed state.
nov 23 17:42:07 kubectrlplane.kubelab.io systemd[1]: crio.service failed.
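
The fatal message suggests CRI-O is running with `cgroup_manager = "cgroupfs"` while `conmon_cgroup` is set to something other than `"pod"` or an empty string. A rough sketch for checking that pair in a single config file, assuming the settings live in `/etc/crio/crio.conf` (an assumption: they may instead come from drop-in files, which this does not merge). `crio_conf_value` and `check_crio_cgroup` are hypothetical helper names:

```shell
# crio_conf_value KEY FILE
# Print the value of the last uncommented `KEY = "value"` line in FILE.
crio_conf_value() {
  sed -n "s/^[[:space:]]*${1}[[:space:]]*=[[:space:]]*\"\\([^\"]*\\)\".*/\\1/p" "$2" | tail -n 1
}

# check_crio_cgroup FILE
# Return 1 if FILE explicitly pairs cgroup_manager "cgroupfs" with a
# conmon_cgroup other than "pod" or empty, which is what the fatal message
# describes. A missing key is treated as "not set in this file".
check_crio_cgroup() {
  manager=$(crio_conf_value cgroup_manager "$1")
  conmon=$(crio_conf_value conmon_cgroup "$1")
  if [ "$manager" = "cgroupfs" ] && [ -n "$conmon" ] && [ "$conmon" != "pod" ]; then
    echo "mismatch: cgroup_manager=cgroupfs with conmon_cgroup=$conmon"
    return 1
  fi
  echo "ok: cgroup_manager=${manager:-unset} conmon_cgroup=${conmon:-unset}"
}

# On the node (assumed path, not run here):
#   check_crio_cgroup /etc/crio/crio.conf
```

If it reports a mismatch, the two settings need to agree, either by setting `conmon_cgroup` to `"pod"` or by changing the cgroup manager. Which one is right likely depends on the cgroup driver the kubelet is configured for, and nothing in this thread confirms it for OLCNE.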

ghost avatar Nov 23 '21 23:11 ghost

@rikumabokss / @luisortiz-grit - is this still an issue? The provisioning script has been upgraded to use OLCNE v1.4.

Kindly give it a try and update the issue. Thank you!

hussam-qasem avatar Apr 12 '22 16:04 hussam-qasem

Any further feedback on this issue? Did you get it sorted out?

scoter-oracle avatar Apr 28 '22 15:04 scoter-oracle

No further feedback, closing for now.

scoter-oracle avatar Feb 09 '23 22:02 scoter-oracle