wazuh-kubernetes
wazuh-manager-master and worker pods are in CrashLoopBackOff state after following the local-env deployment
I cloned the wazuh-kubernetes repository, generated the certificates, and deployed it on my Kubernetes cluster. I changed the StorageClass to nfs-client because I have deployed nfs-subdir-external-provisioner in the namespace for dynamic provisioning of persistent volumes. After deploying, the Wazuh pods are in the CrashLoopBackOff state. When I comment out the volumeMounts named wazuh-manager-master and wazuh-manager-worker, the pods run and the API connects to the dashboard.
Hi @chasegame-alpha. The StorageClass provisioner should be compatible with dynamic provisioning in order to create the volumes. If it is not you will need to manually create a PV and PVC using a custom manifest. Here is an example:
Make sure to use the correct provisioner.
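The manifest that reply refers to is not reproduced in this thread. As a rough sketch only (not the maintainer's example), a statically provisioned NFS PersistentVolume with a pre-bound claim could be generated like this. It is a small Python script that prints JSON, which `kubectl apply -f -` accepts. The server address, export path, namespace, capacity, and names are all assumptions; the claim name follows the StatefulSet `<volumeClaimTemplate name>-<pod name>` convention, assuming the template is called wazuh-manager-master like the volumeMount mentioned in the first post.

```python
import json

# Hypothetical values, none of them taken from this thread: adjust to your cluster.
NFS_SERVER = "nfs.example.internal"
NFS_PATH = "/exports/wazuh/manager-master"
NAMESPACE = "wazuh"
PV_NAME = "wazuh-manager-master-pv"
# StatefulSet claims are named <volumeClaimTemplate name>-<pod name>.
CLAIM_NAME = "wazuh-manager-master-wazuh-manager-master-0"
SIZE = "500Mi"


def static_volume_manifests():
    """Build a PersistentVolume and a PersistentVolumeClaim pinned to it."""
    pv = {
        "apiVersion": "v1",
        "kind": "PersistentVolume",
        "metadata": {"name": PV_NAME},
        "spec": {
            "capacity": {"storage": SIZE},
            "accessModes": ["ReadWriteOnce"],
            "persistentVolumeReclaimPolicy": "Retain",
            # Empty class name on both objects keeps dynamic provisioning out of the way.
            "storageClassName": "",
            "nfs": {"server": NFS_SERVER, "path": NFS_PATH},
        },
    }
    pvc = {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": CLAIM_NAME, "namespace": NAMESPACE},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "storageClassName": "",
            "volumeName": PV_NAME,  # bind this claim to the volume above
            "resources": {"requests": {"storage": SIZE}},
        },
    }
    return {"apiVersion": "v1", "kind": "List", "items": [pv, pvc]}


if __name__ == "__main__":
    print(json.dumps(static_volume_manifests(), indent=2))
```

If the script were saved as, say, static_pv.py (a made-up file name), the output could be piped into `kubectl apply -f -` before deploying the StatefulSet, so that the pod picks up the existing claim instead of requesting a new one.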
I'm having the same issue, also using nfs-subdir-external-provisioner as my storage provisioner.
I modified the envs/local-env/storage-class.yaml patch:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: wazuh-storage
provisioner: cluster.local/nfs-subdir-external-provisioner
mountOptions:
  - nfsvers=3
I deployed using kubectl apply -k envs/local-env/, and the wazuh-indexer-0, wazuh-manager-master-0, and wazuh-manager-worker-0 pods end up in the CrashLoopBackOff state.
I'm a Kubernetes neophyte, so I'm not sure where to go from here.
Hi @chasegame-alpha. The StorageClass provisioner should be compatible with dynamic provisioning in order to create the volumes. If it is not you will need to manually create a PV and PVC using a custom manifest. Here is an example:
@teddytpc1 so I understand correctly, are you suggesting that the nfs-subdir-external-provisioner is not compatible with dynamic provisioning?
Seems adding the mountOptions to the storage class patch allowed the pods to start, but now errors like these are showing in the logs for the wazuh-manager-master-0 pod, which eventually reverts to the CrashLoopBackOff state:
[s6-init] making user provided files available at /var/run/s6/etc...exited 0.
[s6-init] ensuring user provided files have correct perms...exited 0.
[fix-attrs.d] applying ownership & permissions fixes...
[fix-attrs.d] done.
[cont-init.d] executing container initialization scripts...
[cont-init.d] 0-wazuh-init: executing...
/var/ossec/data_tmp/permanent/var/ossec/api/configuration/
The path /var/ossec/api/configuration is already mounted
/var/ossec/data_tmp/permanent/var/ossec/etc/
The path /var/ossec/etc is already mounted
/var/ossec/data_tmp/permanent/var/ossec/logs/
The path /var/ossec/logs is already mounted
/var/ossec/data_tmp/permanent/var/ossec/queue/
The path /var/ossec/queue is already mounted
/var/ossec/data_tmp/permanent/var/ossec/agentless/
The path /var/ossec/agentless is already mounted
/var/ossec/data_tmp/permanent/var/ossec/var/multigroups/
The path /var/ossec/var/multigroups is empty, skiped
/var/ossec/data_tmp/permanent/var/ossec/integrations/
The path /var/ossec/integrations is already mounted
/var/ossec/data_tmp/permanent/var/ossec/active-response/bin/
The path /var/ossec/active-response/bin is already mounted
/var/ossec/data_tmp/permanent/var/ossec/wodles/
The path /var/ossec/wodles is already mounted
/var/ossec/data_tmp/permanent/etc/filebeat/
The path /etc/filebeat is already mounted
Updating /var/ossec/etc/internal_options.conf
Updating /var/ossec/integrations/pagerduty
Updating /var/ossec/integrations/slack
Updating /var/ossec/integrations/slack.py
Updating /var/ossec/integrations/virustotal
Updating /var/ossec/integrations/virustotal.py
Updating /var/ossec/active-response/bin/default-firewall-drop
Updating /var/ossec/active-response/bin/disable-account
Updating /var/ossec/active-response/bin/firewalld-drop
Updating /var/ossec/active-response/bin/firewall-drop
Updating /var/ossec/active-response/bin/host-deny
Updating /var/ossec/active-response/bin/ip-customblock
Updating /var/ossec/active-response/bin/ipfw
Updating /var/ossec/active-response/bin/kaspersky.py
Updating /var/ossec/active-response/bin/kaspersky
Updating /var/ossec/active-response/bin/npf
Updating /var/ossec/active-response/bin/wazuh-slack
Updating /var/ossec/active-response/bin/pf
Updating /var/ossec/active-response/bin/restart-wazuh
Updating /var/ossec/active-response/bin/restart.sh
Updating /var/ossec/active-response/bin/route-null
Updating /var/ossec/agentless/sshlogin.exp
Updating /var/ossec/agentless/ssh_pixconfig_diff
Updating /var/ossec/agentless/ssh_asa-fwsmconfig_diff
Updating /var/ossec/agentless/ssh_integrity_check_bsd
Updating /var/ossec/agentless/main.exp
Updating /var/ossec/agentless/su.exp
Updating /var/ossec/agentless/ssh_integrity_check_linux
Updating /var/ossec/agentless/register_host.sh
Updating /var/ossec/agentless/ssh_generic_diff
Updating /var/ossec/agentless/ssh_foundry_diff
Updating /var/ossec/agentless/ssh_nopass.exp
Updating /var/ossec/agentless/ssh.exp
Updating /var/ossec/wodles/utils.py
Updating /var/ossec/wodles/aws/aws-s3
Updating /var/ossec/wodles/aws/aws-s3.py
Updating /var/ossec/wodles/azure/azure-logs
Updating /var/ossec/wodles/azure/azure-logs.py
Updating /var/ossec/wodles/docker/DockerListener
Updating /var/ossec/wodles/docker/DockerListener.py
Updating /var/ossec/wodles/gcloud/gcloud
Updating /var/ossec/wodles/gcloud/gcloud.py
Updating /var/ossec/wodles/gcloud/integration.py
Updating /var/ossec/wodles/gcloud/tools.py
find: '/proc/312/task/312/fd/5': No such file or directory
find: '/proc/312/task/312/fdinfo/5': No such file or directory
find: '/proc/312/fd/6': No such file or directory
find: '/proc/312/fdinfo/6': No such file or directory
find: '/proc/313/task/313/fd/5': No such file or directory
find: '/proc/313/task/313/fdinfo/5': No such file or directory
find: '/proc/313/fd/6': No such file or directory
find: '/proc/313/fdinfo/6': No such file or directory
Identified Wazuh configuration files to mount...
'/wazuh-config-mount/etc/ossec.conf' -> '/var/ossec/etc/ossec.conf'
'/wazuh-config-mount/etc/authd.pass' -> '/var/ossec/etc/authd.pass'
[cont-init.d] 0-wazuh-init: exited 0.
[cont-init.d] 1-config-filebeat: executing...
Customize Elasticsearch ouput IP
Configuring username.
Configuring password.
Configuring SSL verification mode.
Configuring Certificate Authorities.
Configuring SSL Certificate.
Configuring SSL Key.
[cont-init.d] 1-config-filebeat: exited 0.
[cont-init.d] 2-manager: executing...
Traceback (most recent call last):
File "/var/ossec/framework/python/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1245, in _execute_context
self.dialect.do_execute(
File "/var/ossec/framework/python/lib/python3.9/site-packages/sqlalchemy/engine/default.py", line 581, in do_execute
cursor.execute(statement, parameters)
sqlite3.OperationalError: disk I/O error
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/var/ossec/framework/scripts/create_user.py", line 72, in <module>
create_rbac_db()
File "/var/ossec/framework/python/lib/python3.9/site-packages/wazuh-4.4.0-py3.9.egg/wazuh/rbac/orm.py", line 2454, in create_rbac_db
_Base.metadata.create_all(_engine)
File "/var/ossec/framework/python/lib/python3.9/site-packages/sqlalchemy/sql/schema.py", line 4315, in create_all
bind._run_visitor(
File "/var/ossec/framework/python/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 2049, in _run_visitor
conn._run_visitor(visitorcallable, element, **kwargs)
File "/var/ossec/framework/python/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1618, in _run_visitor
visitorcallable(self.dialect, self, **kwargs).traverse_single(element)
File "/var/ossec/framework/python/lib/python3.9/site-packages/sqlalchemy/sql/visitors.py", line 138, in traverse_single
return meth(obj, **kw)
File "/var/ossec/framework/python/lib/python3.9/site-packages/sqlalchemy/sql/ddl.py", line 754, in visit_metadata
[t for t in tables if self._can_create_table(t)]
File "/var/ossec/framework/python/lib/python3.9/site-packages/sqlalchemy/sql/ddl.py", line 754, in <listcomp>
[t for t in tables if self._can_create_table(t)]
File "/var/ossec/framework/python/lib/python3.9/site-packages/sqlalchemy/sql/ddl.py", line 730, in _can_create_table
return not self.checkfirst or not self.dialect.has_table(
File "/var/ossec/framework/python/lib/python3.9/site-packages/sqlalchemy/dialects/sqlite/base.py", line 1598, in has_table
info = self._get_table_pragma(
File "/var/ossec/framework/python/lib/python3.9/site-packages/sqlalchemy/dialects/sqlite/base.py", line 2063, in _get_table_pragma
cursor = connection.execute(statement)
File "/var/ossec/framework/python/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 976, in execute
return self._execute_text(object_, multiparams, params)
File "/var/ossec/framework/python/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1143, in _execute_text
ret = self._execute_context(
File "/var/ossec/framework/python/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1249, in _execute_context
self._handle_dbapi_exception(
File "/var/ossec/framework/python/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1476, in _handle_dbapi_exception
util.raise_from_cause(sqlalchemy_exception, exc_info)
File "/var/ossec/framework/python/lib/python3.9/site-packages/sqlalchemy/util/compat.py", line 398, in raise_from_cause
reraise(type(exception), exception, tb=exc_tb, cause=cause)
File "/var/ossec/framework/python/lib/python3.9/site-packages/sqlalchemy/util/compat.py", line 152, in reraise
raise value.with_traceback(tb)
File "/var/ossec/framework/python/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1245, in _execute_context
self.dialect.do_execute(
File "/var/ossec/framework/python/lib/python3.9/site-packages/sqlalchemy/engine/default.py", line 581, in do_execute
cursor.execute(statement, parameters)
sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) disk I/O error
[SQL: PRAGMA main.table_info("roles_rules")]
(Background on this error at: http://sqlalche.me/e/e3q8)
The link on the last line doesn't offer any helpful or relevant information unfortunately.
FWIW, there's nothing wrong that I can see with my NFS server; other workloads are using the nfs-subdir-external-provisioner without issue.
I tried adding ReadWriteMany to the access modes across the various manifests in case it would make a difference, but unfortunately it didn't help.
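The traceback above ends in `sqlite3.OperationalError: disk I/O error` on a simple `PRAGMA main.table_info(...)` query, which suggests SQLite itself cannot work on the mounted volume. One way to check that independently of Wazuh is a small probe that creates, writes, and reads a throwaway database in a directory on the same mount. This is only a sketch: the function name `probe_sqlite` and the probe file name are made up for illustration, and a passing result does not prove the volume is safe for SQLite under concurrent access.

```python
import os
import sqlite3
import sys
import tempfile


def probe_sqlite(directory):
    """Return None if SQLite can create, write and read a database in
    `directory`, otherwise return the error message as a string."""
    path = os.path.join(directory, "sqlite_probe.db")  # hypothetical file name
    try:
        conn = sqlite3.connect(path, timeout=5)
        try:
            conn.execute("CREATE TABLE IF NOT EXISTS probe (id INTEGER PRIMARY KEY)")
            conn.execute("INSERT INTO probe DEFAULT VALUES")
            conn.commit()  # the write path takes file locks on the database
            # Same kind of statement that fails in the traceback.
            conn.execute('PRAGMA main.table_info("probe")').fetchall()
        finally:
            conn.close()
    except sqlite3.Error as exc:
        return str(exc)
    finally:
        # Best-effort cleanup of the probe database and its journal.
        for suffix in ("", "-journal"):
            try:
                os.remove(path + suffix)
            except OSError:
                pass
    return None


if __name__ == "__main__":
    target = sys.argv[1] if len(sys.argv) > 1 else tempfile.gettempdir()
    error = probe_sqlite(target)
    print(f"{target}: OK" if error is None else f"{target}: FAILED ({error})")
```

Running it once against a local directory and once against a directory on the NFS-backed volume should show whether the failure follows the mount rather than the Wazuh image.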
Perhaps somewhat related: I am using the SMB CSI driver and get the same error if I don't mount with the nobrl flag enabled. This might be relevant: https://stackoverflow.com/questions/7573301/sqlite3-nfs-mount-issue-with-locking-can-i-use-something-like-cifs-nobrl
though it appears to be a general issue with SQLite on network storage, so there might need to be a better answer. You may or may not also end up at the next gate:
Started wazuh-authd...
wazuh-db did not start correctly.
[cont-init.d] 2-manager: exited 1.
[cont-init.d] done.
[services.d] starting services
starting Filebeat
[services.d] done.
2023/07/28 02:37:11 wazuh-csyslogd: INFO: Remote syslog server not configured. Clean exit.
2023/07/28 02:37:11 wazuh-dbd: INFO: Database not configured. Clean exit.
2023/07/28 02:37:11 wazuh-integratord: INFO: Remote integrations not configured. Clean exit.
2023/07/28 02:37:12 wazuh-agentlessd: INFO: Not configured. Exiting.
2023/07/28 02:37:12 wazuh-authd: INFO: Started (pid: 472).
2023/07/28 02:37:12 wazuh-authd: INFO: Accepting connections on port 1515. Using password specified on file: etc/authd.pass
2023/07/28 02:37:12 wazuh-authd: INFO: Setting network timeout to 1.000000 sec.
2023/07/28 02:37:13 wazuh-authd: ERROR: Unable to bind to socket 'queue/sockets/auth': 'Operation not permitted'. Closing local server.
2023/07/28 02:37:13 wazuh-db: INFO: Started (pid: 486).
2023/07/28 02:37:13 wazuh-db: CRITICAL: Unable to bind to socket 'queue/db/wdb': 'Operation not permitted'. Closing local server.
2023-07-28T02:37:23.535Z INFO instance/beat.go:645 Home path: [/usr/share/filebeat] Config path: [/etc/filebeat] Data path: [/var/lib/filebeat] Logs path: [/var/log/filebeat]
2023-07-28T02:37:23.647Z INFO instance/beat.go:653 Beat ID: 928663aa-d423-46b7-880e-71303bab6676
2023-07-28T02:37:23.655Z INFO [seccomp] seccomp/seccomp.go:124 Syscall filter successfully installed
2023-07-28T02:37:23.655Z INFO [beat] instance/beat.go:981 Beat info {"system_info": {"beat": {"path": {"config": "/etc/filebeat", "data": "/var/lib/filebeat", "home": "/usr/share/filebeat", "logs": "/var/log/filebeat"}, "type": "filebeat", "uuid": "928663aa-d423-46b7-880e-71303bab6676"}}}
2023-07-28T02:37:23.656Z INFO [beat] instance/beat.go:990 Build info {"system_info": {"build": {"commit": "aacf9ecd9c494aa0908f61fbca82c906b16562a8", "libbeat": "7.10.2", "time": "2021-01-12T22:10:33.000Z", "version": "7.10.2"}}}
2023-07-28T02:37:23.656Z INFO [beat] instance/beat.go:993 Go runtime info {"system_info": {"go": {"os":"linux","arch":"amd64","max_procs":4,"version":"go1.14.12"}}}
2023-07-28T02:37:23.657Z INFO [beat] instance/beat.go:997 Host info {"system_info": {"host": {"architecture":"x86_64","boot_time":"2023-05-06T20:40:55Z","containerized":false,"name":"wazuh-manager-master-0","ip":["127.0.0.1/8","::1/128","10.4.220.116/32","fe80::cca:a8ff:fe0e:d02c/64"],"kernel_version":"5.10.0-21-amd64","mac":["0e:ca:a8:0e:d0:2c"],"os":{"family":"debian","platform":"ubuntu","name":"Ubuntu","version":"20.04.6 LTS (Focal Fossa)","major":20,"minor":4,"patch":6,"codename":"focal"},"timezone":"UTC","timezone_offset_sec":0}}}
2023-07-28T02:37:23.709Z INFO [beat] instance/beat.go:1026 Process info {"system_info": {"process": {"capabilities": {"inheritable":null,"permitted":["chown","dac_override","fowner","fsetid","kill","setgid","setuid","setpcap","net_bind_service","net_raw","sys_chroot","mknod","audit_write","setfcap"],"effective":["chown","dac_override","fowner","fsetid","kill","setgid","setuid","setpcap","net_bind_service","net_raw","sys_chroot","mknod","audit_write","setfcap"],"bounding":["chown","dac_override","fowner","fsetid","kill","setgid","setuid","setpcap","net_bind_service","net_raw","sys_chroot","mknod","audit_write","setfcap"],"ambient":null}, "cwd": "/run/s6/services/filebeat", "exe": "/usr/share/filebeat/bin/filebeat", "name": "filebeat", "pid": 552, "ppid": 548, "seccomp": {"mode":"filter","no_new_privs":true}, "start_time": "2023-07-28T02:37:23.210Z"}}}
2023-07-28T02:37:23.710Z INFO instance/beat.go:299 Setup Beat: filebeat; Version: 7.10.2
2023-07-28T02:37:23.711Z INFO eslegclient/connection.go:99 elasticsearch url: https://wazuh-indexer-0.wazuh-indexer:9200
2023-07-28T02:37:23.712Z INFO [publisher] pipeline/module.go:113 Beat name: wazuh-manager-master-0
2023-07-28T02:37:23.715Z INFO beater/filebeat.go:117 Enabled modules/filesets: wazuh (alerts), ()
2023-07-28T02:37:23.716Z INFO instance/beat.go:455 filebeat start running.
2023-07-28T02:37:23.884Z INFO memlog/store.go:119 Loading data file of '/var/lib/filebeat/registry/filebeat' succeeded. Active transaction id=0
2023-07-28T02:37:23.884Z INFO memlog/store.go:124 Finished loading transaction log file for '/var/lib/filebeat/registry/filebeat'. Active transaction id=0
2023-07-28T02:37:23.910Z INFO [registrar] registrar/registrar.go:109 States Loaded from registrar: 0
2023-07-28T02:37:23.910Z INFO [crawler] beater/crawler.go:71 Loading Inputs: 1
2023-07-28T02:37:23.911Z INFO log/input.go:157 Configured paths: [/var/ossec/logs/alerts/alerts.json]
2023-07-28T02:37:23.911Z INFO [crawler] beater/crawler.go:141 Starting input (ID: 9132358592892857476)
2023-07-28T02:37:23.911Z INFO [crawler] beater/crawler.go:108 Loading and starting Inputs completed. Enabled inputs: 1
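
The two `Unable to bind to socket ... 'Operation not permitted'` lines in the log above are about Unix domain sockets under the queue directory, which is one of the mounted paths. Whether the network filesystem is what rejects them is an assumption, not something confirmed in this thread, but it can be checked with a short probe that tries to bind a Unix socket in a given directory. The function name `probe_unix_socket` and the socket file name are made up for illustration.

```python
import os
import socket
import sys
import tempfile


def probe_unix_socket(directory):
    """Return None if a Unix domain socket can be bound inside `directory`,
    otherwise return the OS error message as a string."""
    path = os.path.join(directory, "probe.sock")  # hypothetical file name
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        sock.bind(path)  # creates the socket file on the target filesystem
    except OSError as exc:
        return exc.strerror or str(exc)
    finally:
        sock.close()
        # Best-effort removal of the socket file if it was created.
        try:
            os.unlink(path)
        except OSError:
            pass
    return None


if __name__ == "__main__":
    target = sys.argv[1] if len(sys.argv) > 1 else tempfile.gettempdir()
    error = probe_unix_socket(target)
    print(f"{target}: OK" if error is None else f"{target}: FAILED ({error})")
```

If the probe succeeds on local storage but fails on the mounted volume with the same message as in the log, the storage backend is the likely culprit; if it fails in both places, the cause is more likely the container's permissions or security context.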