Fleet stopped working after some time. Seeing tons of "too many open files" errors
Fleet version: 4.12.0
Operating system: Debian GNU/Linux 10
Web browser: N/A
🧑💻 Expected behavior
Fleet starts and keeps running.
💥 Actual behavior
Fleet is not able to start.
More info
Sep 12 01:03:35 n121-011-134 fleet[32592]: {"component":"http","err":"error writing result logs: writing log: timestamp: 2022-09-12T01:00:12Z: can't open new logfile: open /var/log/fleet/result.log: too many open files","ip_addr":"10.121.27.20","level":"error","method":"POST","took":"4.526744385s","ts":"2022-09-12T01:00:12.16973567Z","uri":"/api/v1/osquery/log","x_for_ip_addr":"10.121.27.20"}
Seeing tons of errors like this, but I'm unsure whether this is the root cause.
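To see whether this one error really dominates, the log entries can be tallied by message. A minimal sketch, assuming each journal line ends with a JSON object carrying an "err" field, as in the line above; the function names are hypothetical and not part of Fleet:

```python
import json
from collections import Counter

def parse_fleet_log_line(line):
    """Return the JSON payload of a journald-style log line, or None if absent or invalid."""
    start = line.find("{")  # assumption: the JSON object starts at the first "{"
    if start == -1:
        return None
    try:
        return json.loads(line[start:])
    except json.JSONDecodeError:
        return None

def count_errors(lines):
    """Count entries by the last ': '-separated part of their "err" field."""
    counts = Counter()
    for line in lines:
        entry = parse_fleet_log_line(line)
        if entry and "err" in entry:
            counts[entry["err"].rsplit(": ", 1)[-1]] += 1
    return counts
```

Feeding it the journal output for the service (for example, lines read from a saved log file) gives a rough picture of which failure is most frequent.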
@daweizhang123 https://fleetdm.com/docs/deploying/faq#what-do-i-do-about-too-many-open-files-errors
@smaddock Thanks for the advice. I have tried that, and now Fleet is able to start.
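To confirm that a raised limit actually applies to the running process, the open descriptor count can be compared against the limit. A rough sketch, assuming a Linux host where /proc is readable for the Fleet PID; the helper names are hypothetical. On a systemd-managed service the limit is commonly set with LimitNOFILE in the unit file, which may differ from the shell's ulimit.

```python
import os

def parse_max_open_files(limits_text):
    """Return (soft, hard) 'Max open files' values from /proc/<pid>/limits text.

    Values stay as strings because the kernel may report 'unlimited'.
    """
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            parts = line.split()
            return parts[3], parts[4]
    return None

def fd_usage(pid):
    """Return (open_fd_count, soft_limit, hard_limit) for a Linux process."""
    with open(f"/proc/{pid}/limits") as f:
        limits = parse_max_open_files(f.read())
    if limits is None:
        raise RuntimeError("no 'Max open files' row found")
    # Each entry in /proc/<pid>/fd is one open descriptor.
    open_fds = len(os.listdir(f"/proc/{pid}/fd"))
    return open_fds, limits[0], limits[1]
```

Calling fd_usage with the Fleet PID (as root, or as the user running the service) a few times in a row shows whether the count keeps climbing toward the soft limit.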
Hm, sorry, Fleet went down again after the server had been running for a few minutes.
The server just crashed; I didn't see any errors in the logs.
My guess is that one very resource-consuming scheduled query was running repeatedly. Even after I deleted that scheduled query manually, Fleet was still trying to execute it. Is there a recommended way to clear these cached queries?
Seeing errors in log like this. Is this the root cause?
Sep 12 18:28:44 n121-008-225 fleet[73464]: {"component":"http","err":"authentication error: find host: timestamp: 2022-09-12T18:23:45Z: context canceled","level":"info","path":"/api/v1/osquery/log","ts":"2022-09-12T18:23:45.819699397Z"}
root@n121-008-225:~# systemctl status fleet
● fleet.service - Fleet
   Loaded: loaded (/etc/systemd/system/fleet.service; enabled; vendor preset: enabled)
   Active: active (running) since Mon 2022-09-12 18:46:11 UTC; 4min 57s ago
 Main PID: 26381 (fleet)
    Tasks: 377 (limit: 39321)
   Memory: 799.1G
   CGroup: /system.slice/fleet.service
Very strange: Fleet is consuming 800G of memory, yet there are no scheduled queries or live queries running in Fleet.
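The Memory figure from systemctl is cgroup accounting, which can include page cache as well as the process's own memory, so it may help to compare it with the resident set size of the fleet process itself. A small sketch, assuming a Linux host and the standard VmRSS row in /proc/<pid>/status; the function names are hypothetical:

```python
def parse_vmrss_kib(status_text):
    """Return the VmRSS value (in kB, as reported by the kernel) or None if the row is missing."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    return None

def read_vmrss_kib(pid):
    """Read the resident set size of a running Linux process from /proc."""
    with open(f"/proc/{pid}/status") as f:
        return parse_vmrss_kib(f.read())
```

Sampling read_vmrss_kib with the Main PID every few seconds would show whether the process itself is growing or whether most of the reported memory sits elsewhere in the cgroup.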
You are running quite an old version of Fleet. Can you please upgrade to 4.19.1 or 4.20.0 and let us know whether the issue persists?
@daweizhang123 Were you able to update to the latest version of Fleet and if so are you still encountering the issue you originally described?
Closing this for now as we don't have the information we need to continue investigating. @daweizhang123 please comment and reopen if you upgrade and continue to see issues.