Too many uninformative loglines

Open donalm opened this issue 3 years ago • 1 comment

The code gets a listing of pids from the proc directory: https://github.com/AcademySoftwareFoundation/OpenCue/blob/e7c38c6bf5a39eec650a1f7a3851dabcc10d6f48/rqd/rqd/rqmachine.py#L246-L247

It then reads each PID's stat and statm files, and also creates a psutil.Process instance for each PID. This all takes a non-zero amount of time, and because so many Linux processes are ephemeral and live for only a few ms, we'll very frequently have one or more PIDs whose /proc/PID directory no longer exists by the time we look for it.
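To illustrate the race, here is a minimal sketch, not the rqd code itself. The helper names `list_pids` and `read_stat_files` and the `proc_root` parameter are hypothetical, introduced only so the example can be pointed at a test directory. It lists PIDs first and reads each `stat` file afterwards, treating a vanished entry as an expected outcome rather than an error:

```python
import os


def list_pids(proc_root="/proc"):
    """Return the all-digit entries under proc_root as PID strings."""
    return [name for name in os.listdir(proc_root) if name.isdigit()]


def read_stat_files(pids, proc_root="/proc"):
    """Read <proc_root>/<pid>/stat for each PID.

    Returns (stats, vanished): stats maps pid -> raw stat text, and
    vanished lists PIDs whose entry disappeared between the listing
    and the read.
    """
    stats, vanished = {}, []
    for pid in pids:
        path = os.path.join(proc_root, pid, "stat")
        try:
            with open(path, encoding="utf-8", errors="replace") as f:
                stats[pid] = f.read()
        except (FileNotFoundError, ProcessLookupError):
            # The process exited after we listed it; this is normal.
            vanished.append(pid)
    return stats, vanished
```

On a busy Linux host, `read_stat_files(list_pids())` will often return a non-empty `vanished` list, which is the situation described above.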

That means that this line: https://github.com/AcademySoftwareFoundation/OpenCue/blob/e7c38c6bf5a39eec650a1f7a3851dabcc10d6f48/rqd/rqd/rqmachine.py#L285-L286

..creates error-level loglines on a high proportion of the runs of this method, which normally runs every few seconds. Even in the case where a process we care about has exited, it is probably not necessary to log an exception about that: it's normal behaviour. Logging exceptions for other processes does not seem like it would ever be helpful.

It is unlikely that reading from a PID directory would fail for any other reason, because it's not a real directory - just an in-memory data structure that presents a filesystem API, so most of the problems that might normally be associated with reading files from a filesystem should not occur in this instance.

IMHO: it's safe to remove this logline, or at least dial it back to a log.debug.
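A rough sketch of what that could look like, assuming a hypothetical helper `read_statm` and a hypothetical logger name `rqd_sketch` (neither is taken from rqd). The expected "process already exited" case is logged at debug level, while any other `OSError` stays visible at a higher level:

```python
import logging
import os

log = logging.getLogger("rqd_sketch")  # hypothetical logger name


def read_statm(pid, proc_root="/proc"):
    """Return the raw statm text for pid, or None if it could not be read."""
    path = os.path.join(proc_root, str(pid), "statm")
    try:
        with open(path, encoding="ascii", errors="replace") as f:
            return f.read()
    except (FileNotFoundError, ProcessLookupError):
        # Expected: the process exited between listing and reading.
        log.debug("PID %s exited before %s could be read", pid, path)
    except OSError:
        # Anything else is unusual for procfs, so keep it visible.
        log.warning("Unexpected error reading %s", path, exc_info=True)
    return None
```

The split keeps routine process churn out of the error log without hiding genuinely unexpected failures. Code that also builds `psutil.Process` objects would presumably want to handle `psutil.NoSuchProcess` in the same quiet way.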

donalm avatar Sep 14 '22 12:09 donalm

I created https://github.com/AcademySoftwareFoundation/OpenCue/pull/1191 to reduce the log level for now.

bcipriano avatar Sep 14 '22 18:09 bcipriano