machinekit-hal

Logging: multiple issues

Open zultron opened this issue 5 years ago • 2 comments

Logging has multiple issues that have been around for a long time. This issue attempts to collect them together to help develop a broad strategy.

  • syslog_async:

    • This is a 3rd-party library that shouldn't be bundled at all, yet it is bundled multiple times (hal/lib/syslog_async.c and machinetalk/lib/syslog_async.c)
    • The "throttling" feature causes important error messages to be dropped during bursts of logging (e.g. when DEBUG=5 is set); I snuck in a patch to disable this in #315, commit 09c7c0ce
  • No unified system for logging

    • rtapi_support.c contains the rtapi_print_* infrastructure used by RTAPI to send messages through the rtapi_message_buffer
      • When that ring buffer isn't initialized (esp. during rtapi_msgd and rtapi_app start-up), messages fall back to stderr and syslog_async. The last time I looked, ring buffer init was failing in rtapi_app, so the fallback path was used for the whole rtapi_app lifetime (I may well have broken this myself in the past)
      • The rtapi_msg_handler feature that allows sending messages via custom functions is on the rtapi_app side of the ring buffer; this seems counter-intuitive, since even with a custom handler, any messages originating from rtapi_msgd will continue to be routed through syslog_async
    • In other places, syslog_async is called directly instead of routing messages through rtapi_print_*, especially under the machinetalk subdirectories
    • In my ROS integration, where rtapi_msgd and rtapi_app are started and stopped by a ROS node, we'd like to see these messages piped into the ROS logging system where they can be correlated with ROS logs from other nodes
  • Possible problems with rtapi_print_* messages and EMC Application messages being funneled into the same handler

    • In #199, I describe my fix that caused rtapi_print() to always print, as it should
    • This resulted in the debug messages getting sent to the EMC error channel, which shouldn't happen, since they are unrelated to the messages a CNC operator wants to see pop up on the Axis screen
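
To make the "unified system" idea above concrete, here is a minimal sketch of a single logging entry point with a replaceable sink and an explicit origin tag. Everything in it is hypothetical: `log_msg`, `log_set_sink`, `log_set_level`, `log_is_operator_visible`, the `ORIGIN_*` and `LVL_*` names, and `memory_sink` are illustration only and are not machinekit-hal or EMCApplication APIs. The assumption is that tagging each message with where it came from would let debug output from the rtapi_print_* side be kept off the operator-facing error channel, and that a custom sink could forward messages somewhere else (such as a ROS logger) regardless of which process produced them.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* All names below are hypothetical; none are machinekit-hal APIs. */

enum log_origin { ORIGIN_RTAPI, ORIGIN_OPERATOR };  /* who produced the message */
enum log_level  { LVL_ERR = 1, LVL_WARN, LVL_INFO, LVL_DBG };

typedef void (*log_sink_fn)(enum log_origin origin, enum log_level level,
                            const char *text);

/* Default sink: plain stderr, standing in for the existing fallback path. */
static void stderr_sink(enum log_origin origin, enum log_level level,
                        const char *text)
{
    fprintf(stderr, "[%s:%d] %s\n",
            origin == ORIGIN_RTAPI ? "rtapi" : "operator", (int)level, text);
}

/* Example custom sink: keeps the last message in memory. A real one might
 * forward to another logging system instead. */
char last_message[256];
void memory_sink(enum log_origin origin, enum log_level level, const char *text)
{
    (void)origin;
    (void)level;
    strncpy(last_message, text, sizeof last_message - 1);
    last_message[sizeof last_message - 1] = '\0';
}

static log_sink_fn current_sink = stderr_sink;
static int max_level = LVL_INFO;

/* Install a custom sink; passing NULL restores the stderr default. */
void log_set_sink(log_sink_fn fn) { current_sink = fn ? fn : stderr_sink; }

/* Messages with a level above this threshold are filtered out. */
void log_set_level(int level) { max_level = level; }

/* Single entry point. Returns 1 if the message reached the sink,
 * 0 if it was filtered by level. */
int log_msg(enum log_origin origin, enum log_level level, const char *fmt, ...)
{
    char buf[256];
    va_list ap;

    assert(fmt != NULL);
    if ((int)level > max_level)
        return 0;
    va_start(ap, fmt);
    vsnprintf(buf, sizeof buf, fmt, ap);
    va_end(ap);
    current_sink(origin, level, buf);
    return 1;
}

/* Routing rule: only operator-origin warnings and errors belong on an
 * operator-facing channel; RTAPI debug output never does. */
int log_is_operator_visible(enum log_origin origin, enum log_level level)
{
    return origin == ORIGIN_OPERATOR && level <= LVL_WARN;
}
```

With this shape, a caller would install a sink once with `log_set_sink(memory_sink)` (or any other function of the same type), and a front end deciding what to pop up on screen would consult `log_is_operator_visible()` rather than receiving every message from the shared handler.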

zultron avatar Sep 12 '20 16:09 zultron

https://github.com/machinekit/EMCApplication/issues/9

the-snowwhite avatar Nov 05 '20 15:11 the-snowwhite

A new one:

During loadrt() and a comp's rtapi_app_main() execution, logging too much with e.g. hal_print_msg() apparently fills up some buffer and causes a hang. Once the ØMQ RPC call times out (according to the timeout set in src/hal/utils/halcmd_rtapiapp.cc), the buffer is processed again, the logs print, and rtapi_app_main() is able to complete. From the user's side, however, it appears that the loadrt() failed.

I haven't looked into it, but it looks like the RPC call might be blocking other messaging activity.
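
The suspected failure mode can be illustrated with a toy bounded message ring. This is not the RTAPI ring buffer code, and the cause of the hang is unconfirmed. `msg_ring`, `ring_try_push`, `ring_pop`, `RING_SLOTS` and `MSG_MAX` are hypothetical names, and the sketch is single-threaded with no locking. It only shows that once a fixed-size buffer is full, a producer can make no progress until something drains it. If the producer waits instead of returning, and the drain is stuck behind a pending RPC call, the result would look like the hang described above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy ring; not the machinekit-hal ring buffer. */
#define RING_SLOTS 4   /* deliberately tiny so the full condition is easy to hit */
#define MSG_MAX    64

struct msg_ring {
    char   slot[RING_SLOTS][MSG_MAX];
    size_t head;       /* index of the oldest queued message */
    size_t count;      /* number of slots currently in use */
    size_t rejected;   /* pushes refused because the ring was full */
};

void ring_init(struct msg_ring *r)
{
    assert(r != NULL);
    memset(r, 0, sizeof *r);
}

/* Non-blocking push: returns 1 on success, 0 if the ring is full.
 * Refusals are counted so that any loss is at least visible. */
int ring_try_push(struct msg_ring *r, const char *text)
{
    size_t tail;

    if (r->count == RING_SLOTS) {
        r->rejected++;
        return 0;
    }
    tail = (r->head + r->count) % RING_SLOTS;
    strncpy(r->slot[tail], text, MSG_MAX - 1);
    r->slot[tail][MSG_MAX - 1] = '\0';
    r->count++;
    return 1;
}

/* Pop the oldest message into out (at least MSG_MAX bytes).
 * Returns 1 on success, 0 if the ring is empty. */
int ring_pop(struct msg_ring *r, char *out)
{
    if (r->count == 0)
        return 0;
    memcpy(out, r->slot[r->head], MSG_MAX);
    r->head = (r->head + 1) % RING_SLOTS;
    r->count--;
    return 1;
}
```

A writer that loops until `ring_try_push()` succeeds will spin for as long as nothing calls `ring_pop()`. Returning immediately avoids the stall but loses messages, which is the same trade-off as the throttling complaint in the original report, so neither choice is presented here as the right fix.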

zultron avatar Feb 05 '22 05:02 zultron