Lager logs errors when the standard error logger does not
Hello, I have a supervisor with a restart strategy of {one_for_all, 0, 1} and a child with a restart type of permanent. With Lager, errors get logged when the child goes down with {stop, normal, NewState}, but the standard error logger doesn't report errors when I do this. Is this considered a bug or working as intended? Either way, I don't want errors logged in this situation; is there something I can do to suppress them?
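For concreteness, here is a minimal sketch of the kind of setup I mean (module and function names are made up; my real code is similar in shape):

    %% my_sup.erl: supervisor with the strategy described above
    -module(my_sup).
    -behaviour(supervisor).
    -export([start_link/0, init/1]).

    start_link() ->
        supervisor:start_link(my_sup, []).

    init([]) ->
        %% {one_for_all, 0, 1}: zero restarts tolerated, so any child
        %% exit (even a normal one) takes the whole tree down.
        {ok, {{one_for_all, 0, 1},
              [{my_worker,
                {my_worker, start_link, []},
                permanent, 2000, worker, [my_worker]}]}}.

    %% my_worker.erl: a child that stops itself with reason 'normal'
    -module(my_worker).
    -behaviour(gen_server).
    -export([start_link/0, stop/0]).
    -export([init/1, handle_call/3, handle_cast/2,
             handle_info/2, terminate/2, code_change/3]).

    start_link() ->
        gen_server:start_link({local, my_worker}, my_worker, [], []).

    stop() ->
        gen_server:cast(my_worker, stop).

    init([]) ->
        {ok, undefined}.

    handle_call(_Req, _From, State) ->
        {reply, ok, State}.

    %% This is the {stop, normal, NewState} that triggers the reports.
    handle_cast(stop, State) ->
        {stop, normal, State};
    handle_cast(_Msg, State) ->
        {noreply, State}.

    handle_info(_Info, State) ->
        {noreply, State}.

    terminate(_Reason, _State) ->
        ok.

    code_change(_OldVsn, State, _Extra) ->
        {ok, State}.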
-Xavier
These are the errors being logged:
2013-06-16 15:59:09.939 [error] <0.73.0> Supervisor {<0.73.0>,player_sup} had child tcp_server started with tcp_server:start_link(#Port<0.3204>, #Fun<player_factory.2.121151626>, <0.75.0>) at <0.76.0> exit with reason normal in context child_terminated
2013-06-16 15:59:09.939 [error] <0.73.0> Supervisor {<0.73.0>,player_sup} had child tcp_server started with tcp_server:start_link(#Port<0.3204>, #Fun<player_factory.2.121151626>, <0.75.0>) at <0.76.0> exit with reason reached_max_restart_intensity in context shutdown
If you start SASL, does it log errors? I don't recall if the standard error_logger even knows about supervisor reports.
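A quick way to check, assuming an interactive node:

    %% from a running shell:
    application:start(sasl).
    %% or boot with SASL from the start:
    %% erl -boot start_sasl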
Yes, starting SASL logs the errors. I've never seen these supervisor reports before because I've never used SASL. This is what it looks like in SASL:
=SUPERVISOR REPORT==== 20-Jun-2013::19:05:43 ===
     Supervisor: {<0.65.0>,player_sup}
     Context:    child_terminated
     Reason:     normal
     Offender:   [{pid,<0.68.0>},
                  {name,tcp_server},
                  {mfargs,{tcp_server,start_link,
                                      [#Port<0.942>,
                                       #Fun<player_factory.2.121151626>,
                                       <0.67.0>]}},
                  {restart_type,permanent},
                  {shutdown,2000},
                  {child_type,worker}]

=SUPERVISOR REPORT==== 20-Jun-2013::19:05:43 ===
     Supervisor: {<0.65.0>,player_sup}
     Context:    shutdown
     Reason:     reached_max_restart_intensity
     Offender:   [{pid,<0.68.0>},
                  {name,tcp_server},
                  {mfargs,{tcp_server,start_link,
                                      [#Port<0.942>,
                                       #Fun<player_factory.2.121151626>,
                                       <0.67.0>]}},
                  {restart_type,permanent},
                  {shutdown,2000},
                  {child_type,worker}]
I guess the thing that threw me off is that Lager was flagging these as errors even though my code is working as I intended, while SASL doesn't indicate either way whether it's an error; it just presents this as additional supervisor information.
I don't know if you want to do anything with this, but if possible I would like to disable these supervisor reports. Is there a way of configuring Lager to do so? The error log is getting polluted with this kind of information, which in my case is not an error.
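For what it's worth, the closest knob I've found is Lager's error_logger_redirect setting, but it's a blunt instrument: setting it to false stops Lager from forwarding anything from error_logger, not just these supervisor reports. A sys.config sketch:

    %% sys.config: disables all of Lager's error_logger forwarding,
    %% not just supervisor reports, so use with care.
    [{lager, [
        {error_logger_redirect, false}
    ]}].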
I would also like to add that I don't believe setting up a supervisor with a restart strategy of {one_for_all, 0, 1} and a child with a restart type of permanent should produce errors when it shuts down. I'm very intentionally telling Erlang/OTP: if any of these children die, kill all the children and then the supervisor.
Yes, it isn't clear to me what to do with those kinds of messages. Normal and shutdown exit statuses should not be logged as errors, though.
My preference would be that they go in the debug category or something like it so they don't show up in info or error.
I have the same issue: a lot of false errors in the log.
I had the same issue, but I think it's not a bug in Lager; it's the way you start the child. If you set the restart type to transient (as recommended for workers that are expected to shut down normally), you get no Lager error and no SASL report.
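For example, a child spec along those lines (names and timeouts illustrative):

    %% Same worker as before, but transient: a normal exit is left
    %% alone, while an abnormal exit still triggers a restart.
    {my_worker,
     {my_worker, start_link, []},
     transient, 2000, worker, [my_worker]}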
If you used {one_for_all, 10, 1} instead of {one_for_all, 0, 1}, you would see the worker restarted even after a normal exit when its restart type is permanent.
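With a sketch like this (reusing the hypothetical names from above), a permanent child that exits normally is simply restarted instead of taking the tree down:

    init([]) ->
        %% up to 10 restarts within 1 second before the
        %% supervisor itself gives up and terminates
        {ok, {{one_for_all, 10, 1},
              [{my_worker,
                {my_worker, start_link, []},
                permanent, 2000, worker, [my_worker]}]}}.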
So I think it's OK to get an error message from Lager when a child that should be permanent terminates.
See http://learnyousomeerlang.com/supervisors
A permanent process should always be restarted, no matter what. The supervisors we implemented in our previous applications used this strategy only. This is usually used by vital, long-living processes (or services) running on your node.
On the other hand, a temporary process is a process that should never be restarted. They are for short-lived workers that are expected to fail and which have few bits of code who depend on them.
Transient processes are a bit of an in-between. They're meant to run until they terminate normally and then they won't be restarted. However, if they die of abnormal causes (exit reason is anything but normal), they're going to be restarted. This restart option is often used for workers that need to succeed at their task, but won't be used after they do so.
Suggested fix: if the exit reason is normal or shutdown, suppress the error (or route it as a debug/trace message).
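A minimal sketch of what that filter could look like, written as a standalone error_logger report handler; this is illustrative only, not Lager's actual internals:

    -module(quiet_sup_reports).
    -behaviour(gen_event).

    -export([init/1, handle_event/2, handle_call/2,
             handle_info/2, terminate/2, code_change/3]).

    init([]) ->
        {ok, []}.

    %% Supervisor reports arrive as error_report events tagged
    %% supervisor_report; Report is a proplist with the keys
    %% supervisor, errorContext, reason and offender.
    handle_event({error_report, _GL, {_Pid, supervisor_report, Report}}, State) ->
        case proplists:get_value(reason, Report) of
            normal   -> {ok, State};  %% expected exit: drop it
            shutdown -> {ok, State};  %% ordered shutdown: drop it
            _Other   ->
                io:format("supervisor error: ~p~n", [Report]),
                {ok, State}
        end;
    handle_event(_Event, State) ->
        {ok, State}.

    handle_call(_Request, State) ->
        {ok, ok, State}.

    handle_info(_Info, State) ->
        {ok, State}.

    terminate(_Arg, _State) ->
        ok.

    code_change(_OldVsn, State, _Extra) ->
        {ok, State}.

It would be installed with error_logger:add_report_handler(quiet_sup_reports).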