shoehorn icon indicating copy to clipboard operation
shoehorn copied to clipboard

Replace error_logger handler with logger handler

Open LostKobrakai opened this issue 2 years ago • 12 comments

This is a first naive port based on the docs here: https://www.erlang.org/doc/apps/kernel/logger_chapter.html#example--implement-a-handler

I've opted for doing the pattern matching in the process generating the logs, so there's less message passing involved. I expect that to be fast enough to not slow down the application_controller.

I've not yet run tests with it, as I currently get this error on the ShoehornTest case:

Protocol 'inet_tcp': the name simple_app@Lillith seems to be in use by another Erlang node
--rpc-eval : RPC failed with reason :nodedown
…

Closes #79

LostKobrakai avatar Apr 08 '22 19:04 LostKobrakai

I just ran the tests:

➜  shoehorn git:(LostKobrakai-pr/migrate-to-logger) mix test
........

Finished in 9.7 seconds (0.07s async, 9.6s sync)
8 tests, 0 failures

Randomized with seed 841775

I'll need to look at the code in detail later, but I wanted to let you know that it passes the tests on my machine.

fhunleth avatar Apr 08 '22 19:04 fhunleth

Just got a bit of feedback on the erlang slack: Seems like started processes are indeed not really "supervised" in the new logger api. We could start the handler under the shoehorn supervisor, but it's a bit of a question how we want to fail in case this handler process crashes unexpectedly.

Also I'm wondering how concerned this code should be about overload. I kinda feel like applications starting/crashing at such rates is in itself already a bit of a problem.

LostKobrakai avatar Apr 08 '22 19:04 LostKobrakai

Hi @LostKobrakai,

I ran into the problem with catch OTP applications crashing today and revisited this PR. I also ran into the issue that the Elixir Logger globally turns off application start reports if you don't have it configured with handle_sasl_reports: true. This is due to it setting the Erlang logger's primary configuration to filter out sasl reports. I forgot whether I told you, but I've never used the Shoehorn feature to restart OTP applications when they crash. I'm not sure how many people use them. I think that you and your coworkers have been the only ones who have talked about it recently. All this is to say that I'm not sure this is worth the effort to fix, and perhaps the better solution is to promote alternative solutions. This probably means sunset Shoehorn completely and move the OTP release script code to :nerves.

fhunleth avatar Jan 22 '24 02:01 fhunleth

Tbh I don't think the restart feature is the most important – even though we do use it. Much more important is getting to know about issues and also not bricking the device / have it stay in a state where it at best can be remote debugged.

LostKobrakai avatar Jan 22 '24 19:01 LostKobrakai

Totally agree. Thanks for letting me know what parts are important to you.

fhunleth avatar Jan 23 '24 14:01 fhunleth

I ran into the problem with catch OTP applications crashing today and revisited this PR. I also ran into the issue that the Elixir Logger globally turns off application start reports if you don't have it configured with handle_sasl_reports: true. This is due to it setting the Erlang logger's primary configuration to filter out sasl reports.

Looking at this a bit more I don't actually think this matters. The :handle_sasl_reports config changes how the elixir logger translator works, but this MR does add a custom log handler, which should be directly feed by :logger. I'm still with you that this is likely a feature barely used, but I would expect it to still work.

LostKobrakai avatar Feb 07 '24 16:02 LostKobrakai

Did you actually try it? This didn't work when I tried it without handle_sasl_reports: true.

fhunleth avatar Feb 07 '24 16:02 fhunleth

Nope, I just looked at what elixir actually does with the setting and it certainly doesn't look like it would affect additional logger handlers. Though might also be different on earlier versions of elixir.

LostKobrakai avatar Feb 07 '24 16:02 LostKobrakai

Got it. I tried Elixir 1.16.0, I believe, so not too old.

fhunleth avatar Feb 07 '24 16:02 fhunleth

I believe it was one of these lines, since they specify :stop when :sasl is false:

https://github.com/elixir-lang/elixir/blob/main/lib/logger/lib/logger/utils.ex#L8-L9

fhunleth avatar Feb 07 '24 16:02 fhunleth

Yeah. Maybe I'm misunderstanding what this translator does. Maybe it does filter logs for all handlers, not just the primary one.

LostKobrakai avatar Feb 07 '24 16:02 LostKobrakai

Having had a look at it today: You're correct. Primary filters apply to all handlers not just the primary handler.

Logger calls the primary filters first, and if they all pass, it calls the handler filters for each handler. Logger calls the handler callback only if all filters attached to the handler in question also pass.

https://www.erlang.org/doc/apps/kernel/logger_chapter.html#filters

LostKobrakai avatar Jul 22 '24 20:07 LostKobrakai