Error handling strategy
I am looking for ideas on handling errors in a user-friendly manner. My main requirement is to distinguish between transient errors (which may be recoverable) and hard errors (which require user action). Here is an example: I was watching a k8s cluster when the secrets were changed, and my code got a :badmatch error (shown below) at https://github.com/obmarg/kazan/blob/master/lib/kazan/watcher.ex#L135.
    {:error,
     {{:badmatch,
       {:error,
        {:http_error, 401,
         %{
           "apiVersion" => "v1",
           "code" => 401,
           "kind" => "Status",
           "message" => "Unauthorized",
           "metadata" => %{},
           "reason" => "Unauthorized",
           "status" => "Failure"
         }}}},
      [
        {Kazan.Watcher, :init, 1, [file: 'lib/kazan/watcher.ex', line: 135]},
        {:gen_server, :init_it, 2, [file: 'gen_server.erl', line: 374]},
        {:gen_server, :init_it, 6, [file: 'gen_server.erl', line: 342]},
        {:proc_lib, :init_p_do_apply, 3, [file: 'proc_lib.erl', line: 249]}
      ]}}
I need to get at the root cause of the error to know whether it is recoverable. I would prefer not to unwrap the nested error by hand, since matching on the :badmatch wrapper and stack trace would be brittle. Could this be an {:ok, _} or {:error, _} response instead?
@obmarg Any thoughts on this?
Hi @MonadicT, sorry for the delay in getting back to you about this. The past month has been very busy at work and I've not had much spare time to focus on open source work.
I'd be happy to accept a PR that updates Watcher.init to return an error in this case, which might be easier for you to pattern match when calling start_link.
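To make the suggestion concrete, here is a minimal sketch of the pattern, not Kazan's actual code. `SketchWatcher`, `ErrorClass`, and `fetch_fun` are hypothetical names, and the rule for which HTTP statuses count as transient is an assumption made for illustration. The idea is that init returns `{:stop, reason}` rather than crashing on a failed match, so the caller of start_link gets a plain `{:error, reason}` to match on.

```elixir
defmodule SketchWatcher do
  # Hypothetical stand-in for a watcher process; not Kazan's implementation.
  use GenServer

  def start_link(fetch_fun) when is_function(fetch_fun, 0) do
    GenServer.start_link(__MODULE__, fetch_fun)
  end

  @impl true
  def init(fetch_fun) do
    # `fetch_fun` stands in for the initial API request made during init.
    case fetch_fun.() do
      {:ok, state} ->
        {:ok, state}

      {:error, reason} ->
        # {:stop, reason} makes start_link return {:error, reason}
        # instead of an exit wrapped in {:badmatch, ...} plus a stack trace.
        {:stop, reason}
    end
  end
end

defmodule ErrorClass do
  # Assumed policy: 429 and 5xx may clear up on retry; other HTTP errors
  # (such as 401) need user action. Adjust to your own requirements.
  def classify({:http_error, status, _body}) when status == 429 or status >= 500,
    do: :transient

  def classify({:http_error, _status, _body}), do: :hard
  def classify(_other), do: :unknown
end

# The caller is linked to the watcher, so it traps exits here; under a
# Supervisor this is handled for you.
Process.flag(:trap_exit, true)

unauthorized = fn -> {:error, {:http_error, 401, %{"reason" => "Unauthorized"}}} end
{:error, reason} = SketchWatcher.start_link(unauthorized)
classification = ErrorClass.classify(reason)
```

With this shape, the caller matches on `{:http_error, 401, _}` directly and never has to dig through the stack trace.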
Another option might be to use a Supervisor to start the watcher? That way you could rely on Supervisor restarts to handle any errors - if the error is transient then it won't happen on restart, if it's not transient then eventually the Supervisor will give up and you can handle that at a higher level in the Supervision tree.
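A rough sketch of the Supervisor approach, under stated assumptions: `FlakyWorker` is a hypothetical worker that always fails shortly after starting (simulating a non-transient error), and the `max_restarts` and `max_seconds` values are arbitrary. Note that restarts only apply after a child has started successfully; if a child's init fails while the Supervisor itself is starting, `Supervisor.start_link` returns an error immediately.

```elixir
defmodule FlakyWorker do
  # Hypothetical worker that simulates a hard (non-transient) failure.
  use GenServer

  def start_link(_arg), do: GenServer.start_link(__MODULE__, :ok)

  @impl true
  def init(:ok) do
    # Start successfully, then fail on the first message.
    send(self(), :fail)
    {:ok, nil}
  end

  @impl true
  def handle_info(:fail, state), do: {:stop, :unauthorized, state}
end

# Trap exits so this process can observe the Supervisor giving up
# instead of being taken down with it.
Process.flag(:trap_exit, true)

{:ok, sup} =
  Supervisor.start_link([FlakyWorker],
    strategy: :one_for_one,
    max_restarts: 3,
    max_seconds: 5
  )

result =
  receive do
    # Once the restart limit is exceeded, the Supervisor exits and the
    # failure can be handled at this higher level.
    {:EXIT, ^sup, reason} -> {:gave_up, reason}
  after
    5_000 -> :still_running
  end
```

A transient failure would stop recurring before the restart limit is hit, so the Supervisor would stay up; a persistent one exhausts the limit and surfaces to the parent, which can then report it to the user.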
No worries. I do have a supervisor for the watcher right now but patching Watcher.init would be a better idea IMHO. I will send you a PR soon.