
How do I tie in an error tracker?

Open aaronjensen opened this issue 7 years ago • 7 comments

If a job crashes, I'd want to be notified via something like honeybadger or appsignal. Likewise if a job fails ultimately (no more retries).

Is there somewhere to hook into this, or will SASL error logging catch it?

aaronjensen avatar Nov 11 '16 02:11 aaronjensen

I still use my old https://github.com/joakimk/honeybadger which works, but that's far from ideal (and won't report errors with as much detail as the official client).

Toniq expects error trackers to pick up the Logger.error call made here: https://github.com/joakimk/toniq/blob/master/lib/toniq/job_runner.ex#L53

I think the official honeybadger client expects the entire thing to crash in order for it to report an error. That would require a bit of re-organizing of the code.

If someone wants to do that it would be great. I don't have the need yet myself.

Testing this should be fairly simple: make a tiny app, add the official honeybadger client and config for it, make a job fail, and ensure it reports correctly.

joakimk avatar Nov 19 '16 11:11 joakimk

Please try https://github.com/honeybadger-io/honeybadger-elixir with the use_logger option and report back if it works. If it does we could add that to the readme.
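
For anyone trying this, a minimal config sketch might look like the following. It assumes a standard Mix project and that the API key lives in an environment variable named HONEYBADGER_API_KEY (that name is an assumption here). Whether use_logger actually catches Toniq job failures is exactly what still needs confirming.

```elixir
# config/config.exs
import Config

config :honeybadger,
  # Assumption: the key is provided through this environment variable.
  api_key: System.get_env("HONEYBADGER_API_KEY"),
  # Ask the client to also report errors that go through Logger,
  # which is how Toniq surfaces failed jobs (Logger.error).
  use_logger: true
```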

joakimk avatar Nov 19 '16 11:11 joakimk

@joakimk we're using appsignal which does not yet have this option. See https://github.com/appsignal/appsignal-elixir/issues/38#issuecomment-265232790.

Would you be open to adding hooks for failures so that I could forward the error to appsignal? I would think that that'd require less reorganization than allowing the entire thing to crash.

aaronjensen avatar Dec 06 '16 18:12 aaronjensen

Actually, it looks like I may be able to tie into JobEvent. I'll give that a shot.

aaronjensen avatar Dec 06 '16 19:12 aaronjensen

Here's what I ended up doing. It'd be nice if the failed job event sent the error and the stacktrace.

defmodule ToniqErrorReporter do
  @moduledoc """
  Subscribes to toniq job events and reports failed jobs to appsignal
  """

  use GenServer
  require Logger
  alias Toniq.JobEvent

  def start_link do
    # Return the result as-is so a supervisor can handle {:error, reason}
    GenServer.start_link(__MODULE__, [], name: __MODULE__)
  end

  def init(_) do
    # Subscribe after init returns so startup is not blocked
    send(self(), :subscribe)
    {:ok, %{}}
  end

  def handle_info(:subscribe, state) do
    JobEvent.subscribe()

    {:noreply, state}
  end

  # Successful jobs need no reporting
  def handle_info({:finished, _job}, state), do: {:noreply, state}

  # The event only carries the job, so look up the stored error by job id
  def handle_info({:failed, %{id: id, worker: worker} = job}, state) do
    Toniq.failed_jobs()
    |> Enum.find(&(&1.id == id))
    |> case do
         %{error: error} ->
           Appsignal.send_error(error, "Job Failed (#{inspect worker})", nil, job)
         nil ->
           Appsignal.send_error(
             %RuntimeError{message: "Unknown error, could not find job in Toniq.failed_jobs"},
             "Job Failed (#{inspect worker})",
             nil,
             job)
       end

    {:noreply, state}
  end
end
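
To run the reporter, it needs to be started somewhere, typically under the application's supervisor. A rough sketch, assuming the module above is compiled into the same project and a recent Elixir version that accepts child spec maps. The explicit map is used because start_link here takes no arguments, while the default child spec generated by use GenServer would call start_link/1.

```elixir
children = [
  # Explicit child spec: start ToniqErrorReporter via its zero-arity start_link
  %{
    id: ToniqErrorReporter,
    start: {ToniqErrorReporter, :start_link, []}
  }
]

# Restart the reporter on its own if it crashes
{:ok, _supervisor} = Supervisor.start_link(children, strategy: :one_for_one)
```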

aaronjensen avatar Dec 06 '16 19:12 aaronjensen

Passing nil in as stacktrace is fine, as it will default to System.stacktrace(). In Erlang, the stacktrace is always the stacktrace of the last error caught.

arjan avatar Dec 06 '16 19:12 arjan

@arjan this happens in a different process, so I don't think we can get the original stacktrace from here.
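
To illustrate the point outside of Toniq: a stacktrace belongs to the process where the error was rescued, so it has to be captured there and sent along explicitly. The sketch below is not Toniq code. StacktraceDemo and run_in_other_process are made-up names, and it assumes Elixir 1.7 or later for `__STACKTRACE__`.

```elixir
defmodule StacktraceDemo do
  # Runs `fun` in a separate process and returns what that process reports back.
  def run_in_other_process(fun) do
    parent = self()
    ref = make_ref()

    spawn(fn ->
      try do
        fun.()
        send(parent, {ref, :finished})
      rescue
        error ->
          # __STACKTRACE__ is only available here, inside the failing process,
          # so it is captured and included in the message.
          send(parent, {ref, {:failed, error, __STACKTRACE__}})
      end
    end)

    receive do
      {^ref, result} -> result
    after
      1_000 -> :timeout
    end
  end
end

case StacktraceDemo.run_in_other_process(fn -> raise "boom" end) do
  {:failed, error, stacktrace} ->
    # The receiving process can now format or forward the original stacktrace
    IO.puts(Exception.format(:error, error, stacktrace))

  other ->
    IO.inspect(other)
end
```

A failed job event that carried the error and stacktrace in this way would let a subscriber forward both to the tracker.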

aaronjensen avatar Dec 06 '16 20:12 aaronjensen

ok

arjan avatar Dec 06 '16 20:12 arjan