opentelemetry-erlang-contrib

A way to scrub telemetry_ecto events of sensitive data

Open lukad opened this issue 10 months ago • 8 comments

We recently noticed that the Ecto telemetry events emitted for Postgres errors contain the full executed query, including the actual values passed to the query or, for example, the actual values that conflict with a constraint.

Here is a screenshot of an example: ERROR 23505 unique_violation

This is a problem because we don't want to leak sensitive data such as personally identifying information to whatever system consumes these events.

Describe the solution you'd like

I think a built-in way to transform events, which would allow library users to implement data scrubbing, would be a good addition to telemetry_ecto.

Our current workaround is our own module, which attaches an event handler for repo events. The handler forwards all events to OpentelemetryEcto.handle_event but first scrubs errors of sensitive data.

defmodule MyApp.OpentelemetryEcto do
  def setup(event_prefix, config \\ []) do
    event = event_prefix ++ [:query]
    :telemetry.attach({__MODULE__, event}, event, &__MODULE__.handle_event/4, config)
  end

  def handle_event(event, measurements, %{result: {:error, error}} = data, config) do
    error = scrub_error(error)
    OpentelemetryEcto.handle_event(event, measurements, %{data | result: {:error, error}}, config)
  end

  def handle_event(event, measurements, data, config) do
    OpentelemetryEcto.handle_event(event, measurements, data, config)
  end

  defp scrub_error(%Postgrex.Error{} = error) do
    ...
  end
end
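For illustration only, here is a minimal sketch of what a scrubbing function in the spirit of scrub_error/1 might do. It is not the code elided above. It assumes the error carries a `postgres` map whose `:detail` and `:message` entries may embed row values, as a %Postgrex.Error{} does for constraint violations. The module name, the list of keys, and the placeholder are hypothetical, and a plain map can stand in for the struct so the snippet runs without Postgrex.

```elixir
defmodule MyApp.ErrorScrubber do
  # Hypothetical list of keys in the `postgres` map that may embed row values.
  @sensitive_keys [:detail, :message]
  @placeholder "[REDACTED]"

  # Replace the sensitive entries that are present and leave everything else
  # (error code, constraint name, table, ...) untouched.
  def scrub(%{postgres: postgres} = error) when is_map(postgres) do
    scrubbed =
      Map.new(postgres, fn {key, value} ->
        if key in @sensitive_keys, do: {key, @placeholder}, else: {key, value}
      end)

    %{error | postgres: scrubbed}
  end

  # Errors without a `postgres` map pass through unchanged.
  def scrub(error), do: error
end
```

Because this works on the structured error rather than on a rendered string, the error code and constraint name stay available for debugging while the values are removed.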

Ideally, OpentelemetryEcto would provide a way to specify a module or function that is called for every handled event to transform it.

The most basic way to achieve this could be OpentelemetryEcto.setup([:my_app, :repo], transform: fn _event -> ... end).
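A rough sketch of how a handler could honor such an option, assuming the function receives the event metadata and returns the transformed metadata. Both the `:transform` and `:forward` options are hypothetical; `:forward` merely stands in for OpentelemetryEcto.handle_event/4 so the snippet runs without the library.

```elixir
defmodule MyApp.TransformingHandler do
  # Hypothetical handler illustrating the proposed `:transform` option.
  # `:forward` is a stand-in for the real downstream handler.
  def handle_event(event, measurements, metadata, config) do
    # Default to the identity function so events pass through untouched
    # when no transform is configured.
    transform = Keyword.get(config, :transform, &Function.identity/1)
    forward = Keyword.fetch!(config, :forward)

    forward.(event, measurements, transform.(metadata), config)
  end
end
```

The config would be handed over as the last argument of :telemetry.attach, in the same way as in the workaround above.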

Describe alternatives you've considered

The workaround above works fine, but I believe it's worth having a built-in and documented way of doing this.

What do you think? I'd be happy to contribute a MR.

lukad avatar Jan 09 '25 13:01 lukad

What happens if you set redact: true on the schema field?

bryannaegele avatar Jan 09 '25 18:01 bryannaegele

We already have redact: true on all ecto schema fields we don't want to log. It has no effect here because the errors we're scrubbing are coming from postgrex which doesn't know anything about ecto schemas.

lukad avatar Jan 09 '25 20:01 lukad

You would normally want to scrub it with a processor: https://opentelemetry.io/docs/security/config-best-practices/#scrub-sensitive-data

danschultzer avatar Jan 13 '25 19:01 danschultzer

I see. I think there is value in scrubbing the data earlier, the way I'm doing it, because it allows me to scrub the %Postgrex.Error{} structs instead of having to do it in a much less reliable string-based way. If you think this should not be a direct concern of telemetry_ecto, then please feel free to close the issue :)

lukad avatar Jan 15 '25 10:01 lukad

And since it gets turned into a string you can't even do it with that processor.

tsloughter avatar Jan 15 '25 15:01 tsloughter

Postgrex.Error would have to be flattened to attributes like postgrex.error.<...>.
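As a sketch of that idea, the function below turns the `postgres` map of an error into flat, string-keyed attributes under the `postgrex.error.` prefix. The module and function names are hypothetical, the values are assumed to be scalars (atoms, strings, numbers), and nothing here reflects how telemetry_ecto currently records errors.

```elixir
defmodule MyApp.PostgrexAttributes do
  @prefix "postgrex.error."

  # Flatten the `postgres` map into string-keyed attributes.
  # Entries with nil values are skipped; other values are stringified.
  def flatten(%{postgres: postgres}) when is_map(postgres) do
    for {key, value} <- postgres, not is_nil(value), into: %{} do
      {@prefix <> to_string(key), to_string(value)}
    end
  end

  # Errors without a `postgres` map yield no attributes.
  def flatten(_error), do: %{}
end
```

With separate attributes, a processor could redact a single field such as `postgrex.error.detail` instead of pattern-matching on one large string.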

tsloughter avatar Jan 15 '25 15:01 tsloughter

A collector isn't the right solution, since not everyone sends their data through one. I'll keep this in mind for the update.

bryannaegele avatar Jan 16 '25 19:01 bryannaegele

@lukad could you raise this in https://github.com/elixir-ecto/postgrex? We can look at adding something here, but it feels like a band-aid.

bryannaegele avatar Jan 19 '25 05:01 bryannaegele