Wrapping sensitive data in Mint
According to the EEF security guidance, one of the best ways to protect sensitive data, such as security-related values (tokens, passwords) or PII, from accidental leakage is to wrap it in a closure. Other approaches, such as flagging the whole process as sensitive or post-processing stacktraces, have significant downsides and limited effect.
What if Mint allowed passing some fields, such as headers or the path (URLs may also contain sensitive info when we have to deal with 3rd-party services outside of our control), as closures? The closure could return a {redacted, raw} pair (just an example; other wrappers could be used, but the raw value must be enclosed). Then:
- Mint would use the raw value only when sending the request to the socket
- downstream libraries (Finch, Req and others) would obviously need to become aware of this feature, and use it accordingly
- logging/telemetry libraries would also need to adapt to it, and use redacted value everywhere where they previously used the raw value
What I also like about this approach is that all involved libraries don't need to know what kind of value it is and how to redact it. The caller provides both versions so no additional abstractions are needed.
Maybe something like this?
%Mint.SensitiveValue{
  redacted: "...",
  raw: fn -> "..." end
}
Looks like we could even have it as a generic abstraction in Elixir for all libraries dealing with sensitive data. But Mint could start using this tomorrow without waiting for an ecosystem-wide solution.
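To make the idea concrete, here is a minimal sketch of such a wrapper. The module name SensitiveValueSketch and the functions new/2, unwrap/1 and for_logs/1 are hypothetical and exist only for illustration; nothing like this is part of Mint today.

```elixir
defmodule SensitiveValueSketch do
  # Hypothetical stand-in for the proposed %Mint.SensitiveValue{}; not part of Mint.
  @enforce_keys [:redacted, :raw]
  defstruct [:redacted, :raw]

  # The caller supplies both forms; the raw one is enclosed in a zero-arity closure.
  def new(raw, redacted) when is_binary(raw) and is_binary(redacted) do
    %__MODULE__{redacted: redacted, raw: fn -> raw end}
  end

  # Only the code that writes to the socket would call this.
  def unwrap(%__MODULE__{raw: raw}) when is_function(raw, 0), do: raw.()

  # Logging and telemetry code would call this instead.
  def for_logs(%__MODULE__{redacted: redacted}), do: redacted
end

token = SensitiveValueSketch.new("Bearer abc123", "Bearer ***")
```

Because the raw value lives inside a closure, inspecting the struct prints the function reference rather than the secret, while for_logs/1 returns whatever redacted form the caller chose.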
If anyone is interested, here’s a similar discussion in Req: https://github.com/wojtekmach/req/issues/461
Nice discussion, thank you. I think we should do this in Mint if possible, and bubble it up to Finch/Req. I think the idea of a struct that
- Implements String.Chars, returning the original wrapped value
- Implements Inspect and returns the redacted value
would already be a really good start? Then you'd just use it in place of strings:
Mint.HTTP.request(conn, "GET", "/user/:id", [{"Authorization", Mint.redact("Bearer foo")}])
Mint.redact("Foo")
#=> %Mint.RedactedString{value: "Foo"} (which implements protocols mentioned above)
Thoughts @wojtekmach @a3kov?
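A rough sketch of what such a struct with the two protocol implementations could look like. RedactedStringSketch is a hypothetical name used for illustration, not an existing Mint module, and the placeholder text is an arbitrary choice.

```elixir
defmodule RedactedStringSketch do
  # Hypothetical stand-in for the proposed %Mint.RedactedString{}; not part of Mint.
  defstruct [:value]

  def redact(value) when is_binary(value), do: %__MODULE__{value: value}
end

defimpl String.Chars, for: RedactedStringSketch do
  # to_string/1 and string interpolation return the original value.
  def to_string(%{value: value}), do: value
end

defimpl Inspect, for: RedactedStringSketch do
  # inspect/1 and IEx show a fixed placeholder instead of the value.
  def inspect(_redacted, _opts), do: "#RedactedStringSketch<redacted>"
end

header = {"authorization", RedactedStringSketch.redact("Bearer foo")}
{_name, value} = header

wire = to_string(value)   # what would be written to the socket
logged = inspect(header)  # what would show up in logs
```

One trade-off of this shape: the value is hidden from inspect/1 only. String interpolation and direct access to the :value field still yield the original string.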
I don't know if I can explain it, but my gut reaction is to do it either at a higher level, in Req, or at a lower level, in Elixir. This is obviously a separate conversation, but I believe the latter would have maximum community leverage and interoperability on one hand, and a higher barrier to entry and a much slower rollout on the other. I haven't thought it through, but I keep wondering about a System.get_env_secret or something like that, to mark the secret as such when it enters the system for the first time. Otherwise, it'd be awkward to use Mint.redact/1 directly in a library like Swoosh or Sentry that supports other HTTP clients; however, I suppose they'd commonly be configured using secrets as plain values and could internally call Mint.redact before passing them down. On the other hand, if it's only at the higher level, say in Req, then the plain value passed down the stack could sneak into logs, telemetry events, etc.
As another data point, Ecto redacts fields when inspected but otherwise keeps them as is. See https://hexdocs.pm/ecto/Ecto.Schema.html#module-redacting-fields. Perhaps it's done like that to avoid overhead but more likely to stay within "Ecto types" like :string, defined on the schema.
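For comparison, plain Elixir has a similar inspect-only mechanism: deriving Inspect with the :except option. A small sketch (CredentialSketch is a hypothetical struct, not taken from Ecto or Mint):

```elixir
defmodule CredentialSketch do
  # :token is omitted from the inspected form only; the field itself is unchanged.
  @derive {Inspect, except: [:token]}
  defstruct [:service, :token]
end

cred = struct(CredentialSketch, service: "billing", token: "s3cr3t")

shown = inspect(cred)  # does not contain the token
raw = cred.token       # still the plain value
```

As with Ecto's redacted fields, this only affects how the struct is printed. Anything that reads the field directly still gets the plain value.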
In any case, if Mint supports this, I'll be glad to help with Finch and Req support! I think a %Mint.RedactedString{} with protocol implementations is a very practical solution for this.
I think it's a practical solution that can evolve into a language feature eventually (Mint.redact would start returning the language struct instead of the Mint-specific one).
@whatyouhide Usually when you deal with a secret, you want it to be wrapped as soon as it enters your app, for example when you load it from the database. So in the case of Mint you wouldn't want to unwrap it and then ask Mint to rewrap it; you would rather pass the already wrapped value. That is, of course, assuming that all libraries wrap it in the same way: with a closure that only returns the value. I think we would want to support both passing the wrapped value and asking Mint to wrap it.
I agree that having this wrapper provided by Elixir would be the best way forward for the whole ecosystem. But adding it to the language and reaching the state where it's generally available could take a few years (not everyone can upgrade on a whim). And if a third-party library provided it, it would be much harder to reach wide adoption compared to "the official" solution from the language. If, OTOH, this feature is provided by Mint it can be adopted very fast, and serve as a temporary solution in the meantime.
Usually when you deal with a secret, you want it to be wrapped as soon as it enters your app, for example when you load it from the database. So in the case of Mint you wouldn't want to unwrap it and then ask Mint to rewrap it; you would rather pass the already wrapped value.
You could call Mint.redact/1 as soon as you read the secret though, no?
In my case, I store a security token in a custom encrypted Ecto field, and the field's load callback wraps the value in a closure. I could probably call Mint.redact in the load callback too, as I have no use for the token other than to send it in a request. But there can be different situations. Maybe this unwrap-rewrap tango would be optimised by the JIT anyway?
@whatyouhide One more note: your suggested Mint.redact/1 function implies Mint knows how to redact the value. In the original post I suggested that the caller provides both the raw and the redacted value. This is an important detail, as Mint can't possibly know how to redact different types of values, and there are many different ways you could do that.
Perhaps as an optimization we could provide the option to store a redact/1 function next to the wrapped raw value. This way we wouldn't need to waste memory on the redacted value all the time. Security tokens and passwords are one thing, but in some cases there could be bigger sensitive payloads.
So we can offer both arguments - pre-redacted value or redact/1 function, and the caller can choose which one is more appropriate to the situation.
If we define a Mint.Redactable protocol we could implement it for strings, functions, and %Mint.SensitiveValue{} and then call it on basically everything we think could be sensitive—headers, payloads, URIs, whatever. This way you can naturally pass fn -> secret_value end to stuff. You could also use Mint.redact/1 still as described above. This is however based on Mint knowing how to redact a value, you're right about that.
We could fix that by accepting a function like
Mint.redact(value, redact_fn)
and store that function alongside the struct. The protocol could define redact(value, fun) and basically default the function to fn _value -> "<redacted>" end or something.
Thoughts?
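A minimal sketch of the redact/2 idea with a default redaction function, assuming the value is kept in a closure as discussed earlier in the thread. RedactSketch and its raw/1 and redacted/1 helpers are hypothetical names; Mint has no such functions today.

```elixir
defmodule RedactSketch do
  # Hypothetical module illustrating Mint.redact(value, redact_fn); not part of Mint.
  defstruct [:value, :redact_fun]

  # The redaction function is stored next to the enclosed raw value.
  # When the caller gives none, a fixed placeholder is used.
  def redact(value, redact_fun \\ &default_redact/1) when is_function(redact_fun, 1) do
    %__MODULE__{value: fn -> value end, redact_fun: redact_fun}
  end

  def default_redact(_value), do: "<redacted>"

  # Raw value, for the code path that sends the request.
  def raw(%__MODULE__{value: value}), do: value.()

  # Redacted value, computed on demand rather than stored.
  def redacted(%__MODULE__{value: value, redact_fun: fun}), do: fun.(value.())
end

plain = RedactSketch.redact("secretpassword")

masked =
  RedactSketch.redact("4111111111111111", fn card ->
    "**** " <> String.slice(card, -4, 4)
  end)
```

Here RedactSketch.redacted(plain) returns the default placeholder, while RedactSketch.redacted(masked) keeps the last four digits, which is the kind of caller-specific redaction a fixed placeholder cannot express.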
I'm not sure about the protocol. If there's a struct for sensitive values, we already know that it must be redacted and that everything else should not be. With the struct approach the caller tells you what is sensitive by wrapping the value, and the struct doesn't care about the type of the value inside. With Mint.redact(value, redact_fn) the function would simply wrap it in the struct. The protocol seems like an unnecessary step. What are you trying to achieve here?
The redact/1 function is definitely the way to go. A library simply replacing values with "[redacted]" is not good enough IMO.
If you have
# In the Mint module
def redact(value) do
  fn -> value end
end
then you don't have the benefit of "custom redacting" that you talked about. The only thing you can do is get the value (fun.()) or inspect/print it as #Function<...>.
Let's say you have a request header. The value of the header parameter will contain either a ready to use value, or the "sensitive value" struct. In Mint you know what values can be sensitive (and it would be documented), and headers are in that list, so you pattern match on the sensitive value struct, with the struct containing the redact function. The redact function accepts the raw value which must be extracted by calling the closure - both enclosed raw value and the redact function are inside the struct. The caller can be a direct user of Mint, a downstream library, etc. so they call Mint to wrap the value (or wrap it manually) and pass the struct instead of the value.
%Mint.SensitiveValue{
  redact_fun: fn _x -> "[redacted]" end,
  value: fn -> "secretpassword" end
}
At least this is how it can work w.r.t. requests. Now, if someone wanted to redact responses, I have no idea, as I haven't thought about that, and it may require a different approach.
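The request-side flow described above could be sketched roughly like this. HeaderSketch, its nested SensitiveValue struct and all function names are hypothetical; this is not how Mint encodes headers today, only an illustration of pattern matching on the wrapper.

```elixir
defmodule HeaderSketch do
  # Hypothetical struct with the same fields as the example above; not part of Mint.
  defmodule SensitiveValue do
    defstruct [:value, :redact_fun]
  end

  def wrap(raw, redact_fun) when is_function(redact_fun, 1) do
    %SensitiveValue{value: fn -> raw end, redact_fun: redact_fun}
  end

  # Value as it would be written to the socket.
  def wire_value(%SensitiveValue{value: value}) when is_function(value, 0), do: value.()
  def wire_value(value) when is_binary(value), do: value

  # Value as it would appear in logs, telemetry or error messages.
  def loggable_value(%SensitiveValue{value: value, redact_fun: fun}), do: fun.(value.())
  def loggable_value(value) when is_binary(value), do: value

  # Unwrapping happens only while building the iodata for the request.
  def encode(headers) do
    Enum.map(headers, fn {name, value} -> [name, ": ", wire_value(value), "\r\n"] end)
  end

  def loggable(headers) do
    Enum.map(headers, fn {name, value} -> {name, loggable_value(value)} end)
  end
end

headers = [
  {"accept", "application/json"},
  {"authorization", HeaderSketch.wrap("secretpassword", fn _raw -> "[redacted]" end)}
]

wire = IO.iodata_to_binary(HeaderSketch.encode(headers))
logged = HeaderSketch.loggable(headers)
```

Plain binaries pass through untouched, so callers that don't wrap anything would see no change in behaviour.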
A special case is iolist/iodata, where we might want to redact piece by piece instead of the whole binary. And since redaction may require the context (you can't redact half of a word, you need the whole word to detect it), maybe it will involve a special encoding.
Personally, I don't think the solution belongs to Mint. It should rather belong to the data structure holding the headers. By imposing something like Mint.redact we would also break the abstraction, because now everyone upstream (Req, Tesla, Finch) needs to know Mint exists. I also think any approach of wrapping the headers will be limited, because Mint will unwrap it at some point, and it can then still leak.
Therefore, given Mint does not hold the header values, all it needs to do is make sure it doesn't include headers and logs in exceptions, and then let upstream take care of filtering those when logging them as part of their data structures.
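As an illustration of the upstream filtering mentioned here, a client library could scrub header values by name before logging its own request structure. HeaderFilterSketch and its default list of header names are assumptions made for this sketch, not taken from Finch, Req or Tesla.

```elixir
defmodule HeaderFilterSketch do
  # Assumed default list of sensitive header names; a real library would make this configurable.
  @sensitive ["authorization", "proxy-authorization", "cookie"]

  # Replaces the values of sensitive headers; header names are compared case-insensitively.
  def redact_headers(headers, sensitive \\ @sensitive) do
    Enum.map(headers, fn {name, value} ->
      if String.downcase(name) in sensitive do
        {name, "[redacted]"}
      else
        {name, value}
      end
    end)
  end
end

safe =
  HeaderFilterSketch.redact_headers([
    {"Authorization", "Bearer foo"},
    {"accept", "application/json"}
  ])
```

This keeps the raw values out of log output without any wrapper type, at the cost of relying on a name list rather than on the caller marking what is sensitive.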
Personally, I don't think the solution belongs to Mint. It should rather belong to the data structure holding the headers.
But Mint owns this data since it has to send it over the wire. Below is TCP and it has 0 knowledge about HTTP. So it makes sense to use a wrapper owned by the end-consumer.
By imposing something like Mint.redact we would also break the abstraction, because now everyone upstream (Req, Tesla, Finch) needs to know Mint exists.
But they kinda do know. When you use any of these packages you are already aware of Mint anyway. It can be part of the contract.
Therefore, given Mint does not hold the header values, all it needs to do is make sure it doesn't include headers and logs in exceptions, and then let upstream take care of filtering those when logging them as part of their data structures.
This is not only about headers. What if the URL or message body contains a sensitive value? And this is not only about exceptions; it can also be very useful as a generic abstraction for logging/telemetry libraries.
But they kinda do know. When you use any of these packages you are already aware of Mint anyway.
They do it internally but they don't expose it to users of their own APIs. Some of them, such as Tesla, even explicitly support multiple adapters.
This is not only about headers. What if the URL or message body contains a sensitive value?
Mint receives this data but it doesn't really hold most of it. This is what Mint actually holds for HTTP 1:
https://github.com/elixir-mint/mint/blob/main/lib/mint/http1.ex#L86-L103
What you are actually asking is for Mint to wrap data it receives as argument, which feels backwards to me.
Of course, this is just my opinion, you can tread whatever road you prefer. But given your original assumption already imposes that "downstream libraries (Finch, Req and others) would obviously need to become aware of this feature, and use it accordingly", it is clear changes to Mint won't improve those libraries unless those libraries explicitly use whatever abstraction is defined here. Therefore I would solve it downstream first and then change Mint only to close remaining gaps (if any). Otherwise we risk adding abstractions that have no use downstream.
What you are actually asking is for Mint to wrap data it receives as argument, which feels backwards to me.
Actually, I didn't ask Mint to wrap it. I only suggested a wrapper, and the wrapper must exist somewhere. If it's not in Mint, I don't mind other locations, but Mint could be the pioneer. For me it makes much more sense to have it in the lowest-level library, because otherwise you have to convince all the high-level libraries to implement the same abstraction together; good luck with that. If it's implemented in Mint, they will have no other choice than to support it.
Otherwise we risk adding abstractions that have no use downstream.
That's why we need a generic abstraction in Elixir 😛 Because otherwise it may be impossible to reach wide adoption and everyone would just wait for others and point fingers.