sops
INI and DotEnv stores are not roundtrip-safe, and quoting is generally a problem for these formats
While looking at #784 I noticed that the situation for both the INI and DotEnv stores is a lot worse than I thought: since they both escape newlines as literal \ns, but do not escape backslashes themselves, they cannot distinguish a real newline from a literal \n (a backslash followed by n) in the plaintext, and both end up as newlines after decrypting.
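To make the problem concrete, here is a minimal sketch in Python (not the actual SOPS code, which is written in Go). The function names are made up for illustration; legacy_escape and legacy_unescape only mimic the behavior described above, and the safe variants show one possible way to make the escaping reversible by escaping backslashes as well.

```python
def legacy_escape(value: str) -> str:
    # Mimics the described behavior: a newline becomes backslash + "n",
    # while existing backslashes are left untouched.
    return value.replace("\n", "\\n")


def legacy_unescape(value: str) -> str:
    # Every backslash + "n" pair becomes a newline, whatever its origin.
    return value.replace("\\n", "\n")


def safe_escape(value: str) -> str:
    # Escape backslashes first, so an escaped newline stays unambiguous.
    return value.replace("\\", "\\\\").replace("\n", "\\n")


def safe_unescape(value: str) -> str:
    # Scan left to right; a backslash always consumes the next character.
    out = []
    i = 0
    while i < len(value):
        ch = value[i]
        if ch == "\\" and i + 1 < len(value):
            nxt = value[i + 1]
            out.append("\n" if nxt == "n" else nxt)
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


real_newline = "line1\nline2"   # contains an actual newline character
literal_seq = "line1\\nline2"   # contains a backslash followed by "n"

# With the legacy scheme both values decode to the same string.
print(legacy_unescape(legacy_escape(real_newline)) ==
      legacy_unescape(legacy_escape(literal_seq)))
```

With the legacy pair, the two inputs collapse into the same decoded value; with the safe pair, each input comes back unchanged. Switching to the safe pair is exactly the kind of format change that breaks existing files, though.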
DotEnv has its own set of problems; there used to be a PR which improved the parser and emitter, see #622, but that had to be reverted since it introduced a breaking change - see #706. The discussions in these PRs, especially the latter, also show that changes to the store formats are dangerous and should be avoided if possible.
This brings up the question of how to handle this mess. The only real fix is to make a breaking change. There are three places where behavior would break:
1. When loading an encrypted file written by an older version of sops. This can be handled since sops's version is included in the metadata (as sops.version). I think this is the very basic thing that any fix must do. This will complicate our code, though.
2. When emitting plaintext DotEnv or INI files. This can happen while decrypting DotEnv and INI files, but also when decrypting other files (--output-type parameter set to ini or dotenv). This is definitely a breaking change.
3. When encrypting a DotEnv or INI file which uses \n in the input, or some other form of quoting that's suddenly supported.
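For the first point, the version check could look roughly like the following Python sketch. Everything here is an assumption for illustration: the threshold version is a placeholder (no release has been chosen for such a change), the metadata is modeled as a plain nested dict, and the function names are invented.

```python
# Placeholder only: no actual release is associated with such a change.
HYPOTHETICAL_FIX_VERSION = (99, 0, 0)


def parse_version(text: str) -> tuple:
    # Turn "3.9.1" or "v3.9.1-rc1" into a tuple of ints for comparison.
    core = text.strip().lstrip("v").split("-", 1)[0]
    return tuple(int(part) for part in core.split("."))


def pick_escaping(metadata: dict) -> str:
    # Files without a recorded version are treated as written by an old sops.
    version = metadata.get("sops", {}).get("version")
    if version is None:
        return "legacy"
    if parse_version(version) < HYPOTHETICAL_FIX_VERSION:
        return "legacy"
    return "fixed"


print(pick_escaping({"sops": {"version": "3.9.1"}}))
```

This only covers loading encrypted files; it says nothing about which format to emit, which is where points 2 and 3 come in.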
Especially for 2. and 3. it is probably best if the behavior is configurable. The big question is how it can be configured. For encrypting, .sops.yaml could be used, or command line flags. For decryption, only command line flags and file metadata can be used.
What do folks think about this? I don't expect that we can find a good solution quickly, but we definitely have to start a more focussed discussion :)
CC @getsops/maintainers
A specific issue for INI quoting: #1597.
A relevant PR: #752.
The problem with DotEnv and INI is that there is no standardized file format of that name, and everyone has slightly different assumptions on how the format is defined. (This is even more true for INI than for DotEnv; it's just that for DotEnv the format that SOPS implements is rather far away from what everyone else is doing. I'm not saying this judgmentally, I'm just stating how it is. We have to deal with this somehow.)
I have one proposal that would allow us to make improvements going forward, at least on the unencrypted side: adding a "file format sub-ID" to the SOPS metadata. That way, when encrypting a DotEnv or INI file, the identifier for the format used at encryption time is written into the metadata. If the file is decrypted (and the same output store is used), the serializer is configured according to the identifier used (assuming the identifier is for the same store type). The default identifier for encrypted files not having one would be v1 (just an arbitrary pick, could also be an empty string). Then we can have ini-v2, which has some better quoting and maybe support for comments, and we can have dotenv-v2, which is closer to what everyone else means by DotEnv files (i.e. allow proper escaping, allow proper quote usage, etc.). When encrypting a file, we can also allow the user to specify the file format (so they can stick to the old definitions if they need that for compatibility reasons, for example for compatibility with older SOPS versions).
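A rough Python sketch of how such a sub-ID could be resolved at decryption time. The names are hypothetical, the set of known formats is just the examples from above, and falling back to v1 when the identifier belongs to a different store type is my assumption (the proposal only covers the case where the store type matches).

```python
# Hypothetical registry: which format versions each store understands.
KNOWN_FORMATS = {
    "ini": {"v1", "v2"},
    "dotenv": {"v1", "v2"},
}


def resolve_format(store_type: str, metadata_format=None) -> tuple:
    # No identifier in the metadata: treat the file as the original format.
    if not metadata_format:
        return (store_type, "v1")

    store, sep, version = metadata_format.rpartition("-")
    if not sep:
        # A bare identifier such as "v1" applies to the current store.
        store, version = store_type, metadata_format

    # Assumption: an identifier for another store type is ignored.
    if store != store_type:
        return (store_type, "v1")

    if version not in KNOWN_FORMATS.get(store, set()):
        raise ValueError(f"unknown format identifier: {metadata_format}")
    return (store, version)


print(resolve_format("ini", "ini-v2"))
```

Raising an error for an unknown identifier (say, a file written by a newer SOPS) seems safer than silently guessing, but that is also up for discussion.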
This leaves one problem: the encrypted files. These can contain some data unencrypted, which is not so great if the format isn't roundtrip safe.
Another idea for that: how about allowing a comment in the first lines (which may only be comments or empty) that identifies the file format options? Like # SOPS-format: ini-v2. This is easy to scan for and makes it possible to determine how to interpret the remainder of the file. When writing an encrypted file, this would be emitted as the very first comment, so that if another such comment comes from the unencrypted source data, it will be ignored. When reading the encrypted file back in, that comment will be removed. Obviously this will be a problem when combined with older SOPS versions.
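Scanning for such a comment could look like this Python sketch. The function names and the exact matching rules (only # comments, whitespace tolerated around the identifier) are assumptions for illustration, nothing here is implemented in SOPS.

```python
import re

# Matches a line such as "# SOPS-format: ini-v2" (after stripping whitespace).
HEADER_RE = re.compile(r"^#\s*SOPS-format:\s*(\S+)\s*$")


def add_format_header(body: str, format_id: str) -> str:
    # Emit the marker as the very first line of the encrypted file.
    return f"# SOPS-format: {format_id}\n{body}"


def split_format_header(text: str) -> tuple:
    # Look only at the leading run of comment or empty lines.
    lines = text.split("\n")
    for index, line in enumerate(lines):
        stripped = line.strip()
        match = HEADER_RE.match(stripped)
        if match:
            # Remove just the first marker; later ones are ordinary comments.
            del lines[index]
            return match.group(1), "\n".join(lines)
        if stripped and not stripped.startswith("#"):
            break  # first real content line, stop scanning
    return None, text


# A source file that happens to contain a comment looking like a marker.
source = "# SOPS-format: dotenv-v1\nKEY=value\n"
format_id, remainder = split_format_header(add_format_header(source, "ini-v2"))
print(format_id, remainder == source)
```

Since only the first marker is consumed, a look-alike comment from the source data survives the roundtrip. A file written by an older SOPS that contains such a look-alike comment would be misread, which is the compatibility problem mentioned above.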
This is far from polished, but at least might provide some input for a discussion.