transcrypt
transcrypt copied to clipboard
PROPOSAL: Add support for multiple passphrase "contexts" in one repo
Hi, thanks for your great work on transcrypt! We have found it really useful.
We are considering using transcrypt for a scenario it wasn't designed for, where we want to have multiple "contexts" of secret file within the one repository with a different passphrase for each one. Basically, we have some files in a repo that should be encrypted but available to user group A, and another set of files that should only be available to user group B.
This PR contains a very rough but working proof of concept with transcrypt updated to support this use case, where different security "contexts" can be specified by setting a CONTEXT
environment variable before running the script.
If you are at all interested in adding such a feature please let me know here, along with any thoughts about how you would prefer it to be built. We can work on improving the implementation if you think it might get included in the official version later?
The implementation here so far is a quick and dirty hack that duplicates all the git hook scripts for each named context, so that each one knows about its particular settings.
Despite being an awful hack it shows that the process works fairly well, with just those files for which the user has set the appropriate context & passphrase decrypted, and the rest left as they were.
Example usage:
- Choose a name for a non-default encryption "context" e.g. "super-secret"
- Init transcrypt in default and "super-secret" contexts:
$ transcrypt # follow prompts $ CONTEXT=-super-secret transcrypt # follow prompts, note leading dash (-)
- Set patterns in
.gitattributes
as usual for default decryption, and with the suffix "-super-secret" for that additional context:# Contents of .gitattributes secret.txt filter=crypt diff=crypt super-secret.txt filter=crypt-super-secret diff=crypt-super-secret
- Write your secret files,
git add
them, and commit. - Note the different passphrases for default and additional contexts
$ transcrypt --display $ CONTEXT=-super-secret transcrypt --display
- Clone repo elsewhere, and confirm that you can or cannot see file contents as you apply some combination of the init commands as printed by the display commands above.
Hey @jmurty! I'd certainly be open to coordinating with you to add this feature...you're actually not the first person to suggest it (see https://github.com/elasticdog/transcrypt/issues/25), so I'm sure it would be useful to others.
As far as the user experience, I wonder if using actual option flags to the script would make it more discoverable compared to introducing the new mechanism of environment variables. Using options would also allow us to add something like --list-contexts
, but I also like the idea of everything being backward compatible and using a default context without any extra effort for users that wouldn't need them. What do you think?
Note, I've recently pushed some automated reformatting commits (using shfmt), so it looks like you've got a few merge conflicts now with the current implementation.
Hi @elasticdog thanks for the feedback and the pointer to #25. Your quick summary there about required changes lined up pretty well with my experience: there were quite a few changes needed, all fairly small but spread out.
I agree that proper option flags is the right way to go. The code in this PR so far is full of quick and dirty hacks to show how multiple "contexts" could work, but definitely isn't how I think they should work. As such, I'm not worried about the merge conflicts as I'd start over anyway, working towards a cleaner and more robust implementation before I'd consider it a real candidate to be merged.
I will spell out in more detail below how I think it should work and some questions I have. If you could let me know if I'm way off track or what you think in general that would be great. I will aim to build a better implementation along these lines.
You should also be aware that I don't have a huge amount of bash scripting experience so will probably make some rookie mistakes along the way, keep an eye out for those.
Design sketch for multiple password "contexts"
Naming:
- I'm not sure that "contexts" is the best name for this feature, is there a better name that would more clearly express what it does? "Levels", "Groups", ... ?
Usage:
- ensure everything works as before for users who do not use extra contexts. Users shouldn't have to know or care about this feature unless they specifically need it
- support a new optional
--context [context-name]
option flag that can be combined with all relevant other options- the context name must comply with some rules so we can safely use it within settings files and scripts. Probably something like alpha-numeric characters only, case-insensitive (force to lower-case internally)
- when used alone or with
--cipher
and/or--password
will init a new encryption "context" with its own section in settings -
--context
would cause the script to target a specific context when combined with many other options:--display
,--rekey
,--flush-credentials
,--uninstall
,--list
,--export-gpg
,--import-gpg
- when multiple contexts are configured and the
--context
option is not provided, other options would process or print info about all existing contexts. E.g.--display
would list all contexts,--list
would list files for each context in turn, etc- this might be too ambitious? A simpler alternative would be to print a reminder to users that they have multiple contexts, e.g. if a user runs
transcript --rekey
the output might say something like:"Files in the default context have been rekeyed. Files in [2] other contexts were not affected"
- this might be too ambitious? A simpler alternative would be to print a reminder to users that they have multiple contexts, e.g. if a user runs
- add documentation about all of the above
Implementation:
- handle just the
--context
option at top of script to adjust various constants to be context-specific, similar to 1ab83c8b but using an option flag instead of environment variable - update helper scripts to recognise and handle non-default context names somehow, so unlike 1ab83c8b just one copy of each helper script is needed but it behaves differently for different contexts
- I'd probably re-use the same environment variables already set at the top of the script for this
- When writing the filter settings for a context, adjust the commands to pass through any context-specific data needed by the helper scripts
Hi @elasticdog I have made a start on the cleaned-up design I outlined above. I'd appreciate any feedback you have.
any updates?
Hi @acro5piano I can try to resurrect this PR and bring it up-to-date with the latest changes on master. Would you be able and willing to test the contexts feature if I can get it working again?
Yes sure. I would like to try the multiple passphrase feature. Thanks.
Hi @acro5piano I think I have gotten this PR branch into decent shape and a new, moderately comprehensive, unit test suite for contexts is now passing.
Can you please check out the PR branch and see if it works for you? See the basic instructions here for hints on how to use contexts.
Hi @acro5piano I have greatly cleaned up and improved the contexts implementation, so if you are testing please try this latest version. The setup instructions have changed, you now use a new crypt-context
attribute in .gitattributes to match patterns to a context. See the updated instructions.
Hi @elasticdog I have revisited this contexts work and greatly cleaned it up. I would appreciate any feedback or thoughts you have about it.
The commit trail is a mess, but if you check the final changes I don't think they are too disruptive.
Key points:
- extra contexts are defined by init-ing them with
transcrypt --context=my-context
and assigning corresponding file patterns in .gitattributes with thecrypt-context=my-context
attribute - you list your contexts, and see misconfiguration hints when appropriate, with
transcrypt --list-contexts
- list all crypt-ed files with
git ls-crypt
or just those in a specific context withgit ls-crypt-my-context
orgit ls-crypt-default
(default/no context) - the smudge/clean/textconv scripts now check for the
crypt-context
attribute, which might affect performance - the contexts features are reasonably covered by tests
- there are some unrelated cleanup changes to the merge script in this PR.
@jmurty Awesome work. I'll try it soon.
@jmurty Hi, I've tried your version and it's working precisely. Thank you for your fantastic work!
One question, what does ts
mean in the instructions?
I think more concrete name is comprehensive for those who use this feature for the first time, such as:
-
admin
-
superuser
-
staging
-
production
Hi @acro5piano that's great! Thanks for testing and confirming it works for you.
You're right that the ts
example name is bad, I only used it to try to squeeze the .gitattributes echo
command in the help text within 79 characters. I will try to find a better solution than using a short, but bad example name.
Hi @acro5piano I have improved the help text for contexts, along with a few bug fixes. Is this text better now? https://github.com/elasticdog/transcrypt/blob/b2db9284d1d079b83acb74e5890c54b01e8fe0bf/transcrypt#L1085-L1103
@jmurty Thank you for the improvement! Yes I think the new version is much comprehensive.
Is the plan to have this be merged after the 2.1.0 release work, or would it be a part of it?
@elasticdog Let's leave the contexts work to settle instead of rushing it into the 2.1.0 release, since I haven't used it at all myself day-to-day and I'm still finding rough edges and bugs.
I'll merge it to master
after the next release, and when it's a little more tested.
Is there a reason this never got merged? just new conflicts since last approved?
Oops, no good reason it never got merged, it just fell off my radar after I no longer needed it. I'll put it back on my radar and see if I can get it back on track, though be warned I have little time nowadays.
Have you tried out the unmerged version @yjarrar and, if so, any feedback?
Was hoping to see it merged before using, but I will start using it now and let you know how it goes.
Hi @yjarrar I wouldn't recommend using this old contexts branch long term, but if you could try it out briefly and confirm whether it works, makes sense, and is reasonably usable, that would be very helpful.
I actually use this branch and it works well. I also hope this PR is merged into the main branch.
Hi ππ» Awesome work! Is there a plan to merge this feature? This is exactly what we'd need! Let me know if I can help ππ»
Thanks for the feedback everyone!
I have put some time today into bringing the branch with the "contexts" feature up-to-date in preparation to test, polish, and eventually merge this feature to the main
branch.
I put the updates in a new branch to avoid the risk of breaking anyone using the original branch, and to have a place for this work directly in this repo rather than a fork. Please see the multiple-passphrases-in-one-repo-wip
branch in this repository, and the related PR https://github.com/elasticdog/transcrypt/pull/145
However, please keep comments and feedback in this thread as much as possible to avoid forking the conversation.
One problem I have noticed β and which I don't yet know how to solve βΒ is how to make Git's textconv
filter work with different password contexts.
Unlike all the other Git internals we rely on, textconv
does not seem to have a way to access the original path of a file it is operating on because it is only provided with a temporary file copy. Without the original file path, we don't have a way to look up the context name for the file and therefore cannot find the right password. See the documentation here where there is no mention of a %f
option, like we have and rely on for clean
smudge
operations.
Up until now the contexts implementation has (accidentally) ignored this issue and always used the password of the default/unspecified context. This will be okay in some cases, but I can foresee nasty problems using git diff
or git log
for secret files in non-default contexts unless I can find a way to work around this limitation. The problem area of code is here: https://github.com/elasticdog/transcrypt/blob/multiple-passphrases-in-one-repo-wip/transcrypt#L213-L215
Hi @benclanet @acro5piano @yjarrar and any others interested in the contexts feature, can you please do some testing of the updated multiple-passphrases-in-one-repo-wip branch to confirm whether it works for you, and if you come across any issues?
I think that branch is in reasonable shape and it can be cleanly merged into main
for longer-term testing, provided there aren't any major problems. However it's only lightly tested so far and includes non-trivial changes, so be careful not to test in vital repositories or in any way where bugs could lose valuable data.
Also note that I was forced to change the underlying approach due to the textconv
issue discussed above, so the required .gitattribute patterns are different. Instead of using a single extra crypt-context=some-name
attribute in the pattern, you must now use a pattern like the following that includes the context name in every attribute value:
# .gitattributes: Encrypt 'top-secret' file in the "super" context
top-secret filter=crypt-super diff=crypt-super merge=crypt-super
Hi @jmurty Thank you so much for your effort. I tested the branch and confirmed it working correctly.
My test repository is here. I logged every commands for easy reproduction. https://github.com/acro5piano/transcrypt-context-test
Thanks very much for testing this @acro5piano and for showing your work in the linked repo. And thanks for your patience too, I see a comment from you about this feature way back in 2020!
Your confirmation has given me the confidence to merge the new contexts feature to the main
branch, see commit https://github.com/elasticdog/transcrypt/commit/e08c3595e6ec57bc448e50401cb7ed845d866419
We can keep testing it there and, all going well, roll it out in a 2.3.0 release in the not-too-distant future.
@jmurty I'm glad to hear that. Thank you for your amazing work!