transcrypt PROPOSAL: Add support for multiple passphrase "contexts" in one repo

PROPOSAL: Add support for multiple passphrase "contexts" in one repo

Open jmurty opened this issue 5 years ago • 21 comments

Hi, thanks for your great work on transcrypt! We have found it really useful.

We are considering using transcrypt for a scenario it wasn't designed for, where we want to have multiple "contexts" of secret file within the one repository with a different passphrase for each one. Basically, we have some files in a repo that should be encrypted but available to user group A, and another set of files that should only be available to user group B.

This PR contains a very rough but working proof of concept with transcrypt updated to support this use case, where different security "contexts" can be specified by setting a CONTEXT environment variable before running the script.

If you are at all interested in adding such a feature please let me know here, along with any thoughts about how you would prefer it to be built. We can work on improving the implementation if you think it might get included in the official version later?

The implementation here so far is a quick and dirty hack that duplicates all the git hook scripts for each named context, so that each one knows about its particular settings.

Despite being an awful hack it shows that the process works fairly well, with just those files for which the user has set the appropriate context & passphrase decrypted, and the rest left as they were.

Example usage:

Choose a name for a non-default encryption "context" e.g. "super-secret"

Init transcrypt in default and "super-secret" contexts:

$ transcrypt
# follow prompts

$ CONTEXT=-super-secret transcrypt
# follow prompts, note leading dash (-)

Set patterns in .gitattributes as usual for default decryption, and with the suffix "-super-secret" for that additional context:

# Contents of .gitattributes
secret.txt filter=crypt diff=crypt
super-secret.txt filter=crypt-super-secret diff=crypt-super-secret

Write your secret files, git add them, and commit.

Note the different passphrases for default and additional contexts

$ transcrypt --display

$ CONTEXT=-super-secret transcrypt --display

Clone repo elsewhere, and confirm that you can or cannot see file contents as you apply some combination of the init commands as printed by the display commands above.

Jun 07 '19 05:06 jmurty

Hey @jmurty! I'd certainly be open to coordinating with you to add this feature...you're actually not the first person to suggest it (see https://github.com/elasticdog/transcrypt/issues/25), so I'm sure it would be useful to others.

As far as the user experience, I wonder if using actual option flags to the script would make it more discoverable compared to introducing the new mechanism of environment variables. Using options would also allow us to add something like --list-contexts, but I also like the idea of everything being backward compatible and using a default context without any extra effort for users that wouldn't need them. What do you think?

Note, I've recently pushed some automated reformatting commits (using shfmt), so it looks like you've got a few merge conflicts now with the current implementation.

Jul 19 '19 22:07 elasticdog

Hi @elasticdog thanks for the feedback and the pointer to #25. Your quick summary there about required changes lined up pretty well with my experience: there were quite a few changes needed, all fairly small but spread out.

I agree that proper option flags is the right way to go. The code in this PR so far is full of quick and dirty hacks to show how multiple "contexts" could work, but definitely isn't how I think they should work. As such, I'm not worried about the merge conflicts as I'd start over anyway, working towards a cleaner and more robust implementation before I'd consider it a real candidate to be merged.

I will spell out in more detail below how I think it should work and some questions I have. If you could let me know if I'm way off track or what you think in general that would be great. I will aim to build a better implementation along these lines.

You should also be aware that I don't have a huge amount of bash scripting experience so will probably make some rookie mistakes along the way, keep an eye out for those.

Design sketch for multiple password "contexts"

Naming:

I'm not sure that "contexts" is the best name for this feature, is there a better name that would more clearly express what it does? "Levels", "Groups", ... ?

Usage:

ensure everything works as before for users who do not use extra contexts. Users shouldn't have to know or care about this feature unless they specifically need it
support a new optional --context [context-name] option flag that can be combined with all relevant other options
- the context name must comply with some rules so we can safely use it within settings files and scripts. Probably something like alpha-numeric characters only, case-insensitive (force to lower-case internally)
- when used alone or with --cipher and/or --password will init a new encryption "context" with its own section in settings
- --context would cause the script to target a specific context when combined with many other options: --display, --rekey, --flush-credentials, --uninstall, --list, --export-gpg, --import-gpg
- when multiple contexts are configured and the --context option is not provided, other options would process or print info about all existing contexts. E.g. --display would list all contexts, --list would list files for each context in turn, etc
  - this might be too ambitious? A simpler alternative would be to print a reminder to users that they have multiple contexts, e.g. if a user runs transcript --rekey the output might say something like: "Files in the default context have been rekeyed. Files in [2] other contexts were not affected"
add documentation about all of the above

Implementation:

handle just the --context option at top of script to adjust various constants to be context-specific, similar to 1ab83c8b but using an option flag instead of environment variable
update helper scripts to recognise and handle non-default context names somehow, so unlike 1ab83c8b just one copy of each helper script is needed but it behaves differently for different contexts
- I'd probably re-use the same environment variables already set at the top of the script for this
- When writing the filter settings for a context, adjust the commands to pass through any context-specific data needed by the helper scripts

Jul 22 '19 00:07 jmurty

Hi @elasticdog I have made a start on the cleaned-up design I outlined above. I'd appreciate any feedback you have.

Jul 29 '19 03:07 jmurty

any updates?

Aug 07 '20 19:08 acro5piano

Hi @acro5piano I can try to resurrect this PR and bring it up-to-date with the latest changes on master. Would you be able and willing to test the contexts feature if I can get it working again?

Aug 08 '20 14:08 jmurty

Yes sure. I would like to try the multiple passphrase feature. Thanks.

Aug 08 '20 15:08 acro5piano

Hi @acro5piano I think I have gotten this PR branch into decent shape and a new, moderately comprehensive, unit test suite for contexts is now passing.

Can you please check out the PR branch and see if it works for you? See the basic instructions here for hints on how to use contexts.

Aug 09 '20 14:08 jmurty

Hi @acro5piano I have greatly cleaned up and improved the contexts implementation, so if you are testing please try this latest version. The setup instructions have changed, you now use a new crypt-context attribute in .gitattributes to match patterns to a context. See the updated instructions.

Aug 12 '20 11:08 jmurty

Hi @elasticdog I have revisited this contexts work and greatly cleaned it up. I would appreciate any feedback or thoughts you have about it.

The commit trail is a mess, but if you check the final changes I don't think they are too disruptive.

Key points:

extra contexts are defined by init-ing them with transcrypt --context=my-context and assigning corresponding file patterns in .gitattributes with the crypt-context=my-context attribute
you list your contexts, and see misconfiguration hints when appropriate, with transcrypt --list-contexts
list all crypt-ed files with git ls-crypt or just those in a specific context with git ls-crypt-my-context or git ls-crypt-default (default/no context)
the smudge/clean/textconv scripts now check for the crypt-context attribute, which might affect performance
the contexts features are reasonably covered by tests
there are some unrelated cleanup changes to the merge script in this PR.

Aug 12 '20 11:08 jmurty

@jmurty Awesome work. I'll try it soon.

Aug 13 '20 07:08 acro5piano

@jmurty Hi, I've tried your version and it's working precisely. Thank you for your fantastic work!

One question, what does ts mean in the instructions? I think more concrete name is comprehensive for those who use this feature for the first time, such as:

admin
superuser
staging
production

Aug 13 '20 07:08 acro5piano

Hi @acro5piano that's great! Thanks for testing and confirming it works for you.

You're right that the ts example name is bad, I only used it to try to squeeze the .gitattributes echo command in the help text within 79 characters. I will try to find a better solution than using a short, but bad example name.

Aug 13 '20 11:08 jmurty

Hi @acro5piano I have improved the help text for contexts, along with a few bug fixes. Is this text better now? https://github.com/elasticdog/transcrypt/blob/b2db9284d1d079b83acb74e5890c54b01e8fe0bf/transcrypt#L1085-L1103

Aug 13 '20 13:08 jmurty

@jmurty Thank you for the improvement! Yes I think the new version is much comprehensive.

Aug 13 '20 16:08 acro5piano

Is the plan to have this be merged after the 2.1.0 release work, or would it be a part of it?

Aug 13 '20 22:08 elasticdog

@elasticdog Let's leave the contexts work to settle instead of rushing it into the 2.1.0 release, since I haven't used it at all myself day-to-day and I'm still finding rough edges and bugs.

I'll merge it to master after the next release, and when it's a little more tested.

Aug 13 '20 23:08 jmurty

Is there a reason this never got merged? just new conflicts since last approved?

Jun 07 '22 01:06 yjarrar

Oops, no good reason it never got merged, it just fell off my radar after I no longer needed it. I'll put it back on my radar and see if I can get it back on track, though be warned I have little time nowadays.

Have you tried out the unmerged version @yjarrar and, if so, any feedback?

Jun 08 '22 10:06 jmurty

Was hoping to see it merged before using, but I will start using it now and let you know how it goes.

Jun 13 '22 17:06 yjarrar

Hi @yjarrar I wouldn't recommend using this old contexts branch long term, but if you could try it out briefly and confirm whether it works, makes sense, and is reasonably usable, that would be very helpful.

Jun 14 '22 00:06 jmurty

I actually use this branch and it works well. I also hope this PR is merged into the main branch.

Jun 14 '22 01:06 acro5piano

Hi 👋🏻 Awesome work! Is there a plan to merge this feature? This is exactly what we'd need! Let me know if I can help 👍🏻

Oct 14 '22 15:10 benclanet

Thanks for the feedback everyone!

I have put some time today into bringing the branch with the "contexts" feature up-to-date in preparation to test, polish, and eventually merge this feature to the main branch.

I put the updates in a new branch to avoid the risk of breaking anyone using the original branch, and to have a place for this work directly in this repo rather than a fork. Please see the multiple-passphrases-in-one-repo-wip branch in this repository, and the related PR https://github.com/elasticdog/transcrypt/pull/145

However, please keep comments and feedback in this thread as much as possible to avoid forking the conversation.

Oct 15 '22 14:10 jmurty

One problem I have noticed – and which I don't yet know how to solve – is how to make Git's textconv filter work with different password contexts.

Unlike all the other Git internals we rely on, textconv does not seem to have a way to access the original path of a file it is operating on because it is only provided with a temporary file copy. Without the original file path, we don't have a way to look up the context name for the file and therefore cannot find the right password. See the documentation here where there is no mention of a %f option, like we have and rely on for clean smudge operations.

Up until now the contexts implementation has (accidentally) ignored this issue and always used the password of the default/unspecified context. This will be okay in some cases, but I can foresee nasty problems using git diff or git log for secret files in non-default contexts unless I can find a way to work around this limitation. The problem area of code is here: https://github.com/elasticdog/transcrypt/blob/multiple-passphrases-in-one-repo-wip/transcrypt#L213-L215

Oct 15 '22 14:10 jmurty

Hi @benclanet @acro5piano @yjarrar and any others interested in the contexts feature, can you please do some testing of the updated multiple-passphrases-in-one-repo-wip branch to confirm whether it works for you, and if you come across any issues?

I think that branch is in reasonable shape and it can be cleanly merged into main for longer-term testing, provided there aren't any major problems. However it's only lightly tested so far and includes non-trivial changes, so be careful not to test in vital repositories or in any way where bugs could lose valuable data.

Also note that I was forced to change the underlying approach due to the textconv issue discussed above, so the required .gitattribute patterns are different. Instead of using a single extra crypt-context=some-name attribute in the pattern, you must now use a pattern like the following that includes the context name in every attribute value:

# .gitattributes: Encrypt 'top-secret' file in the "super" context
top-secret filter=crypt-super diff=crypt-super merge=crypt-super

Oct 21 '22 13:10 jmurty

Hi @jmurty Thank you so much for your effort. I tested the branch and confirmed it working correctly.

My test repository is here. I logged every commands for easy reproduction. https://github.com/acro5piano/transcrypt-context-test

Oct 23 '22 02:10 acro5piano

Thanks very much for testing this @acro5piano and for showing your work in the linked repo. And thanks for your patience too, I see a comment from you about this feature way back in 2020!

Your confirmation has given me the confidence to merge the new contexts feature to the main branch, see commit https://github.com/elasticdog/transcrypt/commit/e08c3595e6ec57bc448e50401cb7ed845d866419

We can keep testing it there and, all going well, roll it out in a 2.3.0 release in the not-too-distant future.

Oct 23 '22 04:10 jmurty

@jmurty I'm glad to hear that. Thank you for your amazing work!

Oct 23 '22 05:10 acro5piano

transcrypt transcrypt copied to clipboard

PROPOSAL: Add support for multiple passphrase "contexts" in one repo

Design sketch for multiple password "contexts"

transcrypt
transcrypt copied to clipboard