git-lfs
git-lfs copied to clipboard
gitattributes(5) should control paths' desired Git LFS extensions
Currently, all extensions declared/registered in git-config(1) under lfs.extension.*
are invoked when the LFS filter driver cleans.
https://github.com/git-lfs/git-lfs/blob/c6fc8813fd73bdfb392e4ae74cf061e48688b1f2/lfs/gitfilter_clean.go#L19-L30
There is no way to specify per-file which extensions should be invoked, despite extensions being named. Additionally, the mere presence of extension declarations causes their invocation, which may be confusing, and you cannot opt-out of this behavior. (Inability to at least opt-out makes global or system LFS extension declarations impractical, even if they're shared between repositories.)
I propose that extensions should be opt-in via gitattributes(5) attributes. They could be handled as follows:
lfs-extension
-
- Set
-
Use the value of the
lfs.extensions
configuration variable to decide what LFS extensions to Invoke on a path. - Unset
-
Do not invoke any LFS extensions on a path.
- Unspecified
-
Use the value of the
lfs.extensions
configuration variable to decide what LFS extensions to invoke on a path, except when the variable is not set.For backwards compatibility, when the
lfs.extensions
variable is not set, all LFS extensions known to LFS are invoked on a path. - String
-
Specify a comma separated list of LFS extensions to invoke on this path in the same format as the
lfs.extensions
configuration variable.
I also propose that default LFS extension behavior should be controlled by configuration variables. These could operate as follows:
lfs.extensions
-
Specify the default LFS extensions to invoke on paths. Defaults to the empty string (
""
), except for the case of an unspecifiedlfs-extensions
attribute (for backwards compatibility).This setting should be set to a comma-separated list of LFS extension names. Order is not significant.
Finally, I propose that new Git LFS installations should default lfs.extensions
to the empty string system-wide, whenever possible.
These proposals allow LFS extensions to continue to operate without breakage today, but also set up for much more robust & transparent configuration of LFS extensions going forward. Paths in the same repository can utilize different extension sets, users can declare reused extensions globally without ruining other repositories (especially useful due to lfs.extension.*
keys' unsafety in .lfsconfig
(#4044)), and repositories for which always using all local LFS extensions on paths works can continue to do so more reliably via lfs.extensions
.
I do have some open questions, though.
-
Is
lfs.extensions
lfsconfig-safe?On the bright side, repositories could only specify string names of LFS extensions that the user would already need installed, but these extension names have no uniqueness guarantees, and a malicious repository that knows the (user-decided) name of a user's extension that leaks sensitive data when smudging could pose enough of a security risk. Would it be enough to document against this concern and encourage only globally declare extensions with both names unlikely to collide & semantics that don't leak data until configured with the repository?
-
Do we want to warn about legacy
lfs.extensions
behavior?Using all known extensions by default feels unsecure to me, even if users normally don't register extensions globally or system-wide.
Also, this would push existing repositories to explicitly specify their LFS extensions (in
.lfsconfig
or post-clone documentation), meaning new Git LFS installations (withgit config --system lfs.extensions ""
) will just work. -
Should extensions documentation continue to use "register" for the act of placing extensions in git-config(1)?
I feel like this act is much more "declare" now, with "register" being probably best for placing extension names in
lfs.extensions
orlfs-extensions
to be invoked in the future. Extensions would be declared (system-wide, globally, or locally) to allow registration, then registered (normally locally) to allow invoking, then invoked (with a path) when Git invokes the LFS filter driver.
Note that when the LFS filter driver smudges, a blob's extension set from its previous clean are read from its LFS pointer, avoiding this problem.
https://github.com/git-lfs/git-lfs/blob/c6fc8813fd73bdfb392e4ae74cf061e48688b1f2/lfs/gitfilter_smudge.go#L122 https://github.com/git-lfs/git-lfs/blob/c6fc8813fd73bdfb392e4ae74cf061e48688b1f2/lfs/gitfilter_smudge.go#L135-L147
Hey,
Thanks for the issue, and sorry it's taken me so long to get back to you. I'm definitely open to seeing a difference in how extensions operate; the feature is considered experimental and I don't get the impression it gets a lot of use.
I think your proposal for using gitattributes is fine, except that I'm not sure we can rely on a variable being unset. If a user has lfs.extensions
set globally, it's not possible to unset it with Git. We could add an additional key in addition to clean
, smudge
, and priority
called enabled
or default-enabled
and default it to true if unset, which would provide backwards compatible behavior.
I think if the extension names in .gitattributes
are safe, then they're also safe in .lfsconfig
, because both are stored in the repository. Git already does that for diff and merge drivers, and that's not a security problem.
I don't think I want to warn about old behavior because it means every existing repository gets a warning, which is irksome. I also think that “register” is fine; I'm not particular about the terminology, and I don't think “register” versus “declare” provides a meaningful semantic difference.
I also don't like the unset variable subtlety in this proposal, but I don't know a cleaner way to deal with backwards compat while not messing up a potential future breaking change (when people have switched off) to make unspecified lfs.extensions
the same as specified empty string. My worry about breaking an (admittedly experimental & uncommon) feature like this is that places that depend on Git LFS triggering for encryption or other security purposes could upgrade their LFS installs, not know that their existing extensions aren't being respected anymore, and then publicly leaking data. On that note, an upgrade through a *nix package manager wouldn't probably have a great time managing only setting system-wide lfs.extensions
to the empty string for new installs, so it'd be safer to just warn on lfs.extensions
not being set for a few releases until we're confident that everyone has set it and lfs.extensions
can finally truly default to the empty string, no system-wide config change weirdness necessary.
The idea is that anyone who sets lfs.extensions
to the empty string for their install will also know to upgrade all existing LFS-extension–using repositories to more explicit semantics.
Ignoring backwards-compat, (initial) unspecified lfs.extensions
emulates the unsafe (IMO) behavior of invoking all known extensions all the time; repositories can emulate the safe behavior by setting lfs.extensions
in .lfsconfig
with all the extensions they use. This could be clearer in the docs?
Glad to know that lfs.extensions
would be safe in .lfsconfig
; I think that makes this transition a lot easier, and allows LFS to do things like warn when you're missing extensions on new blobs, potentially failing the operation until you set up the extension. (Config variable for that?)
"Register" semantics you describe are fine by me. (Declare vs. register is a nit, and bikeshedding it is ultimately unimportant.)
We wouldn't be warning on every repository, but rather only those invoking extensions? An extension-avoiding install never touches any of these semantics, or their changes.
I thinik lfs-extensions
is also a better attribute name than lfs-extension
, since it's controlling a set (like the lfs.extensions
configuration variable is). Going to switch to that now.
I don't quite understand what you mean by attributes lfs-enabled
/lfs-default-enabled
? Does its unspecification for a path enable all known extensions (from everywhere), as current?
If so, I think a much clearer name would be lfs-extensions-default-all-known
, which is a painfully long name, but that's the point. With that, we could have attributes:
lfs-extensions
-
- Set
-
Use the value of the
lfs.extensions
configuration variable to decide what LFS extensions to Invoke on a path. - Unset
-
Do not invoke any LFS extensions on a path.
- Unspecified
-
Use the value of the
lfs.extensions
configuration variable to decide what LFS extensions to invoke on a path, except when a path'slfs-extensions-default-all-known
attribute is unspecified or set.lfs-extensions-default-all-known
is an attribute that should be unset for all paths whenever possible, but for backwards compatibility that may not be the case. See that attribute's documentation for more details. - String
-
Specify a comma separated list of LFS extensions to invoke on this path in the same format as the
lfs.extensions
configuration variable.
lfs-extensions-default-all-known
-
Deprecated. The
lfs-extensions-default-all-known
attribute allows subtle & backwards-compatible behavior with older versions of LFS, and should be unset for all paths whenever possible.Previously, all known LFS extensions were invoked by default on paths. Not only would LFS invoke only extensions a repository locally desires, but all known LFS extensions (including those registered system-wide or globally) would be invoked. This attribute exists to support transitioning to safer, opt-in behavior, e.g. portably setting the
lfs.extensions
configuration variable in.lfsconfig
.If
lfs-extensions-default-all-known
is set, Git prints a warning when running the LFS filter driver.- Set
-
Deprecated. When a path leaves its
lfs-extensions
attribute unspecified, ignore the configuration variablelfs.extensions
entirely and invoke all LFS extensions known on that path, including extensions registered system-wide or globally. This attribute should never be set. - Unset
-
Rely on normal
lfs.extensions
behavior for this path. When thelfs.extensions
configuration variable is unspecified, it will default to the empty string and no LFS extensions will be invoked by default, even when registered (system-wide, globally, or locally), making extension use opt-in. - Unspecified
-
Deprecated. When a path leaves its
lfs-extensions
attribute unspecified, ignore the configuration variablelfs.extensions
entirely and invoke all LFS extensions known on that path, including extensions registered system-wide or globally. This attribute should be unset as soon as possible, after migrating the repository over to safer, opt-in LFS extension behavior.
I'd like to not support setting lfs-extensions-default-all-known
if we can get away with it. Could we alternatively completely error if a path has explicitly set that attribute? (It's inconsistent with other Git attributes, but this seems sane, and prevents our old, unsafe behavior from hanging around longer.)
And our configuration variables become simpler:
lfs.extensions
-
Specify the default LFS extensions to invoke on paths. Defaults to the empty string (
""
).This setting should be set to a comma-separated list of LFS extension names. Order is not significant.
My proposal for enabled
or default-enabled
was in the Git configuration:
[lfs "extension.foo"]
clean = foo clean %f
smudge = foo smudge %f
priority = 0
default-enabled = false
I don't think we need an additional .gitattributes
attribute for this, especially one that we're immediately deprecating. My proposal lets people control whether their extension are enabled or not when lfs-extensions
and lfs.extension
are unspecified; the default is to be backwards compatible (and thus enabled), but folks can then disable the extensions by default once they no longer need them and have them off.
My concern with adding a warning is that some people never see warnings because they use tools in a non-interactive way, so it's not effective. It's also annoying once you've seen the warning once or twice. A lot of Git LFS users work on things like games or on Windows, where they may interact entirely through a UI and never see the command line tool and no matter how much we scream, they'll never notice. We therefore need to provide something backwards compatible at least initially.
While I am sympathetic to the fact that there are some unpleasant implications for encrypted blobs, I'm not in general convinced that encrypting all data in a Git repository is a good idea, and degrading gracefully is the way that Git works, so absent a compelling reason for us to differ from Git's behavior, I'd like to do the same thing Git does.
I'd also like to ask you, once we have a final design, to open a PR to add this to the docs/proposals
directory. Do note that, like most major design proposals, this will require someone to implement it; while we are definitely open to seeing improvements here, the core team is small and we probably won't have sufficient time to get to a major redesign.
Ah, a per-extension configuration variable makes sense and is a lot less obtrusive for repositories.
However, I don't like how new extensions could sneak back on by default if registered through something like a system package (which hopefully wouldn't set default-enabled = false
to maintain backwards-compat for their users). Is there value in allowing specific extensions to use legacy behavior, where a user would set default-enabled = false
for a few extensions but leave it on for others, instead of modifying their relevant .git/config
s for repositories?
If a new LFS extension is packaged, then it could reasonably set default-enabled = false
for itself, forcing users to debug why one extension isn't working the same as the rest of their extensions.
I think this ultimately has the same issues as a global configuration variable lfs.extensions-default-enabled
, i.e.
[lfs]
extensions-default-enabled = false
which would be a lot simpler to check, and would only likely be system-assigned by Git LFS packagers (meaning extension behavior breaking by default would be much more unlikely.)
A lot of Git LFS users work on things like games or on Windows, where they may interact entirely through a UI and never see the command line tool and no matter how much we scream, they'll never notice.
Okay, that makes sense. I was mainly basing off of the current unsafe .lfsconfig
key warning behavior.
We therefore need to provide something backwards compatible at least initially.
The options I've described are backwards-compatible, but don't require a new configuration variable under every registered extension. The lfs.extensions-default-enabled
configuration variable would likely be globally scoped, and the lfs-extensions-default-all-known
would effectively be locally-scoped (* lfs-extensions-default-all-known
in .gitattributes
, checked in).
While I am sympathetic to the fact that there are some unpleasant implications for encrypted blobs, I'm not in general convinced that encrypting all data in a Git repository is a good idea, and degrading gracefully is the way that Git works, so absent a compelling reason for us to differ from Git's behavior, I'd like to do the same thing Git does.
I'm not sure what you mean here?
I'd also like to ask you, once we have a final design, to open a PR to add this to the
docs/proposals
directory. Do note that, like most major design proposals, this will require someone to implement it; while we are definitely open to seeing improvements here, the core team is small and we probably won't have sufficient time to get to a major redesign.
Sure thing. I'd be willing to try my hand at a patch for this too.
I think this ultimately has the same issues as a global configuration variable
lfs.extensions-default-enabled
, i.e.[lfs] extensions-default-enabled = false
which would be a lot simpler to check, and would only likely be system-assigned by Git LFS packagers (meaning extension behavior breaking by default would be much more unlikely.)
Sure, I can go for that. Maybe lfs.extensions.default-enabled
?
We therefore need to provide something backwards compatible at least initially.
The options I've described are backwards-compatible, but don't require a new configuration variable under every registered extension. The
lfs.extensions-default-enabled
configuration variable would likely be globally scoped, and thelfs-extensions-default-all-known
would effectively be locally-scoped (* lfs-extensions-default-all-known
in.gitattributes
, checked in).
I'd rather have lfs.extensions.default-enabled
be the knob and not have a .gitatributes
entry for enable or disabling legacy behavior. It's a lot easier to make changes that involve backwards incompatible behavior (like changing the default of whether extensions are enabled) when it's only in the config. When we put incompatible changes into the repository by storing them in .gitattributes
, we're guaranteeing to essentially support them forever.
While I am sympathetic to the fact that there are some unpleasant implications for encrypted blobs, I'm not in general convinced that encrypting all data in a Git repository is a good idea, and degrading gracefully is the way that Git works, so absent a compelling reason for us to differ from Git's behavior, I'd like to do the same thing Git does.
I'm not sure what you mean here?
I was referring to your comment about using encrypted blobs as extensions. I agree we should have a default behavior that's backwards compatible at first. We also need to consider what happens if an extension requested in lfs.extensions
doesn't exist; in that case, we need to ignore it as if it were not there, since that's what Git does for things like filters.
Sure thing. I'd be willing to try my hand at a patch for this too.
Great. We're happy to help you with any questions or code issues you might have.
Hey,
We're considering doing a 3.0 release relatively soon, and if you were interested in including this, that would be great. That would also be a great time to change extensions to default to off, since it would be a breaking change. Then we wouldn't need lfs.extensions.default-enabled
here.
Let us know if that's a thing you're still interested in.
I'm interested in this as well.
We're happy to accept a patch for this, but v3.0 is upcoming soon, so if someone wanted to make an incompatible change, that would be the time.
+1
I started a bounty for this issue, if anyone would like to contribute to it: https://app.bountysource.com/issues/93535439-gitattributes-5-should-control-paths-desired-git-lfs-extensions