Hatch ignores global .gitignore, leaking private keys

Open skorokithakis opened this issue 8 months ago • 12 comments

I built a package using hatch, which, unbeknownst to me, ignored my ~/.gitignore_global, including in the package my .envrc with secret keys.

This is a fairly major footgun: I imagine quite a few packages built this way include files their authors never intended to ship, because git ignores those files and the authors don't realize they will be included.
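One way to catch this before uploading is to list what actually ended up in the built sdist. The following is only a rough sketch: the function name and the pattern list are hypothetical, and the patterns are a guess at common secret-bearing file names, not anything hatch or PyPI defines.

```python
import fnmatch
import tarfile
from typing import Iterable, List

# Hypothetical patterns; adjust them to whatever holds secrets in your setup.
SUSPICIOUS_PATTERNS = (".env", ".env.*", ".envrc", "*.pem", "*.key")


def suspicious_sdist_members(
    sdist_path: str, patterns: Iterable[str] = SUSPICIOUS_PATTERNS
) -> List[str]:
    """Return archive member paths whose file name matches any pattern."""
    patterns = tuple(patterns)
    hits = []
    with tarfile.open(sdist_path, "r:gz") as archive:
        for name in archive.getnames():
            # Match on the file name only, not on the directory part.
            basename = name.rsplit("/", 1)[-1]
            if any(fnmatch.fnmatch(basename, pattern) for pattern in patterns):
                hits.append(name)
    return hits
```

Running this on the `.tar.gz` in `dist/` and refusing to upload when the list is non-empty would have flagged the `.envrc` in my case.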

skorokithakis avatar Apr 18 '25 18:04 skorokithakis

Hey there Stavros, good to see you again!

There is another similar issue open, which I will fix by introducing an option to use the Git CLI. The reason I select files from the file system by default, rather than asking Git, is that building from the source directory on your local machine or CI is not the only way projects are built; indeed, it is the rarest situation. More often than not, people build packages from a GitHub release archive or, even more frequently, from the source distributions on PyPI. In these cases there is no Git checkout to speak of.

ofek avatar Apr 18 '25 19:04 ofek

Haha, hello! The internet is small.

I definitely agree, but it's usually a good idea to ignore all the files in all the .gitignore files by default, as they're usually build artifacts and the like. What do you mean when you say "the git CLI"? Would this not just be hatch parsing the ignore files in the same way git does?

skorokithakis avatar Apr 18 '25 19:04 skorokithakis

Yes, ignore files within the repo are used by default, but Git also takes into account locations outside the project directory, which is the issue here.

ofek avatar Apr 18 '25 19:04 ofek

Agreed, but I mean, are you planning to call out to git somehow to get a list of files, or to reimplement its logic?

skorokithakis avatar Apr 18 '25 19:04 skorokithakis

To call it directly when the new option is enabled.

ofek avatar Apr 18 '25 19:04 ofek

I see, thanks!

skorokithakis avatar Apr 18 '25 19:04 skorokithakis

Happy to help! I will post the other issue here sometime tonight when I find it.

OT: this is still one of my favorite contributions https://github.com/skorokithakis/catt/pull/92

ofek avatar Apr 18 '25 19:04 ofek

Hahah that was great, I loved it. Such a good PR.

skorokithakis avatar Apr 18 '25 19:04 skorokithakis

+1 this is really really bad

umarbutler avatar May 03 '25 03:05 umarbutler

Just a small additional mention: using git itself to list the files to include will be perfectly fine, since git already knows about what follows. But be aware that, apart from the global gitignore, people might also use <repo>/.git/info/exclude to ignore files. So if the implementation shifts for some reason to "use local and global gitignore" rather than "use git cli", there might still be an issue.
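To make the list of places concrete, here is a minimal sketch of where a "parse the ignore files ourselves" implementation would have to look. The function name is hypothetical, nested .gitignore files in subdirectories are left out, and the global location shown is only git's default: a `core.excludesFile` setting (such as a `~/.gitignore_global`) overrides it and would need to be read from git's config.

```python
import os
from pathlib import Path
from typing import Dict, List, Optional


def candidate_ignore_files(
    repo_root: str, env: Optional[Dict[str, str]] = None
) -> List[Path]:
    """List the default locations git consults for ignore patterns."""
    env = dict(os.environ) if env is None else env
    root = Path(repo_root)
    # Default global ignore file: $XDG_CONFIG_HOME/git/ignore, falling back
    # to $HOME/.config/git/ignore when XDG_CONFIG_HOME is unset or empty.
    config_home = env.get("XDG_CONFIG_HOME") or os.path.join(
        env.get("HOME") or str(Path.home()), ".config"
    )
    return [
        root / ".gitignore",  # committed, shared with everyone
        root / ".git" / "info" / "exclude",  # local to this clone, never committed
        Path(config_home) / "git" / "ignore",  # per-user, applies to every repo
    ]
```

Only the first of these lives in the work tree, which is why reading the project's .gitignore alone misses the other two.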

Also, to put matters into perspective: there is a scenario where people may change their usual procedure, namely when they have to issue a security fix release. It's best if people manage to use their usual procedure to release. But if they have been privately made aware of a security issue, they may want to issue a patch release before communicating, so as to communicate on the issue AND the patch that fixes it, without making public PRs and everything. This might be one of the most common situations where people who usually build their sdist safely in the CI end up doing a local build and upload, and it's probably quite annoying if this ends up being the time where they upload a secret dotfile to PyPI because of this bug.

Is there a strong reason why, by default, the sdist takes everything while wheels only take the Python code? It feels like for 99% (estimated figure) of packages (the pure-Python ones), the Python code would suffice, and for the 1% of Python packages that actually build something, it might be safer to operate on an allow-list and let users define what they need to put in an sdist, rather than a deny-list where everything would be published unless explicitly omitted, no?

ewjoachim avatar May 11 '25 15:05 ewjoachim

In response to the comment by @ewjoachim, I just experienced this exact situation. I had placed my token in a file excluded via .git/info/exclude, and it happened to get included in the dist. Fortunately, PyPI flagged it in under 10 minutes, so no harm here.

This was admittedly my fault for not reading the docs closely enough in that I didn't put it in .gitignore, but it'd be a nice-to-have for sure. Not sure about handling global .gitignores, but perhaps looking at both .gitignore and .git/info/exclude would be easy to implement? (Happy to make this a separate issue if that's a better spot to discuss.)

mitchnegus avatar Jul 14 '25 00:07 mitchnegus

In terms of implementation, I guess if there's a .git and the git command is available, then git ls-files will list all the relevant files, using all the gitignore mechanisms. This seems like the sanest approach?
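A minimal sketch of what I mean, assuming nothing about how hatch would actually wire it in (the function name is hypothetical). It returns None when there is no checkout or no git executable, so the caller can fall back to the current file-based selection, which covers the "building from an sdist or release archive" case.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List, Optional


def list_files_via_git(project_root: str) -> Optional[List[str]]:
    """Return the files git considers part of the project, or None if git cannot be used."""
    root = Path(project_root)
    # Only attempt this when there is a checkout and a git executable on PATH.
    if not (root / ".git").exists() or shutil.which("git") is None:
        return None
    result = subprocess.run(
        # --cached: tracked files; --others: untracked files;
        # --exclude-standard: apply .gitignore, .git/info/exclude and the
        # user's global excludes file; -z: NUL-separated, unquoted paths.
        ["git", "ls-files", "--cached", "--others", "--exclude-standard", "-z"],
        cwd=root,
        capture_output=True,
        check=False,
    )
    if result.returncode != 0:
        return None
    output = result.stdout.decode("utf-8", "surrogateescape")
    return [path for path in output.split("\0") if path]
```

Note that `--cached` also reports tracked files that have been deleted from the working tree, so a real implementation would still need to check that each path exists.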

ewjoachim avatar Jul 14 '25 06:07 ewjoachim