safe mode to disable executing any external programs except git
As described in #2020, here is the core implementation of "safe mode". The core idea is to set up operations so that external programs are not executed by git. This has been a major source of vulnerabilities.
This means that network connections are limited to HTTPS. As much as possible, this will rewrite remote URLs to HTTPS. This is necessary so that submodules work even when they do not use HTTPS URLs, as long as they are public, HTTPS-accessible repos.
~~This is a draft to confirm the approach. Then I will follow up and polish everything for merging.~~
closes #2020
I would like to write tests, I looked around through the test suite, but it still escapes me how to structure these tests. Since these options affect how git is executed, it seems like it would have to connect to the network? Or is there a test that already replaces the protocol helper, e.g. git-remote-https, to avoid network access?
It really isn't easy to test it at all, and probably impossible to test it exhaustively. So I am fine admitting defeat on this one, particularly because the tests I could imagine would be so specific and spotty that they barely have any value.
if we can only run git, that could mean we pick #2027 over #2026. I don't think it makes sense to have an allowlist of other executables GitPython could run. @EliahKagan, your thoughts?
if we can only run
git, that could mean we pick #2027 over #2026. I don't think it makes sense to have an allowlist of other executables GitPython could run. @EliahKagan, your thoughts?
~~I haven't reviewed this draft PR in detail, and~~ I also don't know what it will be like and how it will document the "safe mode" feature once it's done. But my first impression is that the changes being proposed here should not directly impose any requirements on whether is_cygwin_git runs an external subprocess besides git:
- Although the PR title makes it seem like this prevents all external subprocesses except
gitfrom being executed, it is otherwise described more specifically to preventgitsubprocesses from executing other external subprocesses. - If this covers all subprocesses GitPython creates, rather than only those through
git, then there are already more subprocesses that need to be suppressed: at leastps. - If these changes are to be "minimally invasive" as characterized in #2029 (review), then I think they shouldn't impose requirements on code that doesn't operate on any repository or use any repository or URL specific paths or state.
However, that may not be the whole story. Although I don't think "safe mode" should prohibit the use of subprocesses in is_cygwin_git, it may be that they should be avoided in general when not needed for the same reasons. As currently written, it takes some effort to verify that is_cygwin_git does not introduce an untrusted search path vulnerability.
Furthermore, regarding the use of uname in is_cygwin_git, while I think it's probably not a vulnerability, it's not really ideal to assume adjacent executables are similarly trusted. It might be considered a valuable security enhancement to avoid assuming that just because an executable has a standard name and resides in the same directory as git that it is safe or reasonable to execute it.
FYI this is my very simple test suite:
#!/usr/bin/python3
import os
import git
import tempfile
for url in [
'[email protected]:fdroid/ci-test-app.git',
'ssh://gitlab.com/fdroid/ci-test-app.git',
'git://gitlab.com/fdroid/ci-test-app.git',
]:
print('====================', url)
d = tempfile.mkdtemp(prefix='foo_py_')
repo = git.Repo.clone_from(url, d, safe=True)
# clone from fake URL should prompt for password
d = tempfile.mkdtemp(prefix='foo_py_')
for url in [
'https://github.com/asdfasdfasdf/adsfasdfasdf.git',
'https://gitlab.com/asdfasdfasdf/adsfasdfasdf.git',
'https://codeberg.org/asdfasdfasdf/adsfasdfasdf.git',
]:
try:
repo = git.Repo.clone_from(url, d, safe=True)
except git.exc.GitCommandError as e:
print('repo', e)
url = 'https://gitlab.com/fdroid/ci-test-app.git'
d = tempfile.mkdtemp(prefix='foo_py_')
print(d)
repo = git.Repo.init(d, safe=True)
origin = repo.create_remote("origin", url)
origin.fetch()
d = tempfile.mkdtemp(prefix='foo_py_')
print(d)
repo = git.Repo.clone_from(url, d, safe=True)
print(type(repo))
print('SUCCESS')
I think I should leave it to @EliahKagan to review this PR, due to its nature of being very relevant to the security of the project.
To me it still is the question if there is a good case for supporting this - I'd think the answer should be "yes" if f-droid benefits, but I fear that it lures people into a false sense of safety. Maybe safe=true should be changed to something more specific at the very least, because it's definitely not safe by all means.
F-Droid has been using this approach for years, and will switch to this code once it is merged. As for the name, I'm open to suggestions. I used the word "safe" following the example of parser libs I've seen, where it means features are disabled in the interest of security. Like ruamel.yaml uses "safe" mode to mean a YAML parser that does not parse anything but lists, dicts, and scalars (e.g. no objects). From what see, I think this clearly does make it safer to use.
I think this is ready to go. Let me know if there is anything I can provide to help with the review and merge.
@EliahKagan anything I can help with? Any other blockers here? I rebased this on the latest main.
The test failure seems to be unrelated, something with it can't connect to localhost