gitoxide Fix gix-sec check for administrators group folder ownership

trafficstars

I recommend against merging this yet, because I want to do more research to check that I am not inadvertently introducing a security vulnerability.

Background

Like Git, gitoxide avoids performing some operations in directories that it regards not to be owned by the user account whose identity is used to run it. On Unix-like systems, this is a fairly simple comparison.

On Windows, it is not so simple, and there are some special cases where Git considers the "current user" to own a directory, and where gitoxide intends to do the same. In most such cases, gitoxide's behavior seems to match that of Git.

But in the case of a directory owned by the administrators group when gitoxide is running as an administrator--which Git treats as a case of the user owning the directory, and which gitoxide intends to treat the same way--gitoxide does not succeed.

In this case, it seemed like it might be less confusing to have one main place to discuss this, so I've included details of the bug in this pull request. But I'd be pleased to also open an issue if that would be useful.

The bug

The problem is that the code in gix-sec/src/identity.rs checks if the SID (represented as a pointer to the SID struct) for the user identity that gitoxide is running as (i.e., the process or thread) represents the administrators group:

https://github.com/Byron/gitoxide/blob/0c5d1ff3f48aab43119f86501b14974f92c2017d/gix-sec/src/identity.rs#L190

But this SID, token_owner, is never the SID if the administrators group. The code instead means to check if the SID for the owner of the directory being examined represents the administrator's group, which can happen. That would require folder_owner to be used instead of token_owner.

This check, as it is currently written with token_owner, should never be true, so the remaining code should never run. But if it did, what it means to do next is to check if gitoxide is running as a user who is a member of the administrator's group. However, as written, it would check if the administrators group is a member of itself.

How this fix works

It can be fixed by changing both those occurrences of token_owner to folder_owner.

The first change is straightforward: the goal is to check something about the folder (directory), not about the running user.
The second change is less obvious but the reason it works is that, if control gets to that point, then the folder owner is the administrators group, and therefore folder_owner is a SID for the administrators group and can be used to check if the current access token is a member of the administrators group.

CheckTokenMembership does this automatically when its first argument is NULL. I am unsure if it is better to pass NULL (0 here) as the first argument, or to pass something related to token_owner explicitly. But passing NULL is simpler, seems intentional, and (as detailed below) corresponds to how Git does it.

This code was originally added in #426, and it looks from the code like the bug may actually have been present there, but I am not sure. It has gone through a number of changes, some to update it for using different libraries to access the Windows API functions, and some possibly for other reasons though I am not sure.

Other uses of token_owner and folder_owner do not seem to require modification.

Comparison to Git

This gix-sec code appears originally to have been based on or inspired by the functionality in Git that was introduced in https://github.com/git-for-windows/git/commit/bdc77d1d685be9c10b88abb281a42bc620548595.

The is_path_owned_by_current_sid function uses current_user_sid for the SID of the user that Git is running with, and uses sid for the SID of the directory owner.

It then:

Checks the owner of the directory is a member of the administrators group, by passing sid as the first argument to IsWellKnownSid with the WinBuiltinAdministratorsSid enumeration constant as the second argument, and if it is...
Checks if the current access token (by passing NULL to CheckTokenMembership as the first argument, and sid as the second argument) is a member of that owner. That owner is known to be the administrators group. So this is checking if it is being run by an administrator.

The corresponding code in gitoxide means to do this, with folder_owner corresponding to sid and token_owner corresponding to currrent_user_sid. But it passes token_owner when it should pass folder_owner (i.e., where the Git code rightly passes sid).

Testing

We don't have tests for this specifically and it might be challenging to write them, though not impossible.

However, I did discover the bug through a test failure on a Windows Server 2022 virtual machine in which I ran the tests as an administrator with UAC turned off. (This is not a good practice, and hopefully it is not a common one, but I wanted to see if there were any bugs in this hopefully rare scenario. It seems there is.)

When running cargo nextest run --all --no-fail-fast, all tests passed except for one:

     Summary [ 537.781s] 2359 tests run: 2358 passed (14 slow, 1 leaky), 1 failed, 9 skipped
        FAIL [   0.018s] gix-discover::discover upwards::from_dir_with_dot_dot
error: test run failed

Here's some more detailed information, produced by running RUST_BACKTRACE=1 cargo nextest run -p path+file:///C:/Users/Administrator/repos/gitoxide/gix-discover upwards::from_dir_with_dot_dot and omitting the leading "Compiling" lines:

    Finished `test` profile [unoptimized + debuginfo] target(s) in 15.04s
    Starting 1 test across 3 binaries (46 skipped; run ID: 76a06fc6-35ac-4b9e-b945-7d8fb69a3ed5, nextest profile: default)
        FAIL [   0.142s] gix-discover::discover upwards::from_dir_with_dot_dot

--- STDOUT:              gix-discover::discover upwards::from_dir_with_dot_dot ---

running 1 test
test upwards::from_dir_with_dot_dot ... FAILED

failures:

failures:
    upwards::from_dir_with_dot_dot

test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 39 filtered out; finished in 0.12s


--- STDERR:              gix-discover::discover upwards::from_dir_with_dot_dot ---
thread 'upwards::from_dir_with_dot_dot' panicked at gix-discover\tests\upwards\mod.rs:155:5:
assertion `left == right` failed
  left: Reduced
 right: Full
stack backtrace:
   0: std::panicking::begin_panic_handler
             at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library\std\src\panicking.rs:652
   1: core::panicking::panic_fmt
             at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library\core\src\panicking.rs:72
   2: core::panicking::assert_failed_inner
             at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library\core\src\panicking.rs:409
   3: core::panicking::assert_failed<gix_sec::Trust,gix_sec::Trust>
             at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081\library\core\src\panicking.rs:364
   4: discover::upwards::from_dir_with_dot_dot
             at .\tests\upwards\mod.rs:155
   5: discover::upwards::from_dir_with_dot_dot::closure$0
             at .\tests\upwards\mod.rs:120
   6: core::ops::function::FnOnce::call_once<discover::upwards::from_dir_with_dot_dot::closure_env$0,tuple$<> >
             at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081\library\core\src\ops\function.rs:250
   7: core::ops::function::FnOnce::call_once
             at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library\core\src\ops\function.rs:250
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

   Canceling due to test failure
------------
     Summary [   0.145s] 1 test run: 0 passed, 1 failed, 46 skipped
        FAIL [   0.142s] gix-discover::discover upwards::from_dir_with_dot_dot
error: test run failed

Further test output, from before the change made here, can be see in this gist.

Under the change made here, that failure goes away and all tests pass on that system. They also all pass when I run it locally on a system where they had passed before.

I don't recommend merging this yet

Even if no further changes are to be made to the code, and even if no tests other than those currently present need to be added, I still recommend against merging this just yet.

I think it is very easy to introduce security vulnerabilities when making this kind of change, especially since I don't think I yet have a full understanding of all the relevant code. (I think I understand what it is intended to do, and I think it probably does what it is intended to do, but I am not certain.)

There are two possible kinds of vulnerabilities this change could introduce:

Fully new vulnerabilities if there is a problem with the change itself, even if all the other code is free of vulnerabilities.
Latent vulnerabilities that were not really vulnerabilities before, due to the bug preventing them from taking effect, but that could be activated by fixing the bug even if the fix itself is correct.

I'll keep both kinds of possible problems in mind while studying the code further. But I wanted to open this now, mainly to solicit input, but also in case anything unexpected happens on CI.

Jun 27 '24 02:06 EliahKagan

This change, while it fixed the failing tests on one local Windows test machine and preserved all tests passing on another, seems to have caused numerous Windows failures on CI (which are not random; see also this earlier run). I didn't expect that!

Jun 27 '24 03:06 EliahKagan

Thanks so much for bringing this up!

I wish I could say more as to why gitoxide seems to act differently than Git here, but ideally it should act exactly like it.

It seems strange that CI is failing and I wonder if I 'manipulated' it to pass, but I at least from memory I can say that I intended to reproduce what Git does 1:1, while recreating their implementation in Rust.

Jun 27 '24 05:06 Byron

The details of the failures should be illuminating. I'll let you know what I can figure out once I've looked at them in depth.

(Rebasing onto main keeps all the local behavior I described, and also keeps the CI failures I didn't expect, so recent changes on main don't seem to affect this.)

It seems strange that CI is failing and I wonder if I 'manipulated' it to pass, but I at least from memory I can say that I intended to reproduce what Git does 1:1, while recreating their implementation in Rust.

Are you referring to the code as it was in #426, or subsequent changes?

In https://github.com/Byron/gitoxide/pull/426#issuecomment-1134359724, I have realized I don't know what code you meant was contributed by a Microsoft employee. Is that a reference to code in Git for Windows that gitoxide has used as a reference? Or to code in gitoxide itself? The link in that comment is to code in gitoxide, but it looks like that and related code from around the same time (though not all changes in that file) were authored by you.

Jun 27 '24 06:06 EliahKagan

Right, there was also a contribution - Git knows much better than my memory. All I recalled here was the original intent, but it might well be that contributed code changed things slightly and I didn't notice.

In any case, I also think that getting to the bottom of this will be illuminating :).

Jun 27 '24 10:06 Byron

Is this the desired change?

Aug 26 '24 04:08 Byron

I'm not sure yet, and I'm hoping to investigate the situation further before this is merged. Although the tests are passing now, at least on CI, there are two remaining concerns:

I didn't change anything, I just rebased! Some other change, either in gitoxide on the main branch or to some dependency or otherwise software, has made it so this seems to work now. When I figure out what change made this work, that should also give me insights into whether this change is correct.
The originally envisioned blocker still applies, though I have made some progress on it: I want to make sure I am not inadvertently introducing a security vulnerability here, and I have a bit more to investigate on that (plus, to be confident, I should also understand why this seems to be working now, per (1)).

Aug 26 '24 04:08 EliahKagan

I didn't change anything, I just rebased! Some other change, either in gitoxide on the main branch or to some dependency or otherwise software, has made it so this seems to work now. When I figure out what change made this work, that should also give me insights into whether this change is correct.

This is a little concerning then, as just yesterday I accepted a bit PR that swapped cargo test for cargo-nextest. I thinks a great test would be if reverting this one (5208cb9804026833579f86b5282222c67f5ff6ff) has any effect on the tests.

Aug 26 '24 04:08 Byron

Thanks, that's the key. #1556 introduced a CI bug where failing tests on Windows are reported as passing. Fortunately, this bug can be fixed while retaining the full benefits of #1556. I'll open a PR shortly with a fix (and analysis).

Aug 26 '24 05:08 EliahKagan

I've opened #1559 for this. To make sure it really fixes the CI regression, I've rebased this PR onto 4f2ab5b from it, which should cause CI here to fail again (i.e. to correctly report the test failure that was already happening). Assuming #1559 is merged, the extra commit here will go away when [edit: or maybe before?] [edit: no, not before] I rebase this onto main again afterwards.

Aug 26 '24 05:08 EliahKagan

https://github.com/Byron/gitoxide/pull/1429#issuecomment-2194375939

Right, there was also a contribution - Git knows much better than my memory.

Ah, I see. The earlier contribution you had mentioned in https://github.com/Byron/gitoxide/pull/426#issuecomment-1134359724 as likely authoritative was in the form of samples/suggestions in https://github.com/microsoft/windows-rs/issues/1697, which is why commit message author metadata didn't distinguish it.

It was in https://github.com/Byron/gitoxide/commit/84023cbe7dc2e0d79aadd0863122af829e25bbba, which came in shortly after #386. Its commit message cites https://github.com/microsoft/windows-rs/issues/1697#issuecomment-1106704826.

Aug 29 '24 01:08 EliahKagan

Is this something we could close for now and reopen when it can move towards merging?

Right now I am closing all PRs that don't move and that I cannot take over, and this one would be among them. Thanks again for your help with this!

Dec 21 '24 18:12 Byron

Is this something we could close for now and reopen when it can move towards merging?

Yes, I have no objection to this being closed for now.

I had hoped to get to it this month but didn't quite manage. When I do, either it can be reopened or I can make a new PR, whichever seems to make more sense at the time. This is lately very much on my radar, but I don't view that as an argument against closing it, because the underlying investigation is something I know I will not forget, regardless of the status of this PR.

In addition, I am also not sure the fix will include the changes in the form they currently take here.

Dec 21 '24 19:12 EliahKagan

Thank you, this sounds like a plan :)!

Dec 21 '24 19:12 Byron

gitoxide gitoxide copied to clipboard

Fix gix-sec check for administrators group folder ownership

Background

The bug

How this fix works

Comparison to Git

Testing

I don't recommend merging this yet

gitoxide
gitoxide copied to clipboard