commons-io icon indicating copy to clipboard operation
commons-io copied to clipboard

[SECURITY] Fix Partial Path Traversal Vulnerability

Open JLLeitschuh opened this issue 3 years ago • 11 comments

Security Vulnerability Fix

This pull request fixes a partial-path traversal vulnerability due to an insufficient path traversal guard.

Even if you deem, as the maintainer of this project, this is not necessarily fixing a security vulnerability, it is still a valid security hardening.

Preamble

Impact

This issue allows a malicious actor to potentially break out of the expected directory. The impact is limited to sibling directories. For example, userControlled.getCanonicalPath().startsWith("/usr/out") will allow an attacker to access a directory with a name like /usr/outnot.

Why?

To demonstrate this vulnerability, consider "/usr/outnot".startsWith("/usr/out"). The check is bypassed although /outnot is not under the /out directory. It's important to understand that the terminating slash may be removed when using various String representations of the File object. For example, on Linux, println(new File("/var")) will print /var, but println(new File("/var", "/") will print /var/; however, println(new File("/var", "/").getCanonicalPath()) will print /var.

The Fix

Comparing paths with the java.nio.files.Path#startsWith will adequately protect againts this vulnerability.

For example: file.getCanonicalFile().toPath().startsWith(BASE_DIRECTORY) or file.getCanonicalFile().toPath().startsWith(BASE_DIRECTORY_FILE.getCanonicalFile().toPath())

Other Examples

:arrow_right: Vulnerability Disclosure :arrow_left:

:wave: Vulnerability disclosure is a super important part of the vulnerability handling process and should not be skipped! This may be completely new to you, and that's okay, I'm here to assist!

First question, do we need to perform vulnerability disclosure? It depends!

  1. Is the vulnerable code only in tests or example code? No disclosure required!
  2. Is the vulnerable code in code shipped to your end users? Vulnerability disclosure is probably required!

Vulnerability Disclosure How-To

You have a few options options to perform vulnerability disclosure. However, I'd like to suggest the following 2 options:

  1. Request a CVE number from GitHub by creating a repository-level GitHub Security Advisory. This has the advantage that, if you provide sufficient information, GitHub will automatically generate Dependabot alerts for your downstream consumers, resolving this vulnerability more quickly.
  2. Reach out to the team at Snyk to assist with CVE issuance. They can be reached at the Snyk's Disclosure Email. Note: Please include JLLeitschuh Disclosure in the subject of your email so it is not missed.

Detecting this and Future Vulnerabilities

You can automatically detect future vulnerabilities like this by enabling the free (for open-source) GitHub Action.

I'm not an employee of GitHub, I'm simply an open-source security researcher.

Source

This contribution was automatically generated with an OpenRewrite refactoring recipe, which was lovingly hand crafted to bring this security fix to your repository.

The source code that generated this PR can be found here: PartialPathTraversalVulnerability

Why didn't you disclose privately (ie. coordinated disclosure)?

This PR was automatically generated, in-bulk, and sent to this project as well as many others, all at the same time.

This is technically what is called a "Full Disclosure" in vulnerability disclosure, and I agree it's less than ideal. If GitHub offered a way to create private pull requests to submit pull requests, I'd leverage it, but that infrastructure, sadly, doesn't exist yet.

The problem is that as an open source software security researcher, I (exactly like open source maintainers), I only have so much time in a day. I'm able to find vulnerabilities impacting hundreds, or sometimes thousands of open source projects with tools like GitHub Code Search and CodeQL. The problem is that my knowledge of vulnerabilities doesn't scale very well.

Individualized vulnerability disclosure takes time and care. It's a long and tedious process, and I have a significant amount of experience with it (I have over 50 CVEs to my name). Even tracking down the reporting channel (email, Jira, ect..) can take time and isn't automatable. Unfortunately, when facing prblems of this scale, individual reporting doesn't work well either.

Additionally, if I just spam out emails or issues, I'll just overwhelm already over taxed maintainers, I don't want to do this either.

By creating a pull request, I am aiming to provide maintainers something highly actionable to actually fix the identified vulnerability; a pull request.

There's a larger discussion on this topic that can be found here: https://github.com/JLLeitschuh/security-research/discussions/12

Opting-Out

If you'd like to opt-out of future automated security vulnerability fixes like this, please consider adding a file called .github/GH-ROBOTS.txt to your repository with the line:

User-agent: JLLeitschuh/security-research
Disallow: *

This bot will respect the ROBOTS.txt format for future contributions.

Alternatively, if this project is no longer actively maintained, consider archiving the repository.

CLA Requirements

This section is only relevant if your project requires contributors to sign a Contributor License Agreement (CLA) for external contributions.

It is unlikely that I'll be able to directly sign CLAs. However, all contributed commits are already automatically signed-off.

The meaning of a signoff depends on the project, but it typically certifies that committer has the rights to submit this work under the same license and agrees to a Developer Certificate of Origin (see https://developercertificate.org/ for more information).

- Git Commit Signoff documentation

If signing your organization's CLA is a strict-requirement for merging this contribution, please feel free to close this PR.

Sponsorship & Support

This contribution is sponsored by HUMAN Security Inc. and the new Dan Kaminsky Fellowship, a fellowship created to celebrate Dan's memory and legacy by funding open-source work that makes the world a better (and more secure) place.

This PR was generated by Moderne, a free-for-open source SaaS offering that uses format-preserving AST transformations to fix bugs, standardize code style, apply best practices, migrate library versions, and fix common security vulnerabilities at scale.

Tracking

All PR's generated as part of this fix are tracked here: https://github.com/JLLeitschuh/security-research/issues/13

JLLeitschuh avatar Jul 29 '22 13:07 JLLeitschuh

This PR needs a test that fails without the main change, otherwise it's just one commit away from a regression.

garydgregory avatar Jul 29 '22 14:07 garydgregory

This PR needs a test that fails without the main change, otherwise it's just one commit away from a regression.

I agree! The code generator used to generate this PR doesn't have the capabilities to also generate tests. Any of the maintainers of this project are more than welcome to push additional changes to this branch to add tests.

JLLeitschuh avatar Jul 29 '22 16:07 JLLeitschuh

Uh? You created the PR... you can create the test. Don't blame the tool you chose.

garydgregory avatar Jul 29 '22 18:07 garydgregory

Uh? You created the PR...

I used a bot to create the PR on my behalf, as well as generate the fix, as well as generated fixes for 30 other OSS projects with this same bot. I'm currently working through a collection of other vulnerabilities I'm aiming to fix including generating 108 pull request to fix a bunch of OSS projects that are vulnerable to Zip-Slip.

you can create the test.

Unfortunately, this is not something I have time for right now. However, I believe this may fix a security vulnerability, do you agree? If so, consider this a security report from a security researcher, and consider treating it as such, following the Apache standard vulnerability handling profess. If you don't consider this a security vulnerability, and are unwilling to accept this change without a test (which is completely understandable) please feel free to close this.

JLLeitschuh avatar Jul 29 '22 18:07 JLLeitschuh

Well, we are all busy people, most of us here volunteers to boot; without a test I am -1 to merge, I think this should go on the queue for study and hopefully someone will take the time to create a test.i might even look over the weekend.

garydgregory avatar Jul 29 '22 20:07 garydgregory

@ JLLeitschuh I think I understand the time limitation you have to work privately with organizations, and also to work on tests. What about integrating your tool with another existing tool like Google OSS Fuzz? I am not sure if your tool is a fuzzer, or a static or runtime analyzer that tries to use existing exploits against code bases. Maybe there are other tools like Sonarcloud that also support these analyzers.

The advantage of integrating with a tool like OSS Fuzz, for instance, is that they already have a workflow that is used by organizations like ASF, and a process for private disclosure of issues.

Otherwise, even though I believe fixing these issues is beneficial to users, in the end your pull requests may end up stalled for a long time until a volunteer has time to reproduce, add tests, prepare the CVE/disclosure, etc. And it wouldn't be doing any benefit to users or maintainers in this case.

Cheers -Bruno

kinow avatar Jul 29 '22 22:07 kinow

Codecov Report

Merging #371 (8065a17) into master (857a8a2) will decrease coverage by 0.01%. The diff coverage is 100.00%.

@@             Coverage Diff              @@
##             master     #371      +/-   ##
============================================
- Coverage     85.97%   85.95%   -0.02%     
+ Complexity     3089     3088       -1     
============================================
  Files           200      200              
  Lines          7330     7329       -1     
  Branches        905      905              
============================================
- Hits           6302     6300       -2     
  Misses          778      778              
- Partials        250      251       +1     
Impacted Files Coverage Δ
src/main/java/org/apache/commons/io/FileUtils.java 94.49% <100.00%> (-0.01%) :arrow_down:
.../main/java/org/apache/commons/io/input/Tailer.java 85.57% <0.00%> (-0.50%) :arrow_down:

Help us with your feedback. Take ten seconds to tell us how you rate us.

codecov-commenter avatar Jul 29 '22 22:07 codecov-commenter

I am not sure if your tool is a fuzzer, or a static or runtime analyzer that tries to use existing exploits against code bases.

OpenRewrite is a Format Preserving, Abstract Syntax Tree Transformer. Basically, it's performing static code analysis to detect the vulnerability, then creating a fix by transforming the code to fix the vulnerability.

There are Gradle & Maven plugins available: https://docs.openrewrite.org/reference

JLLeitschuh avatar Jul 30 '22 03:07 JLLeitschuh

@JLLeitschuh I am sure you are well-intentioned but the path you chose (arg, pun) is not the best one for the community, please take the time to read https://commons.apache.org/security.html, specifically, an issue like this one should be reported to the security mailing list before being published in the open.

garydgregory avatar Jul 30 '22 11:07 garydgregory

I am sure you are well-intentioned but the path you chose (arg, pun) is not the best one for the community,

I do appreciate the pun 😂

please take the time to read https://commons.apache.org/security.html, specifically, an issue like this one should be reported to the security mailing list before being published in the open.

Although I had not read the Apache Commons Security disclosure page, I assure you I am very familiar with the Apache Software Foundation's vulnerability disclosure process. I've disclosed numerous vulnerabilities to the ASF in the past. I also have over 50 CVEs to my name.

The problem that I encounter is a matter of scale. I can find tens, hundreds, or sometimes thousands of vulnerabilities in OSS with tools like CodeQL, and GitHub Code Search. The problem that I face is how to actually get those vulnerabilities fixed. I have actually attempted private disclosure to organizations when I encounterd a large scale security vulnerability in the past. It took me months of work (example).

Reporting these vulnerabilities one at a time is impractical due to the scale of the problem. I believe that, when compared with other options, full disclosure with a fix is far superior no disclosure at all.

JLLeitschuh avatar Jul 30 '22 11:07 JLLeitschuh

Otherwise, even though I believe fixing these issues is beneficial to users, in the end your pull requests may end up stalled for a long time until a volunteer has time to reproduce, add tests, prepare the CVE/disclosure, etc. And it wouldn't be doing any benefit to users or maintainers in this case.

This is an unfortunate trade off of this methodology. I don't have any control over the maintainer vulnerability handling timeline. I agree that, the net benifit when looking at this on a project-by-project basis is less than ideal. But when looked at at scale, the value proposition to the entire internet becomes clearer.

I can relate to the plight of the OSS maintainer. I am one myself, although not on one with quite as many users as this project.

JLLeitschuh avatar Jul 31 '22 23:07 JLLeitschuh

This change affects code that tries to exclude some files when copying a directory (to avoid https://issues.apache.org/jira/browse/IO-141). I can see how the change can help to avoid stopping some files being added to the exclusion list when they shouldn't. I tried to take the existing FileUtilsTest for IO-141 (testCopyDirectoryToChild) and to add one based on it but where the source dir is not really a parent dir of the dest dir but the existing check mistakenly thinks it is. I have such a test (below) but it works with and without this PR change.

I can't see how this 'issue' can be exploited by a malicious user, even if the PR change is a small improvement.

    /**
     * Test for https://github.com/apache/commons-io/pull/371. The dir name 'par' is a substring of
     * the dir name 'parent' which is the parent of the 'parent/child' dir.
     */
    @Test
    public void testCopyDirectoryWithPotentialFalsePartialMatch() throws Exception {
        final File grandParentDir = new File(tempDirFile, "grandparent");
        final File parentDir = new File(grandParentDir, "parent");
        final File parDir = new File(grandParentDir, "par");
        final File childDir = new File(parentDir, "child");
        createFilesForTestCopyDirectory(grandParentDir, parDir, childDir);

        final List<File> initFiles = LIST_WALKER.list(grandParentDir);
        final List<File> parFiles = LIST_WALKER.list(parDir);
        final long expectedCount = initFiles.size() + parFiles.size();
        final long expectedSize = FileUtils.sizeOfDirectory(grandParentDir) + FileUtils.sizeOfDirectory(parDir);
        FileUtils.copyDirectory(parDir, childDir);
        final List<File> latestFiles = LIST_WALKER.list(grandParentDir);
        assertEquals(expectedCount, latestFiles.size());
        assertEquals(expectedSize, FileUtils.sizeOfDirectory(grandParentDir));
        assertTrue(expectedCount > 0, "Count > 0");
        assertTrue(expectedSize > 0, "Size > 0");
        Set<String> initFilePaths = getFilePathSet(initFiles);
        Set<String> newFilePaths = getFilePathSet(latestFiles);
        newFilePaths.removeAll(initFilePaths);
        assertEquals(parFiles.size(), newFilePaths.size());
    }

    private Set<String> getFilePathSet(List<File> files) {
        return files.stream().map(f -> {
            try {
                return f.getCanonicalPath();
            } catch (IOException e) {
                throw new RuntimeException(e);
            }
        }).collect(Collectors.toSet());
    }

pjfanning avatar Aug 26 '22 20:08 pjfanning

Hi All:

I added the above (slightly modified https://github.com/apache/commons-io/pull/371#issuecomment-1228927498) test to git master.

I am still -1 to merge this PR until a test is created to prove it indeed fixes something

garydgregory avatar Aug 29 '22 13:08 garydgregory