
Intermittent `signal: broken pipe` error from Git.handleBinary

Open rgmz opened this issue 5 months ago • 5 comments


TruffleHog Version

3.67.5

Trace Output

2024-02-11T17:39:50-05:00       error   trufflehog      error reading chunk     {"source_manager_worker_id": "sIEwQ", "repo": "https://github.com/matryer/xbar.git", "commit": "2c5f306", "path": "archive/bitbar/App/Vendor/Sparkle/Tests/Resources/SparkleTestCodeSignApp.dmg", "timeout": 30, "error": "bzip2: corrupted input: invalid stream magic"}
2024-02-11T17:39:50-05:00       error   trufflehog      error reading chunk     {"source_manager_worker_id": "sIEwQ", "repo": "https://github.com/matryer/xbar.git", "commit": "2c5f306", "path": "archive/bitbar/App/Vendor/Sparkle/Tests/Resources/SparkleTestCodeSignApp.dmg", "timeout": 30, "error": "bzip2: corrupted input: invalid stream magic"}
2024-02-11T17:39:50-05:00       error   trufflehog      error unarchiving chunk.        {"source_manager_worker_id": "sIEwQ", "repo": "https://github.com/matryer/xbar.git", "commit": "2c5f306", "path": "archive/bitbar/App/Vendor/Sparkle/Tests/Resources/SparkleTestCodeSignApp.tar", "timeout": 30, "error": "archive/tar: invalid tar header"}
2024-02-11T17:39:50-05:00       error   trufflehog      error waiting for command       {"source_manager_worker_id": "sIEwQ", "repo": "https://github.com/matryer/xbar.git", "command": "/usr/bin/git -C /tmp/trufflehog-169189-3577558288/.git cat-file blob 2c5f3063aa5d4f18c0baeaed6d5fe048b38731a4:archive/bitbar/App/Vendor/Sparkle/Tests/Resources/SparkleTestCodeSignApp.tar", "stderr": "", "commit": "2c5f3063aa5d4f18c0baeaed6d5fe048b38731a4", "error": "signal: broken pipe"}

This often seems to be preceded by archive errors, but not always. I'm not sure what to make of that, as I haven't looked into the code path yet.

Expected Behavior

Binary files are reliably read from Git repos.

Actual Behavior

Reading binary files from Git repos intermittently fails with `signal: broken pipe` when executing `git cat-file`.

2024-02-11T17:39:50-05:00       error   trufflehog      error waiting for command       {"source_manager_worker_id": "sIEwQ", "repo": "https://github.com/matryer/xbar.git", "command": "/usr/bin/git -C /tmp/trufflehog-169189-3577558288/.git cat-file blob 2c5f3063aa5d4f18c0baeaed6d5fe048b38731a4:archive/bitbar/App/Vendor/Sparkle/Tests/Resources/SparkleTestCodeSignApp.tar", "stderr": "", "commit": "2c5f3063aa5d4f18c0baeaed6d5fe048b38731a4", "error": "signal: broken pipe"}

https://github.com/trufflesecurity/trufflehog/blob/f35185e2152d562a62d07031288fbd03b13ceeea/pkg/sources/git/git.go#L1168-L1193

Steps to Reproduce

Unsure; it does not consistently happen.

Environment

Windows 10, WSL (Ubuntu 22.04.3 LTS)

Additional Context

PR #2174 replaced go-git with direct calls to Git. While this fixed memory consumption issues, it seems that more work is required to stabilize the change.

References

N/A

rgmz avatar Feb 11 '24 22:02 rgmz

I have a new GitHub repo that fails 100% of the time with `"error":"signal: broken pipe"` when reading a .tar file, which I believe is being broken into several files. This seems similar to this issue #2419.

dwilliamsstc avatar Apr 15 '24 20:04 dwilliamsstc

@ahrav does this feel familiar to anything you've been looking at recently?

rosecodym avatar Apr 22 '24 15:04 rosecodym

It sure does. @dwilliamsstc would that repo happen to be public, so that we can use it to test against? I think this PR should address the issue, but I'm not 100% certain just yet.

ahrav avatar Apr 22 '24 15:04 ahrav

@ahrav The issue isn't specific to any repo or file, as far as I can tell. For me, some scans encounter no issues, whereas other scans have the issue on every file.

rgmz avatar Apr 24 '24 14:04 rgmz

@ahrav - sorry, the repo having problems is a private one.

dwilliamsstc avatar May 01 '24 19:05 dwilliamsstc