trufflehog icon indicating copy to clipboard operation
trufflehog copied to clipboard

errors from client.Do(req) are ignored - leading to potential false negatives

Open nyanshak opened this issue 1 year ago • 18 comments

There is a common pattern in detectors that can lead to undetected false negative results (e.g., result.Verified is set to false even though the issue is valid).

The pattern:

for _, match := range matches {
    ...
    if verify {
        ...
        
        res, err := client.Do(req)
        if err == nil {
            // do all the checks for known false positives and against status codes
        }
        // BUG: in the case where err != nil, the error is ignored
    }

    results = append(results, s1)
}

So in the case where some error has occurred while doing the http request, the error from client.Do(req) is ignored, and the result will have Verified field set to false even though the match may be valid. This may lead to missing valid results and not knowing that an error occurred while scanning that maybe could be retried / re-scanned.

Here are a few examples (though it's in almost every detector):

  • https://github.com/trufflesecurity/trufflehog/blob/main/pkg/detectors/cicero/cicero.go#L56-L67
  • https://github.com/trufflesecurity/trufflehog/blob/main/pkg/detectors/launchdarkly/launchdarkly.go#L55-L66
  • https://github.com/trufflesecurity/trufflehog/blob/main/pkg/detectors/jiratoken/jiratoken.go#L75-L81

The Okta detector appears to do this correctly and would be a good example of how this should be done instead: https://github.com/trufflesecurity/trufflehog/blob/main/pkg/detectors/okta/okta.go#L63-L66. As a suggested fix, replacing the broken pattern with the one by okta detector may be ideal.

(Similarly, Okta detector reports the error in constructing the request where other detectors do not, though I'm not sure if this is as much of an issue: https://github.com/trufflesecurity/trufflehog/blob/main/pkg/detectors/okta/okta.go#L55-L58)

nyanshak avatar Mar 20 '23 17:03 nyanshak

Hey !

I will try to fix this quickly.

Is it relevant to work on still adding the results but, for each, telling the user there has been some errors during verification ? Or logs will be enough ?

manquiche avatar Mar 29 '23 21:03 manquiche

I think you could go a few ways on it, but it's essential that the errors get returned in FromData calls so that callers know to retry (e.g., that they should re-run trufflehog, or call scanner.FromData again).

This could be something like:

var err error
...
for _, idMatch := range idMatches {
    ...
    res, doErr := client.Do(req)
    if doErr != nil {
         // Option 1: join the error but continue processing the rest of the results
        err = errors.Join(err, doErr)
        continue

        // Option 2: immediately fail & return
       return results, err

       // Option 3: (really more of an extension of the others)
       // Add some sort of retry mechanism w/ exponential backoff, and if you fail N
       // times, then you handle as if it was Option 1 / Option 2
    }
    ... do all the rest of the validation ...

}

For my use case, either returning the error immediately with partial results (Option 1 above) or returning as many results as possible with the errors joined (Option 2) are fine. Option 3 / supporting retries might be nice, but makes things more complicated.

nyanshak avatar Mar 30 '23 15:03 nyanshak

Alright ! I will try option 1 for now and see how it goes !

Thank you for the clarification.

manquiche avatar Apr 01 '23 11:04 manquiche

Hey, sorry for the delay.

I tried something and I'd like to have your approval before modifying every other detector.

I have this :

engine.go

results, err := func() ([]detectors.Result, error) {
	ctx, cancel := context.WithTimeout(ctx, time.Second*10)
	defer cancel()
	defer common.Recover(ctx)
	return detector.FromData(ctx, verify, decoded.Data)
}()
if err != nil {
	if errors.Is(err, pkg.ErrVerify) {
		ctx.Logger().Error(err, "could not verify",
			"source_type", decoded.SourceType.String(),
		)
	} else {
		ctx.Logger().Error(err, "could not scan chunk",
			"source_type", decoded.SourceType.String(),
			"metadata", decoded.SourceMetadata,
		)
		continue
	}
}

aws.go

res, err := client.Do(req)
if err != nil {
	verifyError = errors.Join(pkg.ErrVerify, err)
} else {

	if res.StatusCode >= 200 && res.StatusCode < 300 {
		identityInfo := identityRes{}
		err := json.NewDecoder(res.Body).Decode(&identityInfo)
		if err == nil {
			s1.Verified = true
			s1.ExtraData = map[string]string{
				"account": identityInfo.GetCallerIdentityResponse.GetCallerIdentityResult.Account,
				"user_id": identityInfo.GetCallerIdentityResponse.GetCallerIdentityResult.UserID,
				"arn":     identityInfo.GetCallerIdentityResponse.GetCallerIdentityResult.Arn,
			}
		}
		res.Body.Close()
	} else {
		// This function will check false positives for common test words, but also it will make sure the key appears "random" enough to be a real key.
		if detectors.IsKnownFalsePositive(resSecretMatch, detectors.DefaultFalsePositives, true) {
			continue
		}
	}
}

FromData returns verifyError if not any other errors occured (maybe nil)

return awsCustomCleanResults(results), verifyError

With all this, here is the output (it could feel bloated): trufflehog_errors_preview

Is it ok for you or do you have some other ideas please ?

Thanks in advance

manquiche avatar Apr 07 '23 10:04 manquiche

I think overall this is okay with me, but this might be good to get feedback from other consumers of the tool.

One change I might make is to optionally redact the URL from url.Error. I made an example of how to do this: https://go.dev/play/p/b2XLU9k3CXZ. This probably matters less for direct trufflehog usage (since secrets are directly output to stdout). For users using detector.FromData(...), being able to safely log the errors (or documenting that they should be redacted), is likely safer to prevent secrets leaking in logs.

nyanshak avatar Apr 07 '23 15:04 nyanshak

Thanks for the advice, that is completely correct. I will work on that this weekend.

manquiche avatar Apr 08 '23 16:04 manquiche

Last check with you before I update every detectors.

I feel like this is pretty good, is it ok for you ?

stripe.go FromData:

// FromData will find and optionally verify Stripe secrets in a given set of bytes.
func (s Scanner) FromData(ctx context.Context, verify bool, data []byte) (results []detectors.Result, err error) {
	var verifyError error

	dataStr := string(data)

	matches := secretKey.FindAllString(dataStr, -1)

	for _, match := range matches {

		s := detectors.Result{
			DetectorType: detectorspb.DetectorType_Stripe,
			Raw:          []byte(match),
		}

		if verify {

			baseURL := "https://api.stripe.com/v1/charges"

			client := common.SaneHttpClient()

			// test `read_user` scope
			req, err := http.NewRequestWithContext(ctx, "GET", baseURL, nil)
			if err != nil {
				continue
			}
			req.Header.Add("Authorization", fmt.Sprintf("Bearer %s", match))
			req.Header.Add("Content-Type", "application/json")
			res, err := client.Do(req)

			if err != nil {
				urlErr := &url.Error{
					URL: req.URL.Host,
					Op:  req.Method,
					Err: errors.Unwrap(err),
				}
				verifyError = fmt.Errorf("%w: %s", pkg.ErrVerify, urlErr)
			} else {
				res.Body.Close() // The request body is unused.

				if res.StatusCode == http.StatusOK || res.StatusCode == http.StatusForbidden {
					s.Verified = true
				}
			}
		}

		if !s.Verified && detectors.IsKnownFalsePositive(string(s.Raw), detectors.DefaultFalsePositives, true) {
			continue
		}

		results = append(results, s)
	}

	return results, verifyError
}

Output: image

manquiche avatar Apr 09 '23 09:04 manquiche

Thank you for surfacing this issue. I think option 1 that @nyanshak proposed would be preferred. A PR would definitely be appreciated.

dustin-decker avatar Apr 10 '23 11:04 dustin-decker

:eyes: I like the approach of using req.URL.Host here so that you keep at least some minimum info in the error. Not having any indication that the URL was redacted in this error might be slightly misleading, but I think this is infinitely better than (1) completely masking the error and (2) possibly leaking secrets in the errors.

Legend, @ManQuiche :heart:

nyanshak avatar Apr 10 '23 15:04 nyanshak

Thank you for the time guys, PR is incoming.

manquiche avatar Apr 10 '23 16:04 manquiche

Quick update on this : almost everything is done but I need to test thoroughly and make sure I did all right because there are a lot of files to change.

manquiche avatar Apr 19 '23 09:04 manquiche

Thanks @ManQuiche, that's really great.

dustin-decker avatar Apr 20 '23 16:04 dustin-decker

Hey @ManQuiche, any new updates on this change?

dustin-decker avatar May 26 '23 18:05 dustin-decker

Hello @dustin-decker.

I've done almost everything but I can't get the time right now to finish all... I would have liked to test a bit more but I never managed to setup everything.

Is it better for you if I give you my work right now or if I still wait to see if I'm more available in the near future ?

manquiche avatar May 30 '23 08:05 manquiche

Yes, please upload what you have. You can upload a draft PR if you'd like. We can take a look at what it will take to finish it up.

dustin-decker avatar May 30 '23 16:05 dustin-decker

Hello again, I'm updating my branch tomorrow with main and doing a pull request. I'm sorry it's not in the state I wanted to give you guys. I will probably have some time this weekend to do more !

manquiche avatar Jun 06 '23 21:06 manquiche

Hello ! Here is the pull request #1395 .

I'll try to come back as fast as I can.

Have a good day

manquiche avatar Jun 14 '23 12:06 manquiche

related PR: https://github.com/trufflesecurity/trufflehog/pull/2187.

Progress on this issue will continue

zricethezav avatar Feb 16 '24 21:02 zricethezav