trufflehog
trufflehog copied to clipboard
Drain response body before closing
Description:
This adds calls to io.Copy(io.Discard, res.Body)
prior to closing response bodies, as that is apparently beneficial:
# https://pkg.go.dev/net/http#Response
// The default HTTP client's Transport may not
// reuse HTTP/1.x "keep-alive" TCP connections if the Body is
// not read to completion and closed.
It would be interesting to benchmark/profile this and see whether it has an impact. It might also be beneficial to have verification logic in separate functions, per 1923#discussion_r1376535636.
https://old.reddit.com/r/golang/comments/13fphyz/til_go_response_body_must_be_closed_even_if_you/jjynmj7/ https://andrii-kushch.medium.com/is-it-necessary-to-close-the-body-in-the-http-response-object-in-golang-171c44c9394d
Edit: apparently find+replace missed quite a few instances. I'll fix that later, if this seems worthwhile.
$ rg "defer \w+\.Body\.Close\(\)" --no-line-number --no-filename | grep . | wc -l
577
Checklist:
- [ ] Tests passing (
make test-community
)? - [ ] Lint passing (
make lint
this requires golangci-lint)?
Good find, I wasn't aware of this.
I think we'd also want to set MaxConnsPerHost: https://go.dev/src/net/http/transport.go#L211 I think setting it to 1 would be respectful to the service providers?
That means we'd also need to be consistently using the same HTTP client, which we need to ensure for other reasons too (setting user agent and future functionality).
I think we'd also want to set MaxConnsPerHost: https://go.dev/src/net/http/transport.go#L211 I think setting it to 1 would be respectful to the service providers?
A low value between 1-5 is definitely the thoughtful option. I'd be curious how/if it affects performance.
Nice find. We should test the AWS detector specifically with this change, as it's historically shown some utterly bizarre behavior w/r/t response body closing. (There's a big comment in its code explaining things.)
I think we'd also want to set MaxConnsPerHost: https://go.dev/src/net/http/transport.go#L211 I think setting it to 1 would be respectful to the service providers?
A low value between 1-5 is definitely the thoughtful option. I'd be curious how/if it affects performance.
I think there was a time a few months ago where I went down this path and used a single client shared across all the detectors and tweaked the transport values. It appeared to have a small negative impact, but i'm not confident I set it up correctly. I'd also be curious to see how this affects performance when done correctly.
I've been busy and haven't been able to look into benchmarks for this change. Happy to run specific commands and show the results, if someone can supply them.
Regarding changing MaxConnsPerHost
or other HTTP client settings, I am happy to add it to this PR but worry about adverse side effects. This is currently a large but (seemingly) straightforward change.