caddy icon indicating copy to clipboard operation
caddy copied to clipboard

Feature request: Log request size

Open obrienmd opened this issue 3 years ago • 9 comments

This may of course be a naive request, given streaming may make it more difficult than is obvious. I'd love to be able to log size (in bytes, like the response size) of incoming requests, and I don't see that in the access log JSON object.

e.g.

  1. Client POSTS to /foo/bar with a few fields (say, 1600 bytes total).
  2. "size": 1600 is included in the "request" key of the access log JSON object.

The motivation for this is to simplify debugging when I believe a client is sending a body that doesn't match its Content-Length header, so we can rule out our beloved reverse proxy causing the issue :)

Thanks!

obrienmd avatar Dec 22 '21 23:12 obrienmd

Based on my read (and my results in testing), that's the response size.

On Tue, Dec 28, 2021 at 8:58 AM Conor Evans @.***> wrote:

I had a look at this -- is this feature not already in place? https://github.com/caddyserver/caddy/blob/master/modules/caddyhttp/server.go#L196 .

caddy run

2021/12/28 17:04:42.213 ←[34mINFO←[0m using adjacent Caddyfile

...

2021/12/28 17:04:42.242 ←[34mINFO←[0m serving initial configuration

2021/12/28 17:04:50.598 ←[34mINFO←[0m http.log.access.log0 handled request {"request": {"remote_ip": "::1", "remote_port": "54405", "proto": "HTTP/1.1", "method": "GET", "host": "localhost:2015", "uri": "/", "headers": {"User-Agent": ["curl/7.55.1"], "Accept": ["/"]}}, "user_id": "", "duration": 0.0007042, "size": 13, "status": 200, "resp_headers": {"Server": ["Caddy"], "Content-Type": []}}

Is shouldLogRequest() false in your context? https://github.com/caddyserver/caddy/blob/master/modules/caddyhttp/server.go#L473

— Reply to this email directly, view it on GitHub https://github.com/caddyserver/caddy/issues/4494#issuecomment-1002197209, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAC2L57WSPRUYXYCBBSQLWTUTHUE5ANCNFSM5KTR256A . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

You are receiving this because you authored the thread.Message ID: @.***>

obrienmd avatar Dec 28 '21 17:12 obrienmd

You're right that streaming makes it trickier.

We might be able to wrap the request reader to count the amount of bytes read and store the total in the request context.

That would mean that the request size might be zero if the request body is ignored or not read, depending on how the request is handled.

Here's an example of how it would look: https://stackoverflow.com/a/53402924

francislavoie avatar Dec 28 '21 17:12 francislavoie

yep @obrienmd realized that a moment too late 😅 (I removed my comment)

conorevans avatar Dec 28 '21 17:12 conorevans

Wrapping the request's body's reader with something that counts is probably the way to go about it. Similar to how we set the request body here: https://github.com/caddyserver/caddy/blob/master/modules/caddyhttp/requestbody/requestbody.go.

mholt avatar Dec 28 '21 18:12 mholt

Hello,

I'm sorry to bump this ticket, but is there a recommended workaround while this feature gets written?

The logs currently do expose a "size" and "res_headers.content-length" keys but, in my tests, both of them do always return the same values, which is the response size and not the request size.

The Prometheus metrics do expose a caddy_http_request_size_bytes, so it is somehow possible to get that information, but I would prefer not to manage a Prometheus instance. And pushing that metric to another metrics system (like Cloudwatch) do not help since it's an histogram metric and is not really usable elsewhere than Prometheus/Grafana, as far as I can tell (and, finally, those metrics are "server wide" while having the request size printed in the logs would allow a finer granularity).

Is there already a plugin that could help with that?

I can ask that question on the forum if it would be more suited.

cc @mholt @francislavoie

simon-lejeune avatar Sep 13 '22 12:09 simon-lejeune

@simon-lejeune The prometheus metric is taking an "educated guess" at the request size by trusting the Content-Length and adding an approximate count of bytes based on headers. Is an estimation good enough for you? Even if the Content-Length header set by the client is an outright lie?

FWIW, the logs already do write the request's Content-Length header... can you use that?

mholt avatar Sep 13 '22 15:09 mholt

@mholt thank you a lot for the quick answer.

I indeed somehow missed the "request.content-length" key, sorry about that. I think I can work with that while caddy implements a more trustable equivalent of nginx's $request_length, thank you.

On that topic, do you think it's doable in a Caddy plugin? By mixing the current Prometheus educated guess implementation with io.Copy on the request's body and finally adding the sum of all of that in a new key in default json logger? This implementation would have a performance cost obviously (especially versus your idea here), but maybe could work as a stop gap while Caddy bundles that option, and is also an interesting exercice.

simon-lejeune avatar Sep 13 '22 16:09 simon-lejeune

@simon-lejeune Absolutely can be done in a plugin, as all of Caddy's features are plugins too :)

It won't be able to change Caddy's access logs, but you can emit your own log from the plugin!

mholt avatar Sep 13 '22 16:09 mholt

@mholt great, thank you for those insights

simon-lejeune avatar Sep 13 '22 16:09 simon-lejeune

Can I work on this issue?

tars-a avatar Mar 05 '23 13:03 tars-a

Yep! PRs welcome.

francislavoie avatar Mar 05 '23 13:03 francislavoie