logbook icon indicating copy to clipboard operation
logbook copied to clipboard

Possible OOM with large servlet response

Open wintermute0 opened this issue 1 year ago • 7 comments

Hi,

I'm reading the source code and just wondering that why there is no size limit when writing to the branched OutputStream? What if there is a really large HTTP response? Then the whole thing will be stuck inside a ByteArrayOutputStream which will easily use up all your memory. Also there is a configuration property called "logbook.write.max-body-size". You would think that it should be able to put an upper limit to the memory footprint no matter the size of the actual HTTP response body, but looks like it did not.

https://github.com/zalando/logbook/blob/af6f818f4fcc17167f2508abe9eb8f81eb8148b8/logbook-servlet/src/main/java/org/zalando/logbook/servlet/LocalResponse.java#L285

wintermute0 avatar Jan 31 '24 08:01 wintermute0

You have a point here, as the response is copied in LocalResponse classes, the allocation of memory is doubled. We could use the max-body-size to enforce a limit, let's say, counting the worst case scenario, that 4 bytes are used per character. I'm not sure how big of a risk this is. Who would enable response body logs for cases when it can be that big? Even truncation of the body doesn't make sense here.

kasmarian avatar Feb 01 '24 13:02 kasmarian

Well, here is our story.

We use logbook in a legacy Spring 4.x system and we have a LogFilter for URL pattern "/*". We also have a max-body-size in place. Everything was fine for a couple of years, util a recent deployment. A file downloading functionality was added by one of our team members. And then with some users downloading files > 2GB, our server just exploded :(

I think the real risk is that the "max-body-size" configuration made everybody thought that it should be safe to Filter /*. But actually it's not.

wintermute0 avatar Feb 02 '24 08:02 wintermute0

After some internal discussions with the team, it doesn't look like we'll be able to address this fully without changing how Logbook works entirely. The problem is that Logbook buffers the requests and responses not only in servlets implementation, but for all the HTTP clients and other server types.

~Another problem is that Logbook predicates evaluate the request/response already after this buffering is made, as the request/response are converted to the internal types at the same time. So you can't exclude the paths, where you send the big payloads from the buffering path, really.~

What you could potentially do for your case for now is just excluding the paths that are using the heavy payload from Logbook (probably you already did that) ~writing your own Filter implementation, where you exclude certain paths before calling Logbook.~

kasmarian avatar Feb 18 '24 11:02 kasmarian

Are you sure about the predicate already operating on a buffered request? It used to be the case that predicates run before buffering and are therefore cheap. There are defaults to exclude binary content types, iirc.

On Sun, Feb 18, 2024, 12:41 Karen Asmarian @.***> wrote:

After some internal discussions with the team, it doesn't look like we'll be able to address this fully without changing how Logbook works entirely. The problem is that Logbook buffers the requests and responses not only in servlets implementation, but for all the HTTP clients and other server types.

Another problem is that Logbook predicates evaluate the request/response already after this buffering is made, as the request/response are converted to the internal types at the same time. So you can't exclude the paths, where you send the big payloads from the buffering path, really.

What you could potentially do for your case for now is writing your own Filter implementation, where you exclude certain paths before calling Logbook.

— Reply to this email directly, view it on GitHub https://github.com/zalando/logbook/issues/1754#issuecomment-1951212919, or unsubscribe https://github.com/notifications/unsubscribe-auth/AADI7HOYZB5IZEDXJYFPSU3YUHSFDAVCNFSM6AAAAABCSTX462VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTSNJRGIYTEOJRHE . You are receiving this because you are subscribed to this thread.Message ID: @.***>

whiskeysierra avatar Feb 18 '24 11:02 whiskeysierra

Ah, @whiskeysierra is right, the buffering will be used only if the request is eligible for the logging.

Do you think it would help if we mention that logbook.write.max-body-size only affect the size of the output, but does not prevent the buffering of the body?

Another possible suggestion for @wintermute0 I can think of is to define a custom Strategy, if you still want to log response when customers download files, but to not log the body.

As per the defaults to exclude binary content types, as far as I can tell, FilteredHttpResponse still retrieves the body from the org.zalando.logbook.HttpResponse when getBodyAsString is called. So the buffering would still happen, even through the body will not be logged, if the ResponseFilter is configured to skip the body.

kasmarian avatar Feb 18 '24 14:02 kasmarian

I think it would definitely help to mention the buffering behavior in the documentation for logbook.write.max-body-size. As for our particular use case we have already found a way to exclude the download request from logging. Thanks for your help :)

wintermute0 avatar Feb 19 '24 01:02 wintermute0