SyliusResourceBundle icon indicating copy to clipboard operation
SyliusResourceBundle copied to clipboard

Add support for HttpObjectAggregator in server pipeline

Open alexander-yakushev opened this issue 7 years ago • 20 comments

Symmetrical to #393, this PR adds support for HttpObjectAggregator being added to the server pipeline. It also makes possible to solve #452 from the user code. Also, even if #452 is someday fixed by introducing :max-request-body-size option, this PR can be a vessel for it.

Regarding the implementation, notice that we don't release the content ByteBuf immediately unlike with other *HttpContent frames. I tried doing that, and Netty errors with refcount being negative. My understanding is that it is released here together with the whole request.

I also took the liberty of removing an unused import.

alexander-yakushev avatar Feb 11 '19 08:02 alexander-yakushev

I've asked this previously... If we want to add full support for HttpObjectAggregator, maybe we should also add configuration to include HttpObjectAggregator with given settings? In some cases, this might be helpful (rejecting large requests, working with inlined execution instead of response-executor etc).

kachayev avatar Feb 11 '19 11:02 kachayev

I agree, and it looks to me that such functionality could build upon this PR later.

alexander-yakushev avatar Feb 11 '19 12:02 alexander-yakushev

Sure. I was going to work on #452 somewhat later. And #482 needs to be merged before introducing full-fledged support for aggregation, in fact, HttpObjectAggregator handles 100-continue separately.

kachayev avatar Feb 11 '19 12:02 kachayev

Regarding releasing the content... You're right, it will happen here and it somewhat "accidental". The previous implementation relied on the fact, that it releases HttpMessage and the body was already released by the handler. Now we have FullHttpRequest which is HttpMessage + body, meaning that releasing actually frees both of them, so we should not release it earlier. Feels kinda fragile to me, but that's what you have for aggregated messages.

kachayev avatar Feb 11 '19 12:02 kachayev

@alexander-yakushev I think for both #483 and #484 we need to implement the same functionality for raw-* handlers. Otherwise, it might break the contract when setting :raw-stream? true. Happy to update the implementation later, just need to set a reminder.

kachayev avatar Feb 11 '19 13:02 kachayev

I thought about it a little, and to me it doesn't make much sense to combine HttpObjectAggregator with raw-streams?. The latter is for cases when you want to consume the body asynchronously, and HttpObjectAggregator by definition disables asynchronous handling.

I don't mind having it there for "completeness" though, just want to be sure it doesn't add more confusion than value.

alexander-yakushev avatar Feb 11 '19 13:02 alexander-yakushev

You still can use :raw-stream? for example to get access to Netty's ByteBufs. I'm mostly worried about the confusion when switching from ring handler to a raw stream realizing that they process different sets of messages from your pipeline.

kachayev avatar Feb 11 '19 14:02 kachayev

Looked at it again; this now looks more tricky than I thought. If I understand correctly, users are expected to release ByteBufs manually in raw-stream? mode. So, the content of FullHttpRequest shouldn't be released, yet it does get released together with the request in handle-request. We need somehow to release FullHttpRequest without the content. Perhaps, by acquireing the content an extra time? It all starts to look even more wobbly at this point.

alexander-yakushev avatar Feb 11 '19 22:02 alexander-yakushev

@alexander-yakushev Well... we need to call handle-request, put a single message with the content to a stream and close it right after. In such a case we can probably retain content before putting on the stream. In fact, FullHttpRequest just proxies retain/release method calls to a content.

kachayev avatar Feb 11 '19 22:02 kachayev

Added support for HttpObjectAggregator for the raw handler too. Tested only rudimentary, to the best of my ability:

(require '[aleph.http :as http]
         '[manifold.stream :as ms])

(def srv (http/start-server
          (fn [r] (println @(ms/reduce conj [] (:body r))))
          {:port 12345
           :raw-stream? true
           :pipeline-transform #(.addAfter % "http-server" "aggreg"
                                           (io.netty.handler.codec.http.HttpObjectAggregator. 1000))}))

...

;; curl -d '{}' 'localhost:12345'
[#object[io.netty.buffer.CompositeByteBuf 0x5f926fa4 CompositeByteBuf(ridx: 0, widx: 2, cap: 2, components=1)]]

alexander-yakushev avatar Feb 12 '19 21:02 alexander-yakushev

FWIW, this patch looks good to me! I rebased it onto latest master and added a few test cases for it over here: https://github.com/bevuta/aleph/tree/server-aggregator

DerGuteMoritz avatar Jul 30 '22 20:07 DerGuteMoritz

@DerGuteMoritz Thanks for tackling this, and thanks for the tests! It seems ok, but we need to double-check a lot of this old code.

  • How does it interact with #483, as they mentioned above?
  • Also, is it releasing all the necessary bytebufs correctly? Check with ResourceLeakDetector.setLevel() or -Dio.netty.leakDetectionLevel=PARANOID
  • What's the error-handling situation look like?

KingMob avatar Aug 03 '22 06:08 KingMob

The discussion above where this PR is referenced only did so to consider the :raw-stream? case in that one, too. That case was subsequently addressed in this here PR by increasing the reference count before handing it over via the stream so that the contract that the individual ByteBufs must be released by the consumer is upheld. This looks like a reasonable solution to me and appears to work as intended.

  • Also, is it releasing all the necessary bytebufs correctly? Check with ResourceLeakDetector.setLevel() or -Dio.netty.leakDetectionLevel=PARANOID

This should be covered by means of having the test cases, right? Note https://github.com/clj-commons/aleph/blob/master/test/aleph/http_test.clj#L29

  • What's the error-handling situation look like?

Should be fine as is. A FullHttpRequest is essentially a combination of a HttpRequest and a LastHttpContent, so the patch basically just does the same thing as for these individual pieces in one go (save for the extra reference count increment mentioned above). To be sure, I'll add a test case which intentionally throws an exception from the handler, though. Sound good?

DerGuteMoritz avatar Aug 03 '22 12:08 DerGuteMoritz

Hello @KingMob!

  1. #483 is independent of this because that PR fixes the HTTP client behavior, and this PR is about the aggregator in the server code.
  2. I can't properly check it right now, but this code has been running in different heavy-loaded scenarios for almost 3 years, no issues detected.

alexander-yakushev avatar Aug 03 '22 12:08 alexander-yakushev

The only thing that bother me a bit is the fact we don't have any tests with a pipeline containing HttpObjectAggregator. (only for the websocket)

arnaudgeiser avatar Aug 03 '22 22:08 arnaudgeiser

This should be covered by means of having the test cases, right? Note https://github.com/clj-commons/aleph/blob/master/test/aleph/http_test.clj#L29

Welp, maybe not: I am not able to consistently trigger the leak detector even when I intentionally leak buffers in :raw-stream? true handlers with paranoid mode enabled as per the cited line. Only about 1 in 20 runs ends up triggering the leak detector. Also tried adding (System/gc) and (System/runFinalization) to the end of the test but no dice. This is true even when HttpObjectAggregator is not involved.

FTR: I noticed this when trying to add a test case with an intentionally throwing handler: AFAICT, in the :raw-stream? true case, such an unhandled exception would always lead to a leak since the default error handler doesn't care about releasing the bytebufs in the request body stream. This is of course an orthogonal problem which I will file a separate issue about. But not quite sure yet how to go about this with the leak detector being so unreliable :disappointed: Any hints appreciated.

DerGuteMoritz avatar Aug 05 '22 11:08 DerGuteMoritz

The only thing that bother me a bit is the fact we don't have any tests with a pipeline containing HttpObjectAggregator. (only for the websocket)

I added some in my rebased version of this branch: https://github.com/bevuta/aleph/commit/a4d03a2f736121ddccd0aaac5bca2f597d51366a

@alexander-yakushev Feel free to update this PR to include my changes. If you'd rather not be bothered, I can also open a fresh one based on my branch!

DerGuteMoritz avatar Aug 05 '22 12:08 DerGuteMoritz

FTR: I noticed this when trying to add a test case with an intentionally throwing handler: AFAICT, in the :raw-stream? true case, such an unhandled exception would always lead to a leak since the default error handler doesn't care about releasing the bytebufs in the request body stream. This is of course an orthogonal problem which I will file a separate issue about. But not quite sure yet how to go about this with the leak detector being so unreliable disappointed Any hints appreciated.

This is unfortunately too late for Aleph to take care of this. When using :raw-stream?, you have to be aware that some of your Netty ByteBuf on your body (manifold.stream) will never reach your application code in case of Exception. That's your responsibility to catch it, close the stream and drain it to apply .release.

This would be a breaking change with huge consequences. As soon as you are using io.netty.buffer.PooledByteBufAllocator (and thus io.netty.buffer.PooledByteBuf) the last call to .release will bring back the ByteBuf to the pool which can be reused by something else. In the worst-case scenario, we might end up calling .release twice on the same ByteBuf where the second call could release something that does not belong to us anymore.

To me, the only thing we can do here is better documentation.

arnaudgeiser avatar Aug 05 '22 12:08 arnaudgeiser

Rebased and cherry-picked tests commit off @DerGuteMoritz's branch.

alexander-yakushev avatar Aug 05 '22 16:08 alexander-yakushev

Whether @KingMob is fine with the state of this PR, I would merge it.

arnaudgeiser avatar Aug 08 '22 19:08 arnaudgeiser

Thank you @alexander-yakushev and @DerGuteMoritz for the hard work! Let's say it's never too late :)

arnaudgeiser avatar Aug 15 '22 11:08 arnaudgeiser