unit HTTP: request chunked body support.

A patch for feature demonstration.

Note:

This patch assumes that the chunked body and header are received all at once.

May 14 '24 10:05 hongzhidao

fixes: #445

May 20 '24 21:05 avahahn

Here's a patch that fixes the formatting nits:

diff --git a/src/nxt_h1proto.c b/src/nxt_h1proto.c
index 6550efb9..16f6dcaa 100644
--- a/src/nxt_h1proto.c
+++ b/src/nxt_h1proto.c
@@ -893,10 +893,8 @@ nxt_h1p_request_body_read(nxt_task_t *task, nxt_http_request_t *r)
 
         nxt_buf_t *out = nxt_http_chunk_parse(task, &h1p->chunked_parse, in);
 
-        if (h1p->chunked_parse.chunk_error ||
- h1p->chunked_parse.error) {
-            status = NXT_HTTP_NOT_IMPLEMENTED
-;
+        if (h1p->chunked_parse.chunk_error || h1p->chunked_parse.error) {
+            status = NXT_HTTP_NOT_IMPLEMENTED;
             goto error;
         }
 
@@ -904,8 +902,7 @@ nxt_h1p_request_body_read(nxt_task_t *task, nxt_http_request_t *r)
             out->next = nxt_buf_sync_alloc(r->mem_pool, NXT_BUF_SYNC_LAST);
             out = nxt_h1p_chunk_create(task, r, out);
             r->body = out;
-            r->state->ready_handler(task, r,
-NULL);
+            r->state->ready_handler(task, r, NULL);
             return;
         }
     }

Can we get this PR converted to a non-draft PR? I think we should push this in as a solution for chunked transfer encoded requests.

May 20 '24 22:05 avahahn

Can we get this PR converted to a non-draft PR?

Does non-draft PR mean it's ready to review? And it will run with GH actions for tests?

May 21 '24 01:05 hongzhidao

Can we get this PR converted to a non-draft PR?

Does non-draft PR mean it's ready to review? And it will run with GH actions for tests?

I generally regard a non-draft PR as reviewable (whether I've been assigned to review or not), sometimes I'll do a rough review of a draft PR depending on circumstances...

Draft and non-draft PR's both get run through the GH actions, of which you can generally ignore the Ruby 3.2 failures....

May 21 '24 01:05 ac000

Good to know, thanks. Btw, the way to support the chunked body should be different from the patch. The final solution is not clear yet for me.

May 21 '24 02:05 hongzhidao

Btw, the way to support the chunked body should be different from the patch. The final solution is not clear yet for me.

Can you say more about how this approach doesn't feel quite right to you?

We need to land something ASAP, and should act as though the deadline to have a mergeable / testable branch is the end of this week.

Is the current approach good enough for that, or do you think this approach is so flawed that you would block merging?

May 21 '24 18:05 callahad

We need to land something ASAP, and should act as though the deadline

Actually, I don't think we do, not in the main Unit repository anyway...

to have a mergeable / testable branch is the end of this week.

There is nothing preventing you from having a branch in whoevers personal clone for testing purposes or even for actual use until a correct solution is merged upstream...

May 21 '24 19:05 ac000

We have an internal opportunity with a constraint that requires landing functional (but not perfect) support for Transfer-Encoding: chunked in master this calendar month.

Rightly or wrongly, missing that will close the window on that particular opportunity.

May 21 '24 19:05 callahad

Can you say more about how this approach doesn't feel quite right to you?

Sure. Let me list some of the challenges.

About the pipeline process. Unit supports pipeline requests. For any request, it doesn't handle the request after it. Since for content-length body, we know how much to read based on the content-length size from the request header. But for chunked body, the size isn't known until reading the data in the request body and parsed successfully for each chunk. So to solve the problem, we need to check if we need to change the design of limiting not to read or process the next request for the current request. I'd prefer to keep the design, but it looks like chunked body might break it.
About the body for size that is larger than the size of the buffer. In Unit, if the request body is larger than the buffer which is 8k size, the body will be dumped into a temporary file. For content-length request, after reading request header, we've known if the body will be saved into a memory buffer or a file buffer. But for chunked body, we have to check it while reading and parsing the body. To write a patch to be ready to review, I feel we need to change the implementation of content-length body reading and make it general. NGINX does it this way.
Ideally, we should send the chunked body to application process for the chunked request. Unfortunately, the application process only process content-length body. It's not a small change to support it. The imperfect way is to transfer the chunked body to content-length body before sending to application process, and it means we need to receive the entire body in this way.
The buffer type for the request body. For the size that is less than buffer size, Unit uses a memory buffer to save it. But if the size is larger than buffer size, Unit uses a file buffer instead. To note this point since this design has a lot to do with the challenge.

Is the current approach good enough for that, or do you think this approach is so flawed that you would block merging?

The patch I posted last week is for the demo asked Thursday, and it didn't solve the above challenges.

We have an internal opportunity with a constraint that requires landing functional (but not perfect) support for Transfer-Encoding: chunked in master this calendar month.

I'll try my best to provide a patch that can work with application processes, but I'm afraid it's not ready to be reviewed or merged into upstream. It needs to take more development time.

May 21 '24 22:05 hongzhidao

Hi @hongzhidao good job!

I see this is still marked as a draft, but I'll do a quick once over your latest patch while I notice some minor things...

One small other nit, perhaps you could reword the subject to something like

Support chunked request bodies

May 29 '24 15:05 ac000

One small other nit, perhaps you could reword the subject to something like Support chunked request bodies

Good suggestion, let me know if body needs to be replaced with bodies.

I see this is still marked as a draft

Yep, it's for the demo.

We need to land something ASAP, and should act as though the deadline to have a mergeable / testable branch is the end of this week.

Internally, I transferred the chunked request to content-length request since the application process can't handle chunked request now.

The design should be rewritten to make more sense. But it's not clear yet. Will do it after the demo.

May 29 '24 16:05 hongzhidao

One small other nit, perhaps you could reword the subject to something like Support chunked request bodies

Good suggestion, let me know if body needs to be replaced with bodies.

Yes, I think the plural bodes makes more sense here...

I see this is still marked as a draft

Yep, it's for the demo.

We need to land something ASAP, and should act as though the deadline to have a mergeable / testable branch is the end of this week.

Internally, I transferred the chunked request to content-length request since the application process can't handle chunked request now.

The design should be rewritten to make more sense. But it's not clear yet. Will do it after the demo.

Sure...

May 29 '24 16:05 ac000

Hi,

Ideally, we should send the chunked body to application process for the chunked request. Unfortunately, the application process only process content-length body. It's not a small change to support it. The imperfect way is to transfer the chunked body to content-length body before sending to application process, and it means we need to receive the entire body in this way.

I'm trying to get the application process to parse chunked request body.

Jun 03 '24 14:06 hongzhidao

@hongzhidao For the first release of this, let's ignore sending chunked data to application processes. It is acceptable to buffer and transform from chunked bodies to content-length for now.

Jun 04 '24 14:06 callahad

Hi all, welcome to review, but the max_body_size check for the chunked body is not supported yet. I'd suggest checking with @andrey-zelenkov's tests to understand how it is used. https://github.com/nginx/unit/pull/1307

Jun 11 '24 10:06 hongzhidao

Ignore the Fedora Rawhide failures, Rawhide seems broken at the moment...

Jun 11 '24 15:06 ac000

Hi @hongzhidao

You can add my

Reviewed-by: Andrew Clayton <[email protected]>

to

http: Move chunked buffer pos pointer while parsing

Jun 11 '24 15:06 ac000

Hi @hongzhidao

You can add my

Reviewed-by: Andrew Clayton <[email protected]>

to

http: Move chunked buffer pos pointer while parsing

Added, and I wonder if this means the review on the commit has been done when the Reviewed-by tag is asked to be added? If yes, I'll add the same tag on the other commits after they are reviewed.

Jun 12 '24 09:06 hongzhidao

Why remove the link to the chunked issue?\

EDIT: Hmm, I guess because this won't be enabled by default...

Jun 12 '24 17:06 ac000

Hi @hongzhidao You can add my
Reviewed-by: Andrew Clayton <[email protected]>
to
http: Move chunked buffer pos pointer while parsing
Added, and I wonder if this means the review on the commit has been done when the Reviewed-by tag is asked to be added? If yes, I'll add the same tag on the other commits after they are reviewed.

It means I have reviewed that particular patch. I'll notify you the same for the others...

Jun 12 '24 17:06 ac000

Why remove the link to the chunked issue?\

EDIT: Hmm, I guess because this won't be enabled by default...

I have no idea why it's removed, but I switched the PR status from draft to open.

Jun 13 '24 00:06 hongzhidao

OK @hongzhidao good stuff!

I think you can add my

Reviewed-by: Andrew Clayton <[email protected]>

to that last commit now...

Jun 19 '24 15:06 ac000

unit unit copied to clipboard

HTTP: request chunked body support.

unit
unit copied to clipboard