Remove chunked encoding in http/1 responses
Chunked encoding is stripped from the response while it's being parsed, if the response is forwarded to h1 client, the encoding is applied back for correct message framing. Since HTTP/2 is first citizen, we don't worry about performance degradation for http1 clients, and this solution allows us to avoid any further worries about unneeded chunked encoding in other subsystems like cache.
Obsoletes #1410 Fixes #1379
Instead of stripping skb fragments during parsing, another approach is used:
a new TfwStr cut member was added into http parser object to track all the fragments to be removed from the message. It's generic enough to allow removal of any fragments from
both response and request.
Once parser has finished its job (on both TFW_POSTPONE TFW_PASS return codes), the
cut storage is stripped from the message. Since http message is processed skb-by-skb, the cut function will be called for every skb only once. This approach doesn't modify skb in parser internals, thus there is no need to pass modified skb parts to lower levels, which complicates code and makes it hardly tangled.
Since cut may (and will!) grow together with usual strings not supposed to be removed, a realloc-race is possible: both strings will grow together, which will block zero-copy realloc operations. To avoid that I've created a new TfwPool for parser needs. It's created for every new server connection. Once message is parsed and whole cut object was stripped, the memory required for cut object is released. This is not usual, since we normally never free objects from pools manually, but it allows to have only one pool for all messages in the single connections, and it shouldn't be exhausted.
Thus during parsing, the body from h1 responses is split into two parts: resp->body and parser->cut. First contains only data, the second - only chunk descriptors. Current realisation uses the same code for request and response parsing, and makes it effectively: usually MOVE_fixup macroses creates a lot of chunks with 1 or two characters, which is not suitable for large bodies. Instead chunks are combined together by their role, i.e. adjastment TfwStr chunks with chunk descriptors are glued together. According to my observations, it allows to have 3-6 times less TfwStr chunks to store single chunk descriptor.
Unfortunately the same approach can't be used to remove chunked token from the Transfer-Encoding header. Server is allowed to apply chunked as not final encoding, so chunked token must be stripped only if it's final encoding. We can't guarantee this during parsing, so the chunked token is marked with flag, and removed later manually.
UPD: See comment below
For stripping chunked parts at body parsing stage we flag chunks that should be cut, then cut chunked parts from body during HTTP1 to HTTP2 message transformation. For cached responses we always store into cache HTTP2 friendly version. During caching we not modify message, only copy data from body excluding chunks. Modification take place only during HTTP2 framing. For non-cachable responses to HTTP1 client simply forward message without any modification.
Besides body modification we also remove Transfer-Encoding header from response to HTTP2 client. It compliance with RFC 9113 8.2.2. During caching we also ignore Transfer-Encoding and not place into cache. All encodings from Transfer-Encoding move to Content-Encoding header, except chunked encoding.
If message doesn't have Transfer-Encoding: chunked and Content-Length when backend close connection, we try to determine size of the body and then add Content-Length header and forward it to client, however such messages are not cached, because we can't be ensured that message is fully received.
I get errors which are not in master:
- h2 request with 'Transfer-Encoding: chunked' (capitalized) header and chunked body. Header is not cut as expected.
- h2 request with 'Content-Length: 5 (capitalized) header. Tempesta does not see this header and add new one.
- h2 request with 'transfer-encoding: chunked' (lowercase) header. Tempesta failed with:
dmesg
[80145.842735] ------------[ cut here ]------------
[80145.844171] WARNING: CPU: 3 PID: 181027 at /root/tempesta/fw/http_stream.c:80 tfw_h2_stream_fsm+0x12d/0x2b0 [tempesta_fw]
[80145.845916] Modules linked in: tempesta_fw(OE) tempesta_db(OE) tempesta_tls(OE) tempesta_lib(OE) btrfs blake2b_generic xor raid6_pq ufs qnx4 hfsplus hfs minix ntfs msdos jfs xfs libcrc32c cpuid binfmt_misc nfnetlink_queue nfnetlink_log nfnetlink cfg80211 iptable_mangle xt_mark bpfilter sha256_ssse3 sha512_ssse3 rfcomm bnep vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport vsock nls_iso8859_1 intel_rapl_msr intel_rapl_common crct10dif_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd glue_helper snd_ens1371 snd_ac97_codec gameport ac97_bus snd_pcm snd_seq_midi btusb snd_seq_midi_event btrtl snd_rawmidi btbcm btintel snd_seq bluetooth snd_seq_device snd_timer vmw_balloon joydev input_leds snd vmw_vmci serio_raw ecdh_generic ecc soundcore mac_hid sch_fq_codel vmwgfx ttm drm_kms_helper cec rc_core fb_sys_fops syscopyarea sysfillrect sysimgblt ipmi_devintf ipmi_msghandler drm msr parport_pc ppdev lp parport ramoops pstore_blk reed_solomon efi_pstore pstore_zone
[80145.845994] ip_tables x_tables autofs4 hid_generic usbhid hid mptspi mptscsih mptbase psmouse crc32_pclmul e1000 ahci scsi_transport_spi libahci i2c_piix4 pata_acpi [last unloaded: tempesta_lib]
[80145.862854] CPU: 3 PID: 181027 Comm: python3 Kdump: loaded Tainted: G OE 5.10.35+ #1
[80145.864392] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
[80145.866416] RIP: 0010:tfw_h2_stream_fsm+0x12d/0x2b0 [tempesta_fw]
[80145.867418] Code: 89 a0 01 00 00 0f 0b 83 f8 06 74 6b 83 f8 07 75 f4 45 31 c0 41 80 fd 02 0f 84 56 ff ff ff 41 80 fd 09 0f 85 46 ff ff ff eb 8c <0f> 0b 41 b8 03 00 00 00 e9 3d ff ff ff 41 b8 03 00 00 00 e9 3c ff
[80145.870504] RSP: 0018:ffffa6cac05484c8 EFLAGS: 00010293
[80145.871352] RAX: 0000000000000002 RBX: ffff8962f71bba68 RCX: 0000000000000001
[80145.872478] RDX: 0000000000000001 RSI: 0000000000000001 RDI: ffff8962f71bba98
[80145.873803] RBP: ffffa6cac05484f8 R08: ffffa6cac0548514 R09: 0000000000000000
[80145.874969] R10: 0000000000000002 R11: 0000000000000000 R12: ffff8962f71bba98
[80145.876079] R13: 0000000000000001 R14: 0000000000000001 R15: 0000000000000001
[80145.877197] FS: 00007f3aa3ab3700(0000) GS:ffff8964f5ec0000(0000) knlGS:0000000000000000
[80145.878681] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[80145.879604] CR2: 00007f3aa3cd0b70 CR3: 0000000181cd0003 CR4: 0000000000370ee0
[80145.880785] Call Trace:
[80145.881190] <IRQ>
[80145.881590] tfw_h2_stream_id_close+0xe7/0x110 [tempesta_fw]
[80145.882767] tfw_h2_prep_resp+0x95/0x310 [tempesta_fw]
[80145.883618] ? alloc_pages_current+0x87/0xe0
[80145.884296] tfw_h2_send_resp+0x5c/0xb0 [tempesta_fw]
[80145.885095] tfw_h2_send_err_resp+0x9a/0xc0 [tempesta_fw]
[80145.886092] tfw_http_cli_error_resp_and_log+0x3d3/0x440 [tempesta_fw]
[80145.888196] ? tfw_http_parse_req+0xf1d0/0xf1d0 [tempesta_fw]
[80145.893988] tfw_http_req_process+0xf8/0x970 [tempesta_fw]
[80145.896972] ? alloc_pages_current+0x87/0xe0
[80145.897810] ? __get_free_pages+0x11/0x40
[80145.898931] ? tfw_pool_alloc_pages+0x47/0x50 [tempesta_fw]
[80145.900418] ? __tfw_pool_new+0x2d/0x70 [tempesta_fw]
[80145.901242] tfw_http_msg_process_generic+0x197/0x670 [tempesta_fw]
[80145.902372] ? ss_skb_chop_head_tail+0xbb/0x1d0 [tempesta_fw]
[80145.903356] ? memcpy_fast+0xe/0x10 [tempesta_lib]
[80145.904136] tfw_h2_frame_process+0x308/0x440 [tempesta_fw]
[80145.905051] tfw_http_msg_process+0x42/0x50 [tempesta_fw]
[80145.906162] tfw_connection_recv+0x59/0xa0 [tempesta_fw]
[80145.907147] tfw_tls_connection_recv+0x19d/0x380 [tempesta_fw]
[80145.908124] ss_tcp_process_data+0x1e7/0x3f0 [tempesta_fw]
[80145.909102] ss_tcp_data_ready+0x44/0xb0 [tempesta_fw]
[80145.910216] tcp_data_ready+0x2b/0xd0
[80145.910985] tcp_data_queue+0x483/0xd30
[80145.911693] tcp_rcv_established+0x230/0x670
[80145.912429] ? sk_filter_trim_cap+0xde/0x240
[80145.913220] tcp_v4_do_rcv+0x140/0x200
[80145.914005] tcp_v4_rcv+0xcfd/0xe10
[80145.915113] ? iptable_mangle_hook+0x36/0xf0 [iptable_mangle]
[80145.916227] ip_protocol_deliver_rcu+0x30/0x1b0
[80145.917191] ip_local_deliver_finish+0x48/0x60
[80145.918099] ip_local_deliver+0xfa/0x110
[80145.918787] ? ip_protocol_deliver_rcu+0x1b0/0x1b0
[80145.919544] ip_rcv_finish+0x87/0xa0
[80145.920114] ip_rcv+0xcc/0xe0
[80145.920605] ? ip_rcv_finish_core.isra.0+0x420/0x420
[80145.921421] __netif_receive_skb_one_core+0x88/0xa0
[80145.922317] __netif_receive_skb+0x18/0x60
[80145.923078] process_backlog+0xa9/0x160
[80145.923741] net_rx_action+0x13e/0x390
[80145.924383] __do_softirq+0xd9/0x291
[80145.924971] asm_call_irq_on_stack+0x12/0x20
[80145.925681] </IRQ>
[80145.926141] do_softirq_own_stack+0x3d/0x50
[80145.926848] do_softirq.part.0+0x46/0x50
[80145.927486] __local_bh_enable_ip+0x50/0x60
[80145.928197] ip_finish_output2+0x1ab/0x590
[80145.928846] ? __cgroup_bpf_run_filter_skb+0x3c3/0x3d0
[80145.929715] __ip_finish_output+0xd8/0x220
[80145.931006] ip_finish_output+0x2d/0xb0
[80145.931722] ip_output+0x7a/0x100
[80145.932263] ? __ip_finish_output+0x220/0x220
[80145.932970] ip_local_out+0x3d/0x50
[80145.933535] __ip_queue_xmit+0x17a/0x470
[80145.934321] ? get_page_from_freelist+0xed8/0x1100
[80145.935088] ip_queue_xmit+0x15/0x20
[80145.935643] __tcp_transmit_skb+0xa94/0xc90
[80145.936301] tcp_write_xmit+0x453/0x1350
[80145.936915] __tcp_push_pending_frames+0x37/0x100
[80145.937657] tcp_push+0xfc/0x100
[80145.938275] tcp_sendmsg_locked+0xd36/0xe70
[80145.938949] tcp_sendmsg+0x2d/0x50
[80145.939477] inet_sendmsg+0x43/0x70
[80145.940057] sock_sendmsg+0x5e/0x70
[80145.940599] sock_write_iter+0x93/0xf0
[80145.941195] new_sync_write+0x192/0x1b0
[80145.941787] vfs_write+0x1ca/0x280
[80145.942427] ksys_write+0xb1/0xe0
[80145.942976] __x64_sys_write+0x1a/0x20
[80145.943598] do_syscall_64+0x38/0x90
[80145.944165] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[80145.944964] RIP: 0033:0x7f3aa7b1e0af
[80145.945578] Code: 89 54 24 18 48 89 74 24 10 89 7c 24 08 e8 49 65 f8 ff 48 8b 54 24 18 48 8b 74 24 10 41 89 c0 8b 7c 24 08 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 2d 44 89 c7 48 89 44 24 08 e8 7c 65 f8 ff 48
[80145.948602] RSP: 002b:00007f3aa3ab0790 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
[80145.949911] RAX: ffffffffffffffda RBX: 0000000000000091 RCX: 00007f3aa7b1e0af
[80145.951052] RDX: 0000000000000091 RSI: 00007f3a9c004113 RDI: 0000000000000007
[80145.952164] RBP: 00007f3a9c004030 R08: 0000000000000000 R09: 0000000000000008
[80145.953261] R10: 00007f3a9c004115 R11: 0000000000000293 R12: 00007f3a9c004113
[80145.954484] R13: 0000000000000091 R14: 00007f3a9c001f80 R15: 0000000000000000
[80145.955749] ---[ end trace 1122a337d9c74038 ]---