nghttp2 icon indicating copy to clipboard operation
nghttp2 copied to clipboard

nghttp2_hd_inflate_hd_nv() refactoring

Open yzhao1012 opened this issue 6 years ago • 5 comments

Hi All,

We'd see how people like the idea of refactoring nghttp2_hd_inflate_hd_nv() such that:

  1. There is a API that just decode compressed header fields. And also perform huffman decoding. But do not perform dynamic table look up. Static table is fine.
  2. Another API then takes the results to perform dynamic table look up.

For #1, the default behavior is to terminate early, if the decoded index is not within the bound of the dynamic table. But we do need to add the option to continue process.

For #2, the behavior should be changed to also tolerate certain error, for example, if the dynamic table space is exhausted, the function should have the option to proceed to process other fields. With an option in the input argument, most likely a field inside nghttp2_inflater, to indicate such directive.

The motivation is that, we use nghttp2 for parsing http2 traffic sniffed from network. Being a sniffer, data loss is common, so the dynamic table is no longer accurate. We want to preserve the header indexes for further processing (which I cannot disclose).

ATM, we wrote our own header field decoder, using nghttp2_hd_decode_length(). That is gradually becoming problematic. As we are looking to use huffman decoding code as well, which is not straightforward for integration because its use of internal data structure nghttp2_bufs, which we want to avoid.

I took a look at the code. The changes needed seem to be:

  • Augment nghttp2_nv to record index.
  • Move out the part the manipulate dynamic table out of nghttp2_hd_inflate_hd_nv() into a new API for #2 above.

I think this refactoring should improve the structure of nghttp2 code. That makes it a win-win situation.

WDYT? Comments on the technical feasibility is also welcome.

yzhao1012 avatar Sep 05 '19 23:09 yzhao1012

nghttp2_wireshark_patch.txt

Hi, I'm doing the same for Wireshark but I modified/ will modify nghttp2_hd_inflate_hd_nv() to take a dissection argument 1/0 (TRUE/FALSE) and act accordingly. I have working code. Would this be accepted if offered as a pull request?

AndersBroman avatar May 13 '20 12:05 AndersBroman

Hi, for Wireshark we have similar requirements to continue parsing even if the dynamic table index is not found. At the moment, the requirement is being able to support captures where the begin of the TCP stream is missing, but not necessarily the middle. This means that we can assume that any existing item in the dynamic table is definitely valid, but there may be indices with a missing entry.

In order to support this scenario and not fail on missing entries, we can fill the dynamic table with dummy entries prior to actual processing of the HEADERS. A disadvantage of that approach is a fixed minimum memory cost for every connection, and potentially being incomplete when the dynamic table size has increased. However, in general this could potentially work.

As for the suggestions from @yzhao1012:

Augment nghttp2_nv to record index.

The structure cannot be extended with a new field as that breaks ABI. A potential solution is to introduce a new nghttp2_nv_flag and overload one of the other fields to record the index.

There is a API that just decode compressed header fields. And also perform huffman decoding. But do not perform dynamic table look up. Static table is fine.

What kind of error do you want to tolerate? Omission of just the begin of a trace, or also arbitrary Header Block Fragments in the middle of streams? In the latter case, the contents of dynamic table will be unreliable. In the former case, lookup into the dynamic table is possible. If you're lucky, the entry exists. Otherwise it currently fails, but the API could be modified to return a special nghttp2_nv instead with the index.

Another API then takes the results to perform dynamic table look up.

That API already exists, it is nghttp2_hd_inflate_get_table_entry.


Unfortunately even with the above changes, any decoding error will result in a sticky bad error flag that will prevent the decoder from being used: https://github.com/nghttp2/nghttp2/blob/ef41583614e95efd12b6cce821e34837c1b28ed0/lib/nghttp2_hd.c#L1478-L1482 Decoding errors include:

  • Memory allocation failures.
  • Implementation errors that cause protocol violations such as:
    • Missing a Dynamic Table Size Update when expected.
    • Dynamic Table Size Update not at the start of a headers block.
    • Name length 0.
    • Huffman decompression errors.
  • A Dynamic Table Size Update with a value larger than SETTINGS_HEADER_TABLE_SIZE. This could happen when the capture is missing the SETTINGS frame that increased the table size, or due to protocol violations.
  • A dynamic table index that points to a non-existing entry. Either because previous dynamic table updates were missing in the capture, or because a previous SETTINGS frame is missing that increased the size of the dynamic table.
  • End of a headers block was signalled, but more data was available, or vice versa. This could occur if a HEADERS frame required a CONTINUATION frame that went missing.

If we can assume no implementation errors, no corruptions in the capture, no memory allocation failures, then we only have the following concerns:

  • HEADERS + CONTINUATION was sent, but CONTINUATION was missing. This is not a problem as long as there are no following other successive HEADERS/PUSH_PROMISE frame. Otherwise a new decoder has to be created.
  • HEADERS + CONTINUATION was sent, but HEADERS was missing. The dissector could skip such CONTINUATION frames if they are seen without a prior HEADERS frame.
  • Missing SETTINGS frame that increased the table size. A new option could be added to always increase the table size, disregarding the SETTINGS frame, but that may lead to resource exhaustion problems. The dissector could initialize the table to a large value as workaround.
  • The dynamic table index does not exist. A new option could be added to emit a new nghttp2_nv type that includes the index.

The last concern can be "solved" by populating the dynamic table as suggested before. However if a new option can be added, it could emit a nghttp2_nv structure with:

  • Flag: include a new flag such as NGHTTP2_NV_FLAG_NAME_MISSING.
  • namelen: keep it at 0 just in case users expect that "name" would be a special value otherwise.
  • name: overload it to imply the 32-bit index (incidentally the type of a SETTINGS_HEADERS_TABLE_SIZE), cast to a pointer type.
  • valuelen, value: emit if available.
  • Potential modes: emit only if the dynamic table entry is missing (best-effort dissection) or emit it for all dynamic table entries (requested by @yzhao1012).

To allow arbitrary dynamic indices, the get_max_index call here should be replaced by NGHTTP2_HD_MAX_NV when the mode is activated (and of course check afterwards to detect whether a non-existing entry was referenced): https://github.com/nghttp2/nghttp2/blob/ef41583614e95efd12b6cce821e34837c1b28ed0/lib/nghttp2_hd.c#L1963-L1964

This new option could be activated with a new flag such as NGHTTP2_HD_INFLATE_DANGEROUS = 0x04.

@tatsuhiro-t What do you think of the new proposal to make headers available for unknown dynamic table entries? A potential advantage is the ability to report which index was actually not found, and the ability to work without relying on fully populating the dynamic table.

The workaround to populate with dummy entries can be found in this Wireshark patch: https://code.wireshark.org/review/37278

Lekensteyn avatar May 22 '20 03:05 Lekensteyn

@Lekensteyn Thank you for detailed analysis. There are several recovery modes I can think of, say, just use static table, do not touch dynamic table after index out of range detected, and only decode literal/huffman string. At the moment, I'm not sure it can be done without much affecting normal execution path. Meanwhile, the wireshark workaround patch is very interesting and to me it would just work under the described situation. Perhaps, wait and see how the patch works and then revisit here later if it does not work well?

tatsuhiro-t avatar May 22 '20 13:05 tatsuhiro-t

As suggested earlier, a special flag could be added to export the index via nghttp2_nv as opposed to bailing out. Without this flag, the normal execution path will be unaffected.

I've updated the Wireshark patch, it seems that it could address the problem in an acceptable way that does not require further modifications to nghttp2. If this does not work, I'll follow up here. :)

Some experiences from the implementation:

  • After repeatedly calling nghttp2_hd_inflate_hd2 to fill the table, nghttp2_hd_inflate_end_headers must be called. I forgot to do this at first, and it failed to decoder when a new header block started with a "dynamic table size update" instruction.
  • Previously I hard-coded 4096/32 as the number of entries to be added. However, less iterations are needed to populate the table, namely 4096 / (32 + name_length + value_length).
  • In the worst case, previous (lost) header fields were very small, e.g. X: 1. However more likely, headers will be somewhat longer, e.g. Content-Type: text/html. To maximize the probability of valid dynamic indexes, the name and value length should be kept as small as possible. Otherwise earlier entries are evicted. I've chosen for <header> as name and an empty value.

Lekensteyn avatar May 22 '20 22:05 Lekensteyn

Hi,

@Lekensteyn When I run nghttpd and nghttp with '-c 0' option (--header-table-size=0) not to create a dynamic table, Wireshark does not decode header table size update in HEADERS frame correctly. (Server) nghttpd -v -c 0 8080 --no-tls --no-tls (Client) nghttp -v -c 0 <URL>

The wireshark prints out the log message "Warn unexpected decompression state: -523 != 12". As you know, this message is related to your workaround: https://code.wireshark.org/review/37278

It seems that fix_partial_header_dissection_support() is not necessary if settings_header_table_size is 0. I think that it is necessary to check if max_dynamic_table_size of inflater's context is zero so that skips the workaround. Comments are welcome.

condorda avatar May 28 '21 09:05 condorda