Use Balsa HTTP/1 codec
The current HTTP/1 codec is http-parser. The plan is to transition Envoy to use Balsa instead. [edit: fix link]
cc @envoyproxy/envoy-maintainers in case this wasn't on everyone's radar already
@bencebeky hi, could you compare Balsa with http-parser to show the advantage of transition?
Yeah, is there any reason to make this transition?
This is related to #10988. The current http_parser is no longer maintained and adding features such as #18819 is non-trivial.
Ah, I should have cross-linked when I was triaging. The decision points largely documented at https://docs.google.com/document/d/1Z8neuR6GTL61EU8jC88SD2as6Nmq15B-BNWC-SBVKns/edit after a couple of rounds of discussion at the community meeting (notes here https://docs.google.com/document/d/1Yc4zkV-A_cC_R3C0u6D15WQAsRSxADs8p2l8UMMa4P4/edit# as it's not convenient for all time zones)
FYI, the balsa link is broken. They just moved it to: https://github.com/google/quiche/tree/main/quiche/balsa.
FYI, the balsa link is broken. They just moved it to: https://github.com/google/quiche/tree/main/quiche/balsa.
Thank you. I edited to OP for convenience.
This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or "no stalebot" or other activity occurs. Thank you for your contributions.
no stalebot
Note that BalsaParser is being developed to provide feature (and bug) parity with http-parser to the highest reasonable extent. This includes identical error messages, some of which have "HPE_" prefix (for "http-parser error"), and some of which are hardcoded in an awkward manner, see #22447 for an example. This is to minimize friction during the transition. After the transision is complete, these error messages, along with other peculiarities, can be eliminated by behavior-changing PR, using runtime flag protection as appropriate.
Ah, I should have cross-linked when I was triaging. The decision points largely documented at https://docs.google.com/document/d/1Z8neuR6GTL61EU8jC88SD2as6Nmq15B-BNWC-SBVKns/edit
@alyssawilk This doc is not public or doesn't exist
@bencebeky was this your doc?
or @yanavlasov codec analysis? I can't recall :-/
We're keen to do some verification on balsa however black box testing is risky for us. Some document is helpful for us to see the impact/possible regression
@bencebeky can you share the internal docs of known differences (i.e. lack thereof?) I think it largely came down to the one we just fixed, and slight difference in max header byte accounting.
the tl;dr for migration was http_parser is abandonware, getting fixes to security issues has proven quite difficult, getting enhancements largely implausible, so the Envoy maintainers agreed migration was desirable. For why balsa, largely because it's been deployed at scale in Google for 15 years, and because Google was willing to do the work to migrate off a deprecated parser if it was one trusted at scale, and knew wouldn't be abandoned, and no one else offered up cycles to do the port :-)
We've been running balsa internally for a month in some internal deployments for about a month now with no issue.
Happy to answer any other questions you might have on github/slack/vc :-)
Hi. Here's the public fork of the document discussing behavioral differences: https://docs.google.com/document/d/1FSpeF7QlmkQ4JhfpKWzEL0UKM3OLTTnd1VJJtnFlq64.
We have also done load testing manually, and found no significant difference in memory or CPU usage. Also did some testing with synthetic data to find differences in behavior, but found none. Fuzz testing found the extended ASCII issue, which is being fixed.
Thanks. Those help a lot and we'll move on some internal verification :) Would post if anything is concerned
awesome. please update either way, as once you folks have tested as well we'll flip it true by default :-)
Hi @bencebeky. I've recently bumped into the lack of custom methods support issue #18819 in one of our prod environments. Interestingly http-parser does support some custom methods but they are all hardcoded.
It looks like the solution to custom methods is to migrate to this Balsa library. Could you please clarify where is this work, and when is it expected to land? Is there possibly a way to use Balsa even today (maybe with some limitations/corner cases)? Thanks!
Iirc, balsa is enabled by default after envoy 1.28.
Hi, as noted above, Balsa is default enabled. However, this is only the first step towards allowing arbitrary methods. BalsaParser has the list of allowed methods copied over verbatim from http-parser, see https://github.com/envoyproxy/envoy/blob/2970ddbd4ade787dd51dfbe605ae2e8c5d8ffcf7/source/common/http/http1/balsa_parser.cc#L54-L59. The two parsers behave identically to minimize friction due to migrating from one parser to the other.
There's a related project called UHV ("universal header validator" or "unified header validation") that moves validation logic, including method validation, from the individual HTTP/1, HTTP/2, HTTP/3 codec implementations to a centralized place. It will provide a configuration setting for enabling custom methods.
Thanks @bencebeky!
Would envoy community accept addition of new methods to Balsa? The method we need right now is the BITS_POST which is used by BITS Upload Protocol. It acts exactly like POST, just has different name.
The UHV work - is it tracked via https://github.com/envoyproxy/envoy/issues/10646?
Yep, that's the correct issue for UHV. You should be able to test with UHV on using envoy.reloadable_features.enable_universal_header_validator though there may be some minor differences not yet runtime guarded. I believe once you're using balsa and uhv you can follow thorugh on the TODO to complete defer method validation ot UHV and extend UHV to allow custom methods. Happy to do the review along with @yanavlasov