envoy icon indicating copy to clipboard operation
envoy copied to clipboard

Use Balsa HTTP/1 codec

Open bencebeky opened this issue 3 years ago • 10 comments

The current HTTP/1 codec is http-parser. The plan is to transition Envoy to use Balsa instead. [edit: fix link]

bencebeky avatar May 11 '22 17:05 bencebeky

cc @envoyproxy/envoy-maintainers in case this wasn't on everyone's radar already

alyssawilk avatar May 11 '22 17:05 alyssawilk

@bencebeky hi, could you compare Balsa with http-parser to show the advantage of transition?

daixiang0 avatar May 12 '22 00:05 daixiang0

Yeah, is there any reason to make this transition?

wbpcode avatar May 12 '22 01:05 wbpcode

This is related to #10988. The current http_parser is no longer maintained and adding features such as #18819 is non-trivial.

adisuissa avatar May 12 '22 14:05 adisuissa

Ah, I should have cross-linked when I was triaging. The decision points largely documented at https://docs.google.com/document/d/1Z8neuR6GTL61EU8jC88SD2as6Nmq15B-BNWC-SBVKns/edit after a couple of rounds of discussion at the community meeting (notes here https://docs.google.com/document/d/1Yc4zkV-A_cC_R3C0u6D15WQAsRSxADs8p2l8UMMa4P4/edit# as it's not convenient for all time zones)

alyssawilk avatar May 12 '22 14:05 alyssawilk

FYI, the balsa link is broken. They just moved it to: https://github.com/google/quiche/tree/main/quiche/balsa.

kanongil avatar May 12 '22 14:05 kanongil

FYI, the balsa link is broken. They just moved it to: https://github.com/google/quiche/tree/main/quiche/balsa.

Thank you. I edited to OP for convenience.

bencebeky avatar May 12 '22 14:05 bencebeky

This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or "no stalebot" or other activity occurs. Thank you for your contributions.

github-actions[bot] avatar Jun 11 '22 16:06 github-actions[bot]

no stalebot

bencebeky avatar Jun 14 '22 16:06 bencebeky

Note that BalsaParser is being developed to provide feature (and bug) parity with http-parser to the highest reasonable extent. This includes identical error messages, some of which have "HPE_" prefix (for "http-parser error"), and some of which are hardcoded in an awkward manner, see #22447 for an example. This is to minimize friction during the transition. After the transision is complete, these error messages, along with other peculiarities, can be eliminated by behavior-changing PR, using runtime flag protection as appropriate.

bencebeky avatar Aug 09 '22 14:08 bencebeky

Ah, I should have cross-linked when I was triaging. The decision points largely documented at https://docs.google.com/document/d/1Z8neuR6GTL61EU8jC88SD2as6Nmq15B-BNWC-SBVKns/edit

@alyssawilk This doc is not public or doesn't exist

JuniorHsu avatar May 24 '23 16:05 JuniorHsu

@bencebeky was this your doc?

alyssawilk avatar May 25 '23 16:05 alyssawilk

or @yanavlasov codec analysis? I can't recall :-/

alyssawilk avatar May 25 '23 16:05 alyssawilk

We're keen to do some verification on balsa however black box testing is risky for us. Some document is helpful for us to see the impact/possible regression

JuniorHsu avatar Jun 03 '23 18:06 JuniorHsu

@bencebeky can you share the internal docs of known differences (i.e. lack thereof?) I think it largely came down to the one we just fixed, and slight difference in max header byte accounting.

the tl;dr for migration was http_parser is abandonware, getting fixes to security issues has proven quite difficult, getting enhancements largely implausible, so the Envoy maintainers agreed migration was desirable. For why balsa, largely because it's been deployed at scale in Google for 15 years, and because Google was willing to do the work to migrate off a deprecated parser if it was one trusted at scale, and knew wouldn't be abandoned, and no one else offered up cycles to do the port :-)

We've been running balsa internally for a month in some internal deployments for about a month now with no issue.

Happy to answer any other questions you might have on github/slack/vc :-)

alyssawilk avatar Jun 05 '23 12:06 alyssawilk

Hi. Here's the public fork of the document discussing behavioral differences: https://docs.google.com/document/d/1FSpeF7QlmkQ4JhfpKWzEL0UKM3OLTTnd1VJJtnFlq64.

We have also done load testing manually, and found no significant difference in memory or CPU usage. Also did some testing with synthetic data to find differences in behavior, but found none. Fuzz testing found the extended ASCII issue, which is being fixed.

bencebeky avatar Jun 05 '23 20:06 bencebeky

Thanks. Those help a lot and we'll move on some internal verification :) Would post if anything is concerned

JuniorHsu avatar Jun 05 '23 20:06 JuniorHsu

awesome. please update either way, as once you folks have tested as well we'll flip it true by default :-)

alyssawilk avatar Jun 05 '23 20:06 alyssawilk

Hi @bencebeky. I've recently bumped into the lack of custom methods support issue #18819 in one of our prod environments. Interestingly http-parser does support some custom methods but they are all hardcoded.

It looks like the solution to custom methods is to migrate to this Balsa library. Could you please clarify where is this work, and when is it expected to land? Is there possibly a way to use Balsa even today (maybe with some limitations/corner cases)? Thanks!

andreycpp avatar Nov 05 '23 18:11 andreycpp

Iirc, balsa is enabled by default after envoy 1.28.

wbpcode avatar Nov 06 '23 14:11 wbpcode

Hi, as noted above, Balsa is default enabled. However, this is only the first step towards allowing arbitrary methods. BalsaParser has the list of allowed methods copied over verbatim from http-parser, see https://github.com/envoyproxy/envoy/blob/2970ddbd4ade787dd51dfbe605ae2e8c5d8ffcf7/source/common/http/http1/balsa_parser.cc#L54-L59. The two parsers behave identically to minimize friction due to migrating from one parser to the other.

There's a related project called UHV ("universal header validator" or "unified header validation") that moves validation logic, including method validation, from the individual HTTP/1, HTTP/2, HTTP/3 codec implementations to a centralized place. It will provide a configuration setting for enabling custom methods.

bencebeky avatar Nov 06 '23 20:11 bencebeky

Thanks @bencebeky!

Would envoy community accept addition of new methods to Balsa? The method we need right now is the BITS_POST which is used by BITS Upload Protocol. It acts exactly like POST, just has different name.

The UHV work - is it tracked via https://github.com/envoyproxy/envoy/issues/10646?

andreycpp avatar Nov 08 '23 16:11 andreycpp

Yep, that's the correct issue for UHV. You should be able to test with UHV on using envoy.reloadable_features.enable_universal_header_validator though there may be some minor differences not yet runtime guarded. I believe once you're using balsa and uhv you can follow thorugh on the TODO to complete defer method validation ot UHV and extend UHV to allow custom methods. Happy to do the review along with @yanavlasov

alyssawilk avatar Nov 13 '23 18:11 alyssawilk