httparse icon indicating copy to clipboard operation
httparse copied to clipboard

Support custom Version Parsers when protocol name is not HTTP

Open tsnoam opened this issue 4 years ago • 7 comments

In order to parse HTTP based protocol (like RTSP) for example, it is required to know what is the name of the protocol.

  1. This PR attempts very hard to keep performance as close as possible to the original (benchmarks agree on that).
  2. In order to maintain the current semantics and optimizations of the code, I've made an assumption that HTTP based protocol names are 4 characters (bytes) long. It might not be a correct assumption, though I believe it can be the beginning of a discussion.

tsnoam avatar Feb 19 '20 11:02 tsnoam

Sorry, I'm ignorant of RTSP. So, it's a protocol that looks just like HTTP/1, just with a different name? Are the multiple versions?

I have a strong feeling like this is out of scope for the library, but I'm slightly intrigued at how simple the change looks...

seanmonstar avatar Feb 19 '20 18:02 seanmonstar

Hi @seanmonstar , Sorry for the late reply, somehow I missed your comment.

RTSP looks exactly like HTTP, with the small difference that "RTSP" is written instead of "HTTP". https://en.wikipedia.org/wiki/Real_Time_Streaming_Protocol The version is always "1.0".

I can see why you tend to think that this is out of scope for this library. However, there are several protocols out there which looks just like HTTP (recently i've become acquainted with SIP). Parsing these protocols using this library seems (at least to me) as the obvious thing (instead of reinventing the wheel / forking / etc.).

I'm trying to think of a way to modify the library in a manner which will be more robust into different protocol names (this practically the only difference between the protocols). How would you feel about changing the library so instead of passing a concrete data type (HttpVersionParser in the current version of the PR) we'd pass a reference to a trait implementation (&dyn HttpVersionParser) allowing users to really customize the version parsing?

tsnoam avatar Mar 01 '20 16:03 tsnoam

I'd love for something like this to be possible. Currently I'm using rtsparse which is basically just a fork with HTTP replaced with RTSP everywhere.

What would be awesome long term is setting this option when using hyper, or even warp, to be able to easily build servers for HTTP-esque protocols. I totally understand you being hesitant because this could lead to major bloat, but just being able to change the version would already be enough for a lot of extra use-cases with very little effort.

ThinkChaos avatar Jun 10 '20 10:06 ThinkChaos

@tsnoam how much dynamic dispatch will decrease performance?

prostomarkeloff avatar Jun 10 '20 12:06 prostomarkeloff

@prostomarkeloff AFAIK benches only work with nightly compiler. This is the version I used:

» rustc +nightly --version
rustc 1.42.0-nightly (13db6501c 2020-02-01)

v1.3.4 tag:

» git rev-parse --short HEAD
6f696f5

» cargo +nightly bench
<snip>
test bench_httparse       ... bench:         207 ns/iter (+/- 13) = 3478 MB/s
test bench_httparse_short ... bench:          40 ns/iter (+/- 1) = 1700 MB/s
test bench_pico           ... bench:         341 ns/iter (+/- 30) = 2111 MB/s
test bench_pico_short     ... bench:          40 ns/iter (+/- 6) = 1700 MB/s
<snip>

tsnoam:master branch:

» git rev-parse --short HEAD
62ab592

» cargo +nightly bench
<snip>
test bench_httparse       ... bench:         206 ns/iter (+/- 22) = 3495 MB/s
test bench_httparse_short ... bench:          41 ns/iter (+/- 4) = 1658 MB/s
test bench_pico           ... bench:         336 ns/iter (+/- 23) = 2142 MB/s
test bench_pico_short     ... bench:          39 ns/iter (+/- 4) = 1743 MB/s
<snip>

tsnoam avatar Jun 11 '20 06:06 tsnoam

the benchmarks were executed on my laptop: Intel(R) Core(TM) i7-10510U CPU @ 1.80GHz (a single socket, quad core, hyperthreaded cpu)

I tried to make the environment as "quiet" as possible, but you can never know if and what influenced the benches...

tsnoam avatar Jun 11 '20 06:06 tsnoam

There is now a ParserConfig in this crate, you may want to revisit your PR to implement that tweak as an additional setting in ParserConfig.

nox avatar Jan 13 '22 12:01 nox