cheroot icon indicating copy to clipboard operation
cheroot copied to clipboard

WIP: Defer HTTP parsing to Hyper (h11 project)

Open ian-otto opened this issue 4 years ago β€’ 43 comments

❓ What kind of change does this PR introduce?

  • [ ] 🐞 bug fix
  • [X] 🐣 feature
  • [ ] πŸ“‹ docs update
  • [ ] πŸ“‹ tests/coverage improvement
  • [X] πŸ“‹ refactoring
  • [ ] πŸ’₯ other

πŸ“‹ What is the related issue number (starting with #)

Resolves #201

❓ What is the current behavior? (You can also link to an open issue here)

Currently, the Cheroot project has its own HTTP parser and logic.

❓ What is the new behavior (if this is a feature change)?

This PR will defer the primary HTTP parser to the h11 project's implementation.

πŸ“‹ Other information:

Depends on: https://github.com/python-hyper/h11/pull/99

πŸ“‹ Checklist:

  • [ ] I think the code is well written
  • [ ] I wrote good commit messages
  • [ ] I have squashed related commits together after the changes have been approved
  • [ ] Unit tests for the changes exist
  • [ ] Integration tests for the changes exist (if applicable)
  • [ ] I used the same coding conventions as the rest of the project
  • [ ] The new code doesn't generate linter offenses
  • [ ] Documentation reflects the changes
  • [ ] The PR relates to only one subject with a clear title and description in grammatically correct, complete sentences

This change is Reviewable

ian-otto avatar Jan 22 '20 00:01 ian-otto

Something of note, is I'm not sure whether I should anticipate supporting Python 2.7 specifically in this PR as it appears there is a discussion of its deprecation (#252)

ian-otto avatar Jan 22 '20 00:01 ian-otto

This pull request introduces 1 alert when merging 87bc9304450d159021deee4d625f5f5114a470ab into ee55bb12b24b8ea8f3f1fda39faf92370aab277e - view on LGTM.com

new alerts:

  • 1 for Unused local variable

lgtm-com[bot] avatar Jan 22 '20 00:01 lgtm-com[bot]

I'm not sure whether I should anticipate supporting Python 2.7 specifically in this PR as it appears there is a discussion of its deprecation

@ian-otto I wanted to deprecate Python 2.7 after the HTTP parser replacement so that the last py2-compatible release would get the changes and then we'd start dropping support for this. Will ~probably~ merge #243 too before that. And maybe even #256 if we agree on it. That said, this PR should also be tested against CherryPy 17 which is the last major version to support Python 2.

@njsmith thanks a lot for your comments here!

webknjaz avatar Jan 22 '20 22:01 webknjaz

So I've been peeking over the code reviews and I agree with most, if not all of them. I do have a few questions in regards to a few of them however.

There are a few instances where I'm attempting to take things from h11 (a lot of byte-strings), and shim them into the format that the old HTTPRequest class used, which increases the complexity and I believe a few of the errors I've made throughout.

@webknjaz Would you like me to attempt to make this change backwards compatible with the existing class, or throw that out in favor of increasing interoperability with h11? Note that some of the expectations (such as sending an HTTP 1.0 request and expecting an HTTP 1.0 response) are already going to be violated.

ian-otto avatar Jan 22 '20 23:01 ian-otto

@duper51 I think I'm okay with not keeping compat. But just make sure that it doesn't affect CherryPy (it shouldn't but still). I think it may break WSGI adapter on CherryPy's side so it's worth confirming. We may need to think about edge-cases that we need to test manually or implement in automatic tests when this PR is closer to being ready.

webknjaz avatar Jan 23 '20 00:01 webknjaz

FTR this document basically confirms Nathaniel's advice to always respond with HTTP/1.1: https://www.rfc-editor.org/rfc/rfc2145.html#section-2.3

webknjaz avatar Jan 26 '20 03:01 webknjaz

Codecov Report

Merging #262 into master will decrease coverage by 65%. The diff coverage is 10.63%.

@@             Coverage Diff             @@
##           master     #262       +/-   ##
===========================================
- Coverage   79.22%   14.22%   -65.01%     
===========================================
  Files          25       25               
  Lines        3981     3755      -226     
===========================================
- Hits         3154      534     -2620     
- Misses        827     3221     +2394

codecov[bot] avatar Jan 26 '20 09:01 codecov[bot]

@duper51 master got updated, time to rebase.

webknjaz avatar Feb 09 '20 20:02 webknjaz

@duper51 any help needed?

webknjaz avatar Mar 02 '20 11:03 webknjaz

Sorry about that, we're getting about to midterms. I have some changes that I am making, mostly focused on getting tests to work correctly, or disabling tests that no longer apply under the new parser. I was curious if you have any input on how I am going about integrating this or any thoughts on making it more concise/clear?

ian-otto avatar Mar 03 '20 19:03 ian-otto

It's alright. No new input so far except that maybe we need to contribute some things to h11 instead of trying to shim things on Cheroot side.

P.S. That "recheck" comment was to trigger the new Zuul CI integration I'm playing with on this PR.

webknjaz avatar Mar 03 '20 19:03 webknjaz

Looks like we need to add https://h11.readthedocs.io/ to intersphinx

webknjaz avatar Mar 03 '20 19:03 webknjaz

The HTTP/1.0 Connection: keep-alive pseudo-standard is currently not supported. (Note that this only affects h11 as a server, because h11 as a client always speaks HTTP/1.1.) Supporting this would be possible, but it's fragile and finicky and I'm suspicious that if we leave it out then no-one will notice or care. HTTP/1.1 is now almost old enough to vote in United States elections. I get that people sometimes write HTTP/1.0 clients because they don't want to deal with annoying stuff like chunked encoding, and I completely sympathize with that, but I'm guessing that you're not going to find too many people these days who care desperately about keep-alive and at the same time are too lazy to implement Transfer-Encoding: chunked. Still, this would be my bet as to the missing feature that people are most likely to eventually complain about...

As an aside, here is an excerpt from https://github.com/python-hyper/h11/blob/master/docs/source/supported-http.rst. It looks like the maintainer is not opposed to having the support as a feature, but isn't super interested in implementing it himself. We could potentially try and implement that on their end, but it would likely be a fairly large overhaul (supporting the HTTP/1.0 server side protocol would be required).

ian-otto avatar Mar 03 '20 21:03 ian-otto

Okay, let's not care about that then.

webknjaz avatar Mar 03 '20 22:03 webknjaz

The actual implementation of HTTP/1.0 keepalive would probably be pretty straightforward. The tricky part is that there's no spec, and the desired semantics are entirely based on folklore, so it's very hard to know whether you did it right or not.

njsmith avatar Mar 03 '20 22:03 njsmith

Were we to implement it, implement it similar to how Cheroot currently implements it and port the tests over. Would you be okay with something along those lines?

ian-otto avatar Mar 04 '20 00:03 ian-otto

@duper51 that sounds good to me. Given that CherryPy's been around for about 18 years and nobody complained, its implementation is as good as any...

webknjaz avatar Mar 04 '20 13:03 webknjaz

These are the latest commits that I have. Currently, I am working on some fixes for some conditions that arise in the tests, but overall it seems to be working barring edge cases.

There is still no CONNECT support, but I will be looking into that shortly after completing the test_conn tests.

ian-otto avatar Mar 19 '20 11:03 ian-otto

This pull request introduces 4 alerts when merging 5a5b1a2866dece4fcbb8b37944fa15465726b4a1 into 3d3ac5c93254857458982c44f5efb72dab180177 - view on LGTM.com

new alerts:

  • 3 for Wrong number of arguments in a class instantiation
  • 1 for Wrong number of arguments in a call

lgtm-com[bot] avatar Mar 19 '20 11:03 lgtm-com[bot]

This pull request introduces 2 alerts when merging 2be4b6de7adbdfba797673fc62529cf188de9488 into 96a6a1c6c339d7288445ab6081a57cd37220778f - view on LGTM.com

new alerts:

  • 2 for Wrong number of arguments in a class instantiation

lgtm-com[bot] avatar Mar 19 '20 23:03 lgtm-com[bot]

This pull request introduces 4 alerts when merging 79a0ee5ea38a45ab010ec7a4d088062f6f6f287a into 1959f3d9b310993f50935322b6e90cfe7a77e022 - view on LGTM.com

new alerts:

  • 2 for Unused local variable
  • 2 for Wrong number of arguments in a class instantiation

lgtm-com[bot] avatar Mar 20 '20 04:03 lgtm-com[bot]

This pull request introduces 4 alerts when merging 3aa94069ec486b1ebe6f01749875ccc17ceea1a5 into 1959f3d9b310993f50935322b6e90cfe7a77e022 - view on LGTM.com

new alerts:

  • 2 for Unused local variable
  • 2 for Wrong number of arguments in a class instantiation

lgtm-com[bot] avatar Mar 20 '20 21:03 lgtm-com[bot]

This pull request introduces 4 alerts when merging 5f3edf304ea16b2c748d152acdf318cbfc58b25d into 1959f3d9b310993f50935322b6e90cfe7a77e022 - view on LGTM.com

new alerts:

  • 2 for Unused local variable
  • 2 for Wrong number of arguments in a class instantiation

lgtm-com[bot] avatar Mar 21 '20 00:03 lgtm-com[bot]

This pull request introduces 4 alerts when merging bc1a0f44a2b320f751e0d6395d482d7df4f2d2ea into 1959f3d9b310993f50935322b6e90cfe7a77e022 - view on LGTM.com

new alerts:

  • 2 for Unused local variable
  • 2 for Wrong number of arguments in a class instantiation

lgtm-com[bot] avatar Mar 21 '20 00:03 lgtm-com[bot]

This pull request introduces 1 alert and fixes 1 when merging 4f14e471ef0990b6eeef505002a625ea18ae95cd into 1959f3d9b310993f50935322b6e90cfe7a77e022 - view on LGTM.com

new alerts:

  • 1 for Unused local variable

fixed alerts:

  • 1 for Unused local variable

lgtm-com[bot] avatar Mar 21 '20 02:03 lgtm-com[bot]

This pull request introduces 1 alert and fixes 1 when merging e50fe201dba03f1ffddcd887f6baf7c42597f07c into 1959f3d9b310993f50935322b6e90cfe7a77e022 - view on LGTM.com

new alerts:

  • 1 for Unused local variable

fixed alerts:

  • 1 for Unused local variable

lgtm-com[bot] avatar Mar 21 '20 03:03 lgtm-com[bot]

So I noticed while performing a code review on this, the GitHub diff viewer appears to be very confused when displaying the diff between the two files. That's gonna make things somewhat more difficult to view what's changed. Is there any way to resolve that somehow?

Also, on my local machine, this appears to pass all tests, except two.

There are two tests that I'm a little worried about. One is the tls_client_auth test, which seems to have erratic behavior, but as far as I can tell, the stack traces produced don't seem to possess any of the cheroot server code.

The other is HTTPS over HTTP. I believe this fails for a similar reason to the bad newline test. I'm not super sure on how to resolve this one.

ian-otto avatar Mar 24 '20 03:03 ian-otto

@duper51

So I noticed while performing a code review on this, the GitHub diff viewer appears to be very confused when displaying the diff between the two files.

Do you mean between two different files? Like when you move code chunks? There's no way of fixing this. Local diff may be a bit helpful but usually, I can live with it being like this.

Also, the diff will be smaller once you revert all those unnecessary style changes.

appears to pass all tests, except two.

Also, linters are failing (https://github.com/cherrypy/cheroot/pull/262/checks?check_run_id=523525295#step:9:31). And Python 2.7 env is failing because you import ABC (https://github.com/cherrypy/cheroot/pull/262/checks?check_run_id=523525384#step:14:59).

So let's fix all of these issues to proceed with a deeper review.

webknjaz avatar Apr 08 '20 10:04 webknjaz

This pull request introduces 1 alert and fixes 1 when merging f4ea7da6ea3757af146b85dad2be274afe775b97 into 3f468deacbe18d09451f4b01a412e9331e0382bc - view on LGTM.com

new alerts:

  • 1 for Unused local variable

fixed alerts:

  • 1 for Unused local variable

lgtm-com[bot] avatar Apr 08 '20 10:04 lgtm-com[bot]

@ian-otto there's conflicts with master in this branch. Mind rebasing?

webknjaz avatar May 02 '20 19:05 webknjaz