redbot
redbot copied to clipboard
Detect multiple content-length headers
... including duplicates (warn)
Hey Mark,
I was thinking of taking this issue on, just to get to know RedBot code. However, I'm a bit confused as to what the desired change for this issue is. As far as I can see, duplicates for Content-Length are detected since the corresponding parse method is decorated with the SingleFieldValue decorator. Therefore, a message will be raised as defined by the decorator and the message itself.
So, in terms of detecting duplicates - this is already done. Or did you want to have a new message with a WARN level for Content-Length duplicates, rather than the generic BAD-level message?
Cheers!
Hi!
There are a couple of aspects to this.
First of all, Thor doesn't let responses with multiple Content-Length headers through; it considers that a hard error. There is an "inspecting" mode that will let it continue, but I haven't yet turned that on when RED uses Thor.
Then, like you mention, it would be good to have a specific message about this, rather than the generic one. That probably means removing the SingleFieldValue decorator and coding something more specific.
Finally, it all needs to be tested; I haven't yet turned on inspecting mode anywhere except when running Thor's unit tests. Just a smoke test with a server that sends two content-lengths would be nice, as a sanity check (I can do that if you don't want to).
Having more contributors would be great, please speak up if you need any help / guidance / etc.!
Thanks,
Oh, and there's a new "Security" category for messages; this would probably be appropriate for that.
Well, that clears up a few things, thanks! :) I'll have a better look at the code and report back here if I hit any walls. I also don't mind doing the sanity-check server stuff -- I just need to get the general feeling about redbot's internals.
Hi Mark -- sorry for not getting back sooner.
Anyway, I played a bit with thor and redbot. Here's where I'm at now: https://github.com/izuzak/redbot/commit/40b85b037275306ebf6cdb9d1958a0ab1ecee974
- I enabled inspecting mode with the following in redbot/fetch.py in the run method
- I changed the SingleFieldValue decorator to check for both duplicate headers with the same value and duplicate headers with different values
- I added two new test cases to content_length.py
- I made a simple dummy HTTP server using Thor which returns duplicate Content-Length headers (notice that the headers have different values)
- if I run redbot tests, all test go through fine
- if I fire up redbot with
redbot/bin$ redbot http://localhost:9546I get the following exception:
redbot http://localhost:9546
Traceback (most recent call last):
File "/usr/local/bin/redbot", line 99, in <module>
main()
File "/usr/local/bin/redbot", line 91, in main
thor.run()
File "/usr/local/lib/python2.7/dist-packages/thor/loop.py", line 113, in run
self._run_fd_events()
File "/usr/local/lib/python2.7/dist-packages/thor/loop.py", line 297, in _run_fd_events
self._fd_event(event, fileno)
File "/usr/local/lib/python2.7/dist-packages/thor/loop.py", line 164, in _fd_event
self._fd_targets[fd].emit(event)
File "/usr/local/lib/python2.7/dist-packages/thor/events.py", line 109, in emit
e(*args)
File "/usr/local/lib/python2.7/dist-packages/thor/tcp.py", line 163, in handle_read
self.emit('data', data)
File "/usr/local/lib/python2.7/dist-packages/thor/events.py", line 109, in emit
e(*args)
File "/usr/local/lib/python2.7/dist-packages/thor/http/common.py", line 166, in handle_input
rest = self._parse_headers(instr)
File "/usr/local/lib/python2.7/dist-packages/thor/http/common.py", line 378, in _parse_headers
= self._parse_fields(header_lines, True)
File "/usr/local/lib/python2.7/dist-packages/thor/http/common.py", line 340, in _parse_fields
self.input_error(error.DuplicateCLError())
File "/usr/local/lib/python2.7/dist-packages/thor/http/client.py", line 389, in input_error
self.emit('error', err)
File "/usr/local/lib/python2.7/dist-packages/thor/events.py", line 109, in emit
e(*args)
File "/usr/local/lib/python2.7/dist-packages/redbot/fetch.py", line 280, in _response_error
self.finish_task()
File "/usr/local/lib/python2.7/dist-packages/redbot/fetch.py", line 112, in finish_task
self.done_cb()
File "/usr/local/bin/redbot", line 88, in done
formatter.finish_output()
File "/usr/local/lib/python2.7/dist-packages/redbot/formatter/text.py", line 169, in finish_output
BaseTextFormatter.finish_output(self)
File "/usr/local/lib/python2.7/dist-packages/redbot/formatter/text.py", line 98, in finish_output
raise AssertionError, "Unidentified incomplete response error."
AssertionError: Unidentified incomplete response error.
At this point, it's obvious that the DuplicateCLError "bubbles" all the way to the formatter which doesn't know how to handle it so it just throws an exception.
If I change the test server to return duplicate headers with the same value (5), redbot doesnt crash and correctly reports
* Security:
* Multiple Content-Length headers with same values should be avoided in a response.
I guess that the desired behavior in both cases is that redbot doesn't crash :). In case of duplicate CL headers with different values -- redbot/thor should use the second CL length and continue using that, right? That would mean that the thor error needs to be handled at some point in redbot so that thor can continue normally (currently, it stops when it reports the input_error, right?). Do you prefer that the error be handled at some specific place in redbot? Perhaps in fetch.py in the _response_error method, so that response_error returns normally if the error is a DuplicateCL error and thus permits thor to continue executing (rather than calling self.done() and self.finish_task())?
Any other comments regarding other parts of my changes are welcome.
Cheers!
p.s. redbot and thor are great! :) appreciating them even more after digging through the implementation.
Cool!
My thinking was that redbot would ignore the Thor error (in fetch) and catch it (and others, of course) in headers/content_length -- make sense?
Hi Mark,
yeah, that sounds ok. Here's part two of the fix for multiple CL headers - https://github.com/izuzak/redbot/commit/7f808976979743c4ff9882fd8d814db3d70ab929). Basically, I just catch the relevant thor errors so that redbot doesn't crash. The description below assumes that https://github.com/mnot/thor/pull/21 and https://github.com/mnot/thor/pull/22 have been applied to thor.
Using this change and the first part which I linked before (https://github.com/izuzak/redbot/commit/40b85b037275306ebf6cdb9d1958a0ab1ecee974), this is how redbot behaves when multiple CL headers are present (easily tested using the testing server which was included in the previous changeset):
a) in case that the value of the last CL header is greater than len(body), redbot reports:
* General:
* (error level) This response's Content-Length header is incorrect.
* Security:
* (error level) Only one Content-Length header is allowed in a response.
b) in case that the value of the last CL header is equal to len(body), redbot reports:
* General:
* The Content-Length header is correct.
* Security:
* (error level) Only one Content-Length header is allowed in a response.
c) in case that the values of all CL headers are equal to len(body), redbot reports:
* General:
* The Content-Length header is correct.
* Security:
* (warn level) Multiple Content-Length headers with same values should be avoided in a response.
d) in case that the values of all CL headers are equal but larger than len(body), redbot reports:
* General:
* (error level) This response's Content-Length header is incorrect.
* Security:
* (warn level) Multiple Content-Length headers with same values should be avoided in a response.
e) in case that the values of all CL headers are equal but smaller than len(body), redbot reports:
* General:
* (error level) This response's Content-Length header is incorrect.
* Security:
* (warn level) Multiple Content-Length headers with same values should be avoided in a response.
f) in case that the value of the last CL headers is smaller than len(body), redbot reports:
* General:
* (error level) This response's Content-Length header is incorrect.
* Security:
* (error level) Only one Content-Length header is allowed in a response.
The cases in which there is a single CL header have the expected output.
If this change and the previous change (https://github.com/izuzak/redbot/commit/40b85b037275306ebf6cdb9d1958a0ab1ecee974) seem ok with you, I can try and submit a clean pull request with both changes (taking into account the changes that happened in redbot in the meantime).