Host header output incorrect when explicitly specifying the default port
Hi,
If you run something like http --print hH http://localhost:80/ the printed Host header includes :80:
GET / HTTP/1.1
Accept: */*
Accept-Encoding: gzip, deflate
Connection: keep-alive
Host: localhost:80
User-Agent: HTTPie/2.4.0
But if you look at it with strace (strace -f -v -s 256 -e sendto http --print hH http://localhost:80/) you get:
sendto(3, "GET / HTTP/1.1\r\nHost: localhost\r\nUser-Agent: HTTPie/2.4.0\r\nAccept-Encoding: gzip, deflate\r\nAccept: */*\r\nConnection: keep-alive\r\n\r\n", 130, 0, NULL, 0) = 130
So the port was removed from the actual request, and what HTTPie shows is wrong in this case. It works with non-default ports, so the issue only affects http with port 80 and https with port 443.
It confused me while I was debugging custom Host header matching logic in which I had forgotten to match/ignore the port.
The printed header and the sent header should match, so there's no confusion about what is really sent.
Workaround: if you want the port in the Host header, you have to set it explicitly with the header request item argument.
Thanks for the report, @thetuxkeeper! The inconsistency between the reported and the actual Host value is a bug.
Normalizing away the default protocol port is quite common. But we should document it together with the possible explicit Host overwrite. (We just need to make sure the overwrite works with https:// URLs as well.)
Relevant spec:
A "host" without any trailing port information implies the default port for the service requested (e.g., "80" for an HTTP URL). — https://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.23
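For anyone writing Host matching logic like the one mentioned in the report, the rule quoted above can be sketched roughly as follows. This is only an illustration: the names DEFAULT_PORTS, split_host_header and same_host are made up for this example and are not part of HTTPie.

```python
# Hypothetical helper names; not part of HTTPie or any library.
DEFAULT_PORTS = {'http': 80, 'https': 443}


def split_host_header(value, scheme):
    """Return (host, port) for a Host header value.

    A missing port is filled in with the default port of the scheme.
    """
    value = value.strip()
    if value.startswith('['):
        # IPv6 literal such as "[::1]" or "[::1]:8080".
        host, _, rest = value.partition(']')
        host += ']'
        port_text = rest[1:] if rest.startswith(':') else ''
    else:
        host, _, port_text = value.partition(':')
    port = int(port_text) if port_text else DEFAULT_PORTS[scheme]
    # Host names are case-insensitive, so compare them in lower case.
    return host.lower(), port


def same_host(a, b, scheme='http'):
    """Compare two Host header values, ignoring an explicit default port."""
    return split_host_header(a, scheme) == split_host_header(b, scheme)
```

With this, same_host('localhost:80', 'localhost') is true for http, while same_host('localhost:8080', 'localhost') is not.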
@jakubroztocil : Thanks for the fast response! Yes, the removal of the default port is usually expected. Just the inconsistency bug was irritating.
I was debugging "strange" requests with the default port in the Host header, which seems to be fairly common when proxies or something similar are involved. I couldn't find the bug until a colleague tested and reproduced it with curl ...
I think a simple fix can be applied to httpie/cli/argparser.py.
UPDATE: It breaks on IPv6. I haven't looked into that yet.
❯ git diff --cached argparser.py
diff --git a/httpie/cli/argparser.py b/httpie/cli/argparser.py
index 720e70b..a8963f9 100644
--- a/httpie/cli/argparser.py
+++ b/httpie/cli/argparser.py
@@ -6,6 +6,7 @@ import sys
 from argparse import RawDescriptionHelpFormatter
 from textwrap import dedent
 from urllib.parse import urlsplit
+from urllib.parse import urlparse
 from requests.utils import get_netrc_auth
@@ -133,6 +134,14 @@ class HTTPieArgumentParser(argparse.ArgumentParser):
             else:
                 self.args.url = scheme + self.args.url
+        urlscheme = urlparse(self.args.url).scheme
+        urlport = urlparse(self.args.url).port
+
+        if urlscheme == 'https' and urlport == 443 \
+                or urlscheme == 'http' and urlport == 80:
+            self.args.url = self.args.url.replace( ":" + str(urlport), '')
+
+
     # noinspection PyShadowingBuiltins
     def _print_message(self, message, file=None):
         # Sneak in our stderr/stdout.
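One weakness of the string replace in the patch above is that str.replace removes every occurrence of ":80" or ":443" in the URL, including any in the path or query. A possible variation, as a minimal sketch only, rebuilds just the network location instead. The function name strip_default_port and the DEFAULT_PORTS table are invented for this example, and whether this addresses the IPv6 breakage mentioned above has not been checked against HTTPie itself.

```python
from urllib.parse import urlsplit

# Hypothetical names; this is not the code that ended up in HTTPie.
DEFAULT_PORTS = {'http': 80, 'https': 443}


def strip_default_port(url):
    """Drop an explicit default port from the netloc, leaving the rest alone."""
    parts = urlsplit(url)
    try:
        port = parts.port
    except ValueError:
        # Non-numeric or out-of-range port: leave the URL untouched.
        return url
    if port is None or DEFAULT_PORTS.get(parts.scheme) != port:
        return url
    # The port is the text after the last ':' of the netloc, which also
    # holds for bracketed IPv6 literals such as "[::1]:80".
    netloc_without_port = parts.netloc.rpartition(':')[0]
    return parts._replace(netloc=netloc_without_port).geturl()
```

For example, strip_default_port('http://localhost:80/foo:80') returns 'http://localhost/foo:80', and 'http://localhost:8080/' is returned unchanged.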