HostPort needs updates for spec compliance

Open joakime opened this issue 3 years ago • 10 comments

Jetty version(s) 9.4+

Java version/vendor (use: java -version) All

OS type/version All

Description

Currently HostPort allows odd authorities that make no sense.

Host: :9999 
Host: -
Host: -:8888
Host: *
Host: *:2222
Host: *.eclipse.org

I think HostPort should validate the host portion a bit more, to reject these nonsense hosts with a 400 Bad Request.

The HTTP spec at https://datatracker.ietf.org/doc/html/rfc7230#section-5.4 says that the uri-host for http is detailed in https://datatracker.ietf.org/doc/html/rfc7230#section-2.7.1 , which in turn points to host in https://datatracker.ietf.org/doc/html/rfc3986#section-3.2.2

which has the following ABNF ...

  uri-host    = <host, see [RFC3986], Section 3.2.2>
  host        = IP-literal / IPv4address / reg-name
  IP-literal  = "[" ( IPv6address / IPvFuture  ) "]"
  IPv6address =                            6( h16 ":" ) ls32
                  /                       "::" 5( h16 ":" ) ls32
                  / [               h16 ] "::" 4( h16 ":" ) ls32
                  / [ *1( h16 ":" ) h16 ] "::" 3( h16 ":" ) ls32
                  / [ *2( h16 ":" ) h16 ] "::" 2( h16 ":" ) ls32
                  / [ *3( h16 ":" ) h16 ] "::"    h16 ":"   ls32
                  / [ *4( h16 ":" ) h16 ] "::"              ls32
                  / [ *5( h16 ":" ) h16 ] "::"              h16
                  / [ *6( h16 ":" ) h16 ] "::"

  ls32        = ( h16 ":" h16 ) / IPv4address
                  ; least-significant 32 bits of address

  h16         = 1*4HEXDIG
                  ; 16 bits of address represented in hexadecimal
  IPvFuture   = "v" 1*HEXDIG "." 1*( unreserved / sub-delims / ":" )
  IPv4address = dec-octet "." dec-octet "." dec-octet "." dec-octet
  dec-octet   = DIGIT                 ; 0-9
                  / %x31-39 DIGIT         ; 10-99
                  / "1" 2DIGIT            ; 100-199
                  / "2" %x30-34 DIGIT     ; 200-249
                  / "25" %x30-35          ; 250-255
  reg-name    = *( unreserved / pct-encoded / sub-delims )
  unreserved  = ALPHA / DIGIT / "-" / "." / "_" / "~"
  pct-encoded = "%" HEXDIG HEXDIG
  sub-delims  = "!" / "$" / "&" / "'" / "(" / ")"
                  / "*" / "+" / "," / ";" / "="
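As a rough illustration of the IPv4address and dec-octet rules in the ABNF above, here is a minimal sketch in Java. The class name Ipv4Check and its method are hypothetical and are not part of Jetty or HostPort; the regular expression is just one way to transcribe the grammar.

```java
import java.util.regex.Pattern;

/** Hypothetical helper (not Jetty API): matches the RFC 3986 IPv4address rule. */
final class Ipv4Check
{
    // dec-octet: 0-9 / 10-99 / 100-199 / 200-249 / 250-255, so no leading zeros
    private static final String DEC_OCTET = "(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])";

    // IPv4address = dec-octet "." dec-octet "." dec-octet "." dec-octet
    private static final Pattern IPV4 = Pattern.compile(DEC_OCTET + "(?:\\." + DEC_OCTET + "){3}");

    private Ipv4Check()
    {
    }

    static boolean isIPv4address(String host)
    {
        // matches() requires the whole input to match, not just a prefix
        return host != null && IPV4.matcher(host).matches();
    }
}
```

Note that a string failing this check, such as 256.0.0.1, is not automatically an invalid host: it consists only of digits and dots, so it still satisfies reg-name.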

Optionally, this validation could exist as a RejectInvalidAuthorityCustomizer (like proposed in PR #7251).

joakime Dec 13 '21 20:12

RFC3986 has been updated for IPv6 with zone literals in https://datatracker.ietf.org/doc/html/rfc6874

Support for IPv4 within IPv6 is documented in the obsolete https://tools.ietf.org/html/rfc2732#section-2 but I cannot find IPv4 within IPv6 listed as a supported feature in the up-to-date RFCs for the HTTP or URI specs.

joakime Dec 14 '21 01:12

Per reg-name the characters not allowed are ...

 / \ : @ ^ [ ] { } < > # | " `
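A character-level reg-name check along these lines could look like the following sketch. RegNameCheck and isRegName are hypothetical names used for illustration only; this is not the code in HostPort or in any Jetty PR.

```java
/** Hypothetical helper (not Jetty API): tests a host against the RFC 3986 reg-name rule. */
final class RegNameCheck
{
    // unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
    private static final String UNRESERVED_PUNCT = "-._~";
    // sub-delims = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="
    private static final String SUB_DELIMS = "!$&'()*+,;=";

    private RegNameCheck()
    {
    }

    static boolean isRegName(String host)
    {
        if (host == null)
            return false;
        int len = host.length();
        for (int i = 0; i < len; i++)
        {
            char c = host.charAt(i);
            if (isAlpha(c) || isDigit(c) || UNRESERVED_PUNCT.indexOf(c) >= 0 || SUB_DELIMS.indexOf(c) >= 0)
                continue;
            // pct-encoded = "%" HEXDIG HEXDIG
            if (c == '%' && i + 2 < len && isHex(host.charAt(i + 1)) && isHex(host.charAt(i + 2)))
            {
                i += 2;
                continue;
            }
            return false; // anything else, including control and non-ASCII characters
        }
        return true; // reg-name is *( ... ), so the empty string also matches
    }

    private static boolean isAlpha(char c)
    {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    private static boolean isDigit(char c)
    {
        return c >= '0' && c <= '9';
    }

    private static boolean isHex(char c)
    {
        return isDigit(c) || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F');
    }
}
```

One consequence worth noting: "*" is a sub-delim and "-" is unreserved, so hosts such as * , - and *.eclipse.org pass a pure character check like this one. Rejecting them would need rules beyond the reg-name grammar.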

joakime Dec 14 '21 02:12

How far do we go?

Are these considered valid?

*.eclipse.org:888
[this:is:odd]:888
how:about:this:too

joakime Dec 14 '21 02:12

Opened PR #7279 to conform to the specs. It performs Host validation that only ensures the host follows reg-name, simply by rejecting host names that contain characters outside the generous reg-name character set.

joakime Dec 14 '21 14:12

Along with the general cleanup, we should standardize our use of "no port".

Since we go into and out of the java.net.URI class often, we should use its definition of "no port" and make it a constant of value -1 in HostPort, such as HostPort.NO_PORT.

A value of 0 (zero) is something I think we should avoid, as it's not a port we can connect to.

Here's a demo of the behavior of java.net.URI ...

package uri;

import java.net.URI;
import java.net.URISyntaxException;

public class UriBadPortTest
{
    public static void main(String[] args)
    {
        dumpUri("http://host:0/path");
        dumpUri("http://host:-1/path");
        dumpUri("http://host:-2/path");
        dumpUri("http://host:/path");
        dumpUri("http://host: 777 /path");
    }

    private static void dumpUri(String raw)
    {
        try
        {
            URI uri = new URI(raw);
            System.out.printf("Raw \"%s\" -> host:\"%s\" port:%d%n", raw, uri.getHost(), uri.getPort());
        }
        catch (URISyntaxException e)
        {
            System.out.printf("Invalid URI \"%s\": (%s) %s%n",
                raw, e.getClass().getName(), e.getMessage()
            );
        }
    }
}

Results in the output ...

Raw "http://host:0/path" -> host:"host" port:0
Raw "http://host:-1/path" -> host:"null" port:-1
Raw "http://host:-2/path" -> host:"null" port:-1
Raw "http://host:/path" -> host:"host" port:-1
Invalid URI "http://host: 777 /path": (java.net.URISyntaxException) Illegal character in authority at index 7: http://host: 777 /path
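To make the "no port" convention concrete, here is a minimal sketch of port parsing that returns -1 when the port is absent or empty, mirroring what java.net.URI reports. PortParser, parsePort and this NO_PORT constant are hypothetical names for illustration and are not the actual HostPort code. The sketch accepts 0 as a parsed value; whether 0 should be rejected outright is left open here.

```java
/** Hypothetical helper (not Jetty API): parses the port portion of an authority. */
final class PortParser
{
    /** Same "no port" value that java.net.URI.getPort() returns. */
    static final int NO_PORT = -1;

    private PortParser()
    {
    }

    static int parsePort(String rawPort)
    {
        // A missing port ("host") or an empty port ("host:") both mean "no port"
        if (rawPort == null || rawPort.isEmpty())
            return NO_PORT;

        int port = 0;
        for (int i = 0; i < rawPort.length(); i++)
        {
            char c = rawPort.charAt(i);
            // port = *DIGIT, so signs, commas, letters and whitespace are all invalid
            if (c < '0' || c > '9')
                throw new IllegalArgumentException("Bad port: " + rawPort);
            port = port * 10 + (c - '0');
            // Checking inside the loop avoids int overflow on very long digit strings
            if (port > 65535)
                throw new IllegalArgumentException("Port too large: " + rawPort);
        }
        return port;
    }
}
```

Parsing digit by digit, rather than calling Integer.parseInt, keeps inputs like "-80" and "+80" from being accepted as signed numbers.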

joakime Jul 14 '22 13:07

Some other things that HostPort doesn't support very well ...

(Presented as tests of valid authorities in the HostPortTest)

  • IDN values
              // IDN example
              Arguments.of("пример.рф", "пример.рф", null),
              Arguments.of("пример.рф:8888", "пример.рф", 8888)
    
  • IPv6 with zone identifiers
              // Examples of IPv6 with zone identifier from https://datatracker.ietf.org/doc/html/rfc6874
              Arguments.of("[fe80::a%en1]", "[fe80::a%en1]", null),
              Arguments.of("[fe80::a%25ee1]", "[fe80::a%25ee1]", null),
              Arguments.of("[fe80::a%en1]:7777", "[fe80::a%en1]", 7777),
              Arguments.of("[fe80::a%25ee1]:7777", "[fe80::a%25ee1]", 7777),
    
  • Scheme based normalization
              // scheme based normalization https://datatracker.ietf.org/doc/html/rfc3986#section-6.2.3
              Arguments.of("host:", "host", null),
              Arguments.of("127.0.0.1:", "127.0.0.1", null),
    
  • Whitespace in weird places
              // Whitespace
              Arguments.of("host   ", "host", null),
              Arguments.of("    host   ", "host", null),
              Arguments.of("host   :777", "host", 777),
              Arguments.of("    host   :777", "host", 777),
              Arguments.of("    host   :   777", "host", 777),
              Arguments.of("    host   :777   ", "host", 777),
              Arguments.of("    host   :   777   ", "host", 777),
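
The expectations listed above (bracketed IPv6 hosts kept intact, an empty port treated as no port, surrounding whitespace ignored) could be met by a splitting step like the following sketch. AuthoritySplitter and split are hypothetical names, not the HostPort implementation, and the sketch only separates host from port; it does not validate the contents of either part.

```java
/** Hypothetical helper (not Jetty API): splits an authority into host and raw port. */
final class AuthoritySplitter
{
    private AuthoritySplitter()
    {
    }

    /** Returns {host, port}; port is null when absent or empty. */
    static String[] split(String authority)
    {
        if (authority == null)
            throw new IllegalArgumentException("Null authority");
        String a = authority.trim();
        String host;
        String port = "";
        if (a.startsWith("["))
        {
            // IP-literal: everything up to the closing bracket is the host, colons included
            int close = a.indexOf(']');
            if (close < 0)
                throw new IllegalArgumentException("Unclosed IP-literal: " + authority);
            host = a.substring(0, close + 1);
            String rest = a.substring(close + 1).trim();
            if (!rest.isEmpty())
            {
                if (rest.charAt(0) != ':')
                    throw new IllegalArgumentException("Unexpected text after IP-literal: " + authority);
                port = rest.substring(1).trim();
            }
        }
        else
        {
            // Outside brackets at most one ':' is allowed, separating host from port
            int colon = a.indexOf(':');
            if (colon != a.lastIndexOf(':'))
                throw new IllegalArgumentException("Multiple ':' in authority: " + authority);
            host = colon < 0 ? a : a.substring(0, colon).trim();
            if (colon >= 0)
                port = a.substring(colon + 1).trim();
        }
        if (host.isEmpty())
            throw new IllegalArgumentException("Empty host: " + authority);
        return new String[] {host, port.isEmpty() ? null : port};
    }
}
```

Because the bracket contents are not inspected, an input like [pecorino:romano] still comes back as a host here and would have to be rejected by a later IP-literal check. String.trim() also removes control characters at or below U+0020, which may be more lenient than desired.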
    

joakime Jul 14 '22 13:07

I also think that all of the following should be reported as Invalid Authorities.

    private static Stream<Arguments> invalidAuthorityProvider()
    {
        return Stream.of(
                // Empty / Null / Blank authority
                null,
                "", // TODO: if addressing edge case with absolute-uri and empty Host header (Issue #7278)
                "    ", // TODO: if addressing edge case with absolute-uri and empty Host header (Issue #7278)
                // Invalid Ports
                "-:-",
                "host:xxx",
                "127.0.0.1:xxx",
                "[0::0::0::0::1]:xxx",
                "host:-80",
                "127.0.0.1:-80", // negative port
                "[0::0::0::0::1]:-80", // negative port
                "127.0.0.1:65536", // port too big
                "jetty.eclipse.org:88007386429567428956488", // port too big
                "jetty.eclipse.org:22,333", // port with commas
                // Empty / Null / Blank Hosts
                ":",
                ":44",
                "::",
                // Bad quoting
                "'eclipse.org:443'",
                "eclipse.org:443\"", // bad end quoting that made it through
                "':88'",
                // Bad Host Names (invalid IP-Literals)
                "[jetty.eclipse.org]:80", // invalid/mimic ipv6 with port
                "[sheep:cheese:of:rome]:80", // invalid/mimic ipv6 with port
                "[pecorino:romano]", // invalid/mimic ipv6 without port
                // Bad Host Names (invalid reg-name) - note: an invalid IPv4address looks like a reg-name
                "this:that:or:the:other.com:222",  // multiple ':' with port
                "and:also:th.is",  // multiple ':' without port
                // reg-name identified invalid printable characters - / \ : @ ^ [ ] { } < > # | " `
                "a/slash.com",
                "a\\backslash.edu",
                "[email protected]",
                "a^caret.net",
                "some[arbitrary]brackets.org",
                "more{curly}braces.io",
                "html<elements>here.au",
                "hash#octothor.pe",
                "ceci-n'est-pas-une|pipe.fr",
                "we-sell-\"quotes\".com",
                "back`ticks`bbq.au",
                // reg-name invalid control characters
                "how\ttabulous.net",
                "null\u0000.com",
                "bell-\u0007-tolls.edu",
                "del-\u007F-mar.au"
            )
            .map(Arguments::of);
    }

    @ParameterizedTest
    @MethodSource("invalidAuthorityProvider")
    public void testInvalidAuthority(String authority)
    {
        assertThrows(IllegalArgumentException.class, () ->
        {
            new HostPort(authority);
        });
    }

joakime Jul 14 '22 13:07

This issue has been automatically marked as stale because it has been a full year without activity. It will be closed if no further activity occurs. Thank you for your contributions.

github-actions[bot] Jul 15 '23 00:07