squid icon indicating copy to clipboard operation
squid copied to clipboard

Bug 4261: Reject configurations with conflicting http(s)_ports

Open hfisiteimprove opened this issue 3 years ago • 11 comments

Here, the two listening port configurations are in conflict when they specify the same listening IP:port address.

hfisiteimprove avatar Mar 31 '21 07:03 hfisiteimprove

Can one of the admins verify this patch?

squid-prbot avatar Mar 31 '21 07:03 squid-prbot

@hfisiteimprove, thank you very much for working on this useful improvement! We do want to reject configurations with conflicting port configurations.

I share the primary concern voiced by Amos -- this PR should check for conflicting listening addresses rather than port names or protocols. Please refactor accordingly.

BTW, rejecting duplicated *_port name=... values is also a good idea, but that is not what this PR focus and primary value are.

Amos: 1. special IP 0.0.0.0 conflicts with any IPv4 on the same port number

I assume the following configuration should accepted as valid in some environments. @yadij, please correct me if I am wrong.

# OK: Special http_port for local cache manager queries only
http_port 127.0.0.1:3128 ...
# OK: We listen  for traffic on all other available IP addresses, whatever they are
http_port 0.0.0.0:3128 ...

If my assumption is correct, then listening conflicts between specific addresses and wildcards should be out of this PR scope because many of them cannot be reliably detected at configuration time without a lot of complex (and unnecessary IMO) work. They should be detected at runtime, resulting in Squid startup death, but that is a different issue.


P.S. @hfisiteimprove, please be extra careful with the Ip::Address APIs. A few of them are badly designed and will confuse uninitiated developers. For example, do not assume that a C++ equality operator tests for Ip::Address object equality. Always check the actual method implementation before relying on its expected or even documented semantics.

rousskov avatar Mar 31 '21 15:03 rousskov

I polished PR title/description to fix formatting and to document the primary effect of the change, but they may need more work depending on how we end up defining "conflicting" for this PR purposes.

rousskov avatar Mar 31 '21 15:03 rousskov

# OK: Special http_port for local cache manager queries only
http_port 127.0.0.1:3128 ...
# OK: We listen  for traffic on all other available IP addresses, whatever they are
http_port 0.0.0.0:3128 ...

Hmm. Its a long twisted path of indirect flag settings, but it looks like this should be working for most Squid builds. However the existence of this bug 4261 and many user complaints indicates that something is not happening like it should.

To open a port like that requires that the explicit-IP:port listener be opened and sets SO_REUSEADDR before the wildcard-IP:port is opened.

Doing a deep code inspection:

  • comm_open_listener() calls comm_openex() with the configured port number.
  • comm_openx() calls commSetReuseAddr() when flag COMM_REUSEADDR set or given a port number.
  • commSetReuseAddr() uses opt_reuseaddr to decide whether to set the flag or do nothing.
  • opt_reuseaddr is enabled by default globally, except when "squid -R" is used on the command line.

yadij avatar Mar 31 '21 21:03 yadij

IMO, it does not matter (for this PR) whether sharing specific and wildcard addresses works today (including "works well", "does not work at all", "only works on Windows", etc.). This PR is about rejecting things that can never ever work and making that decision based on basic configuration analysis alone. In other words, roughly speaking, this is a "configuration validation" PR rather than a "listening code improvements" or "deep analysis of socket flags and bind() APIs" PR.

According to my quick script tests, recollection, and googling:

  • specific/wildcard address sharing that I described works (without any special flags) on modern Windows;
  • specific/wildcard address sharing that I described does not work (with or without SO_REUSEADDR) on Unix
  • SO_REUSEADDR is irrelevant (on Unix) -- it is meant for a different kind of reuse; you may have been thinking about SO_REUSEPORT

However, again, all these low-level details do not matter here IMO -- we do not need to spend time developing consensus regarding them. Comparing IPs:port addresses is what we want AFAICT.

# OK: Special http_port for local cache manager queries only
http_port 127.0.0.1:3128 ...
# OK: Trying to listen for traffic on all other available IP addresses, whatever they are
http_port 0.0.0.0:3128 ...
# Error: Two _port definitions with the same IP:port address (0.0.0.0)
http_port 3128 ...

rousskov avatar Apr 01 '21 16:04 rousskov

IMO, it does not matter (for this PR) whether sharing specific and wildcard addresses works today (including "works well", "does not work at all", "only works on Windows", etc.). This PR is about rejecting things that can never ever work and making that decision based on basic configuration analysis alone.

IMO it matters because we do get user requests to configure Squid that way and the use-case seems reasonable at face-value. If it was at some point supposed to work we should not arbitrarily forbid it here without at least allowing for a future fix.

In other words, roughly speaking, this is a "configuration validation" PR rather than a "listening code improvements" or "deep analysis of socket flags and bind() APIs" PR.

Agreed on that being the scope of changes.

According to my quick script tests, recollection, and googling:

* specific/wildcard address sharing that I described [works](https://stackoverflow.com/a/33622121) (without any special flags) on modern Windows;

Then your second config line below should be "OK" on Windows (native?) builds ...

* specific/wildcard address sharing that I described does not work (with or without SO_REUSEADDR) on Unix

... and "Error" on all other builds.

* SO_REUSEADDR is [irrelevant](https://blog.cloudflare.com/its-crowded-in-here/#quiz-3) (on Unix) -- it is meant for a different kind of reuse; you may have been thinking about SO_REUSEPORT

No the stackexchange discussion and references I found the other day were definitely only about SO_REUSEADDR. I cannot find them again today to check if listen() was part of it.

However, again, all these low-level details do not matter here IMO -- we do not need to spend time developing consensus regarding them. Comparing IPs:port addresses is what we want AFAICT.

# OK: Special http_port for local cache manager queries only
http_port 127.0.0.1:3128 ...
# OK: Trying to listen for traffic on all other available IP addresses, whatever they are
http_port 0.0.0.0:3128 ...
# Error: Two _port definitions with the same IP:port address (0.0.0.0)
http_port 3128 ...

yadij avatar Apr 03 '21 11:04 yadij

What I am thinking the check logic should be is:

 for (const auto old ; HttpPortList) {
    if (s->port != old->port)
        continue; // OK: different ports

    if (s->ip == old->ip)
       throw ...; // Error: exactly duplicate IP:port

    if (old->ip.IsAnyAddr())
       throw ...; // Error: explicit IP:port following wildcard line with same port

    if (s->ip.IsAnyAddr() && s->ip == old->ip) {
#if SQUID_WINDOWS
      continue;
#else
      throw ...; // Error: IP + wildcard with same port is not supported
#endif
    }

    // else: OK different IP's and no special-case issues
 }

yadij avatar Apr 03 '21 11:04 yadij

  • specific/wildcard address sharing that I described works (without any special flags) on modern Windows;
  • specific/wildcard address sharing that I described does not work (with or without SO_REUSEADDR) on Unix

Then your second config line below should be "OK" on Windows (native?) builds ... ... and "Error" on all other builds.

It should be OK everywhere, from this PR detection/code point of view. What happens later, at bind() time is out of scope here. The low-level details of what a given platform and a given set of flags can or cannot do are out of this PR scope. This PR should not try to detect every bind() conflict on every platform, in every configuration. Detecting all such conflicts correctly would be very difficult, and the end result would be worse than if we keep this PR focused on simple high-level platform-agnostic checks. The checks we should be after here will not guarantee future bind() successes! They will let a few corner cases slip through, and that is OK.

The only slightly tricky part that I foresee in this PR scope is detecting the difference between "any IP", "any IPv4", and "any IPv6" addresses. That high-level detection is in scope.

# OK: Special http_port for local cache manager queries only
http_port 127.0.0.1:3128 ...

# OK: Trying to listen for traffic on all other available IPv4 addresses, whatever they are
http_port 0.0.0.0:3128 ...

# OK: Trying to listen for traffic on all IPv6 addresses, whatever they are
http_port :::3128 ...

# Error: Multiple _port definitions with the same IP:port address (0.0.0.0)
# Error: Multiple _port definitions with the same IP:port address (::)
http_port 3128 ...

rousskov avatar Apr 03 '21 14:04 rousskov

    if (old->ip.IsAnyAddr())
       throw ...; // Error: explicit IP:port following wildcard line with same port

I expect the decision here to be slightly more complex to allow addresses in different IP families. However, I have not checked whether this IsAnyAddr() check covers all those cases -- I just doubt it does.

#if SQUID_WINDOWS
...

Out of scope. Even on Windows, the correct decision depends on many factors. Let bind() take care of all that low-level complexity.

rousskov avatar Apr 03 '21 14:04 rousskov

The situation is simple enough that some OS permit the explicit+wildcard case. Most do not.

The policy solution is just as simple. Reject the case with "not supported" message (as opposed to "invalid config" or such). Anyone who wants it supported can PR the special-case detection later for their desired OS.

yadij avatar Apr 03 '21 23:04 yadij

The situation is simple enough that some OS permit the explicit+wildcard case. Most do not.

The some/most distinction does not simplify anything AFAICT.

Detecting whether the current environment permits the explicit+wildcard case is not simple. It is not just "Windows vs non-Windows" AFAICT. That is why it should not be done in this supposed-to-be-simple high-level code IMO. However, if somebody writes and tests a high-quality implementation that correctly classifies "permitting" and "not permitting" cases, then I would not object to it going in. I expect it will take a lot of effort (on all sides) to do this well, and that energy is better spent elsewhere IMO.

The policy solution is just as simple. Reject the case with "not supported" message

That sounds like the opposite of what we should do because quickly (but incorrectly) banning a supported use case is much worse than delaying the (correct) ban of unsupported cases.

IMO, since correct detection of those "permitting" environments is difficult and the bind() call will reliably reject unsupported environments later, the configuration checker should simply allow all explicit+wildcard cases instead of working hard to detect whether the current environment is "permitting" those cases.

rousskov avatar Apr 04 '21 22:04 rousskov