Firefox requests hanging with Caddy v2.7.5

Open toby-griffiths opened this issue 7 months ago • 9 comments

We use Caddy for our API Platform application with the Cloudflare DNS, Mercure & Vulcain plugins, but just recently we started having issues with our API in Firefox.

We are using a service worker to intercept fetch requests and attach the Authorization header before sending them. Something is happening during this process that's causing the API requests to just hang, and eventually the browser reports a failure to load. The console reports a CORS error, however we can see the OPTIONS requests going back & forth just fine, and this works fine in all other browsers.

At first we thought it might be related to #5096, but we still experienced the issue after disabling HTTP/3 in the Caddyfile.

Last night we had to extract the Caddy binary from a previous Docker image (v2.7.4 h1:J8nisjdOxnYHXlorUKXY75Gr6iBfudfoGhrJ8t7/flI=) and copy that into our caddy image to get things working again. This was the only change we had to make to resolve the issue, so it would appear that there is something wrong with the v2.7.5 image that only seems to affect Firefox. This previous image was built on 3rd Oct.

Here's our service worker JavaScript (compiled from TypeScript)…

let s = "";
self.addEventListener("fetch", (e) => {
  const n = self.location.hostname.substring(
    0,
    self.location.hostname.length - 14
  ), o = "https://{tenant}.apiv2.goodcrm.co.uk:8843".replace("{tenant}", n);
  if (e.request.url.startsWith(o)) {
    const c = async () => {
      if (e.request.headers.has("Authorization"))
        return fetch(e.request);
      const i = e.request.clone();
      if (s.length > 0) {
        const t = a(e.request, s);
        try {
          const r = await fetch(t);
          if (r.status !== 401)
            return r;
        } catch (r) {
          return Promise.reject(r);
        }
      }
      try {
        const t = await fetch(self.location.protocol + "//" + self.location.host + "/api/getApiToken");
        if (t.status === 200)
          s = (await t.json()).token;
        else
          return Promise.reject(t);
      } catch (t) {
        return Promise.reject(t);
      }
      try {
        const t = a(i, s);
        return await fetch(t);
      } catch (t) {
        return Promise.reject(t);
      }
    };
    return e.respondWith(c());
  } else
    e.request.url.includes("logout") && (s = "");
  return e.respondWith(fetch(e.request));
});
self.onmessage = (e) => {
  e.data === "claimMe" && self.clients.claim();
};
function a(e, n) {
  const o = new Headers(e.headers);
  return o.set("Authorization", `Bearer ${n}`), new Request(e, {
    headers: o
  });
}

During our testing of the issue we discovered that if, instead of cloning the request and adding auth headers, we just tell the worker to make an empty fetch to the same URL, then that request completes rather than hanging, although it returns a 401 because of the missing Authorization header. So perhaps this is an issue with the adding of the Authorization header somehow?
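
For readers comparing the two variants, here is a readable sketch of what the minified helper a(e, n) in the worker does. The name withBearerToken is hypothetical; the logic is meant to mirror the posted code, and it assumes an environment with the Fetch API globals (a service worker, or Node.js 18+).

```javascript
// Readable equivalent of the minified helper `a(e, n)` from the worker.
// `withBearerToken` is a hypothetical name chosen for illustration.
function withBearerToken(request, token) {
  // Copy the existing headers so the original request's headers stay untouched.
  const headers = new Headers(request.headers);
  headers.set("Authorization", `Bearer ${token}`);
  // Building a Request from another Request carries over its method, URL and body.
  return new Request(request, { headers });
}

// Example: the rebuilt request keeps the URL and gains the Authorization header.
const original = new Request("https://example.test/api/items", {
  headers: { Accept: "application/json" },
});
const authorized = withBearerToken(original, "abc123");
console.log(authorized.url, authorized.headers.get("Authorization"));
```

The "empty fetch" variant presumably corresponds to fetching the URL directly without this helper, which would explain the 401: no Authorization header is ever attached.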

Here's the download link we use…

https://caddyserver.com/api/download?os=linux&arch=amd64&p=github.com/dunglas/mercure/caddy&p=github.com/dunglas/vulcain/caddy&p=github.com/caddy-dns/cloudflare

And here is the version reported by this download as of last night…

v2.7.5 h1:HoysvZkLcN2xJExEepaFHK92Qgs7xAiCFydN5x5Hs6Q=

I will also raise an issue with Firefox regarding this, but wanted to raise here since this works fine with v2.7.4 h1:J8nisjdOxnYHXlorUKXY75Gr6iBfudfoGhrJ8t7/flI=.

toby-griffiths avatar Nov 14 '23 10:11 toby-griffiths

Will add the Caddyfile very shortly, as I'm just going into a meeting.

toby-griffiths avatar Nov 14 '23 10:11 toby-griffiths

Here's the Caddyfile we were originally using…

{
    # Debug
    #debug
}

{$SERVER_NAME}

log

tls [redacted email address] {
  dns cloudflare [redacted token]
}

# Matches requests for HTML documents, for static files and for Next.js files,
# except for known API paths and paths with extensions handled by API Platform
@pwa expression `(
        {header.Accept}.matches("\\btext/html\\b")
        && !{path}.matches("(?i)(?:^/docs|^/webhooks/|^/graphql|^/bundles/|^/_profiler|^/_wdt|\\.(?:json|html$|csv$|ya?ml$|xml$))")
    )
    || {path} == "/favicon.ico"
    || {path} == "/manifest.json"
    || {path} == "/robots.txt"
    || {path}.startsWith("/_next")
    || {path}.startsWith("/sitemap")`

route {
    root * /srv/api/public
    mercure {
        # Transport to use (default to Bolt)
        transport_url {$MERCURE_TRANSPORT_URL:bolt:///data/mercure.db}
        # Publisher JWT key
        publisher_jwt {env.MERCURE_PUBLISHER_JWT_KEY} {env.MERCURE_PUBLISHER_JWT_ALG}
        # Subscriber JWT key
        subscriber_jwt {env.MERCURE_SUBSCRIBER_JWT_KEY} {env.MERCURE_SUBSCRIBER_JWT_ALG}
        # Allow anonymous subscribers (double-check that it's what you want)
        anonymous
        # Enable the subscription API (double-check that it's what you want)
        subscriptions
        # Extra directives
        {$MERCURE_EXTRA_DIRECTIVES}
    }
    vulcain
    push

    # Add links to the API docs and to the Mercure Hub if not set explicitly (e.g. the PWA)
    header ?Link `</docs.jsonld>; rel="http://www.w3.org/ns/hydra/core#apiDocumentation", </.well-known/mercure>; rel="mercure"`
    # Disable Google FLOC tracking if not enabled explicitly: https://plausible.io/blog/google-floc
    header ?Permissions-Policy "interest-cohort=()"

    # Comment the following line if you don't want Next.js to catch requests for HTML documents.
    # In this case, they will be handled by the PHP app.
    reverse_proxy @pwa http://{$PWA_UPSTREAM}

    php_fastcgi php:9000
    encode zstd gzip
    file_server
}

We also tried adding these lines just above the {$SERVER_NAME} in an attempt to fix this, as suggested in #5096…

{
  servers {
    protocols h1 h2
  }
}

toby-griffiths avatar Nov 14 '23 10:11 toby-griffiths

Enable debug logging and show logs from those hanging requests.

Try to make a request with curl -v which shows the problem. You can copy a curl command from the browser's network tab. Try to reduce the curl command to only the relevant bits (you can remove headers and try again until it starts working, then you know the last change you made was the likely culprit).
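
The header elimination described above can also be scripted. The following is a minimal sketch, not a tested reproduction of this issue: findCulpritHeaders and the probe callback are hypothetical names, and probe is assumed to be an async function that sends the request with the given headers and resolves to true when it completes. Because the hang reported here is Firefox-specific, a probe run outside the browser may not reproduce it.

```javascript
// Hypothetical helper: removes one header at a time and records which
// removals make the request succeed.
// `probe` is an assumed async function: (headers) => Promise<boolean>.
async function findCulpritHeaders(headers, probe) {
  const culprits = [];
  for (const name of Object.keys(headers)) {
    // Copy every header except the one under test.
    const reduced = Object.fromEntries(
      Object.entries(headers).filter(([key]) => key !== name)
    );
    // If the request works without this header, the header is a suspect.
    if (await probe(reduced)) {
      culprits.push(name);
    }
  }
  return culprits;
}
```

A real probe would issue the request with a timeout and treat a timeout as failure; a header whose removal turns a hang into a response is the likely culprit.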

francislavoie avatar Nov 14 '23 10:11 francislavoie

Also, extract and share the process profile using the instructions here: Profiling Caddy

mohammed90 avatar Nov 14 '23 10:11 mohammed90

Thanks for the info. Will try to find a moment to share this additional info.

toby-griffiths avatar Nov 15 '23 12:11 toby-griffiths

Not sure if it's the same thing, but I'm recently having this issue of "hanging requests" as well. I'm using Caddy 2.7.5 and Firefox. Restarting Caddy helps, but the problem returns after a short while.

muety avatar Nov 19 '23 17:11 muety

> Not sure if it's the same thing, but I'm recently having this issue of "hanging requests" as well. I'm using Caddy 2.7.5 and Firefox. Restarting Caddy helps, but the problem returns after a short while.

As Francis and I shared earlier, we need debug logs and profile.

mohammed90 avatar Nov 19 '23 17:11 mohammed90

Turned out to be an issue with Firefox, rather than with Caddy, for me. In Chrome or using curl, I couldn't reproduce the problem. Also, I figured out that the requests never actually hit the server; they start hanging before that.

I'm using Caddy's push directive to server-push some static assets to the browser on initial page load and it seems like that was problematic in Firefox. With that push mechanism enabled, some requests would get stuck and never return. Also, all subsequent requests to the same domain would hang forever from there on. I'm pretty sure this must be a bug in Firefox, but couldn't find any related reports.

muety avatar Dec 26 '23 08:12 muety

Sorry for the lack of update. We've currently mitigated this by fixing the Caddy server at v2.6.4, but we want to upgrade, so I will try to re-test and provide more info as soon as I can find a moment. The joys of working for a small start-up!

toby-griffiths avatar Feb 06 '24 14:02 toby-griffiths

@toby-griffiths Thanks for circling back -- it's been a couple more months now, any more information?

It sounds like a Firefox bug; and without more information, there's not much we can do to act on this. So I'll close this for now, but if new information surfaces that points to a problem in our code base, we can reopen and investigate.

mholt avatar Apr 24 '24 18:04 mholt