Routing jobs by something other than path
Routing jobs by path can be painful, even harmful. Here's why:
For instance, if we were to deploy WordPress to Racetrack, we'd need to serve it at a non-root path, i.e. /pub/job/wordpress-job/0.0.2/. Racetrack requires this because PUB routes requests by path.
However, WordPress doesn't seem to work with a non-root base path (or at least I wasn't able to make it work).
This is not just about WordPress. Drupal commits the same sin. I was finally able to deploy it to Racetrack by doing some weird HTML output rewriting, which is unreliable and can't be generalized.
Furthermore, a job can be accessed by its exact version /pub/job/adder/0.0.2/, by an alias /pub/job/adder/latest/, or by a wildcard /pub/job/adder/0.0.x/, so what we really need is to serve a job at the wildcarded URL path /pub/job/adder/*/. Again, common applications don't support this.
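To make the three ways of addressing a version concrete, here is a minimal Python sketch of how a version segment could be resolved against the deployed versions. It is not Racetrack's actual resolver: the function name `resolve_version` and the rule "pick the highest matching version" are assumptions made for illustration only.

```python
from typing import List, Optional, Tuple


def _sort_key(version: str) -> Tuple[int, ...]:
    """Turn "0.0.2" into (0, 0, 2) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def resolve_version(requested: str, available: List[str]) -> Optional[str]:
    """Resolve an exact version, the "latest" alias or an "x" wildcard.

    Assumption: when several versions match, the highest one wins.
    """
    if requested == "latest":
        candidates = list(available)
    else:
        wanted = requested.split(".")
        candidates = [
            version
            for version in available
            if len(version.split(".")) == len(wanted)
            and all(w == "x" or w == part for w, part in zip(wanted, version.split(".")))
        ]
    return max(candidates, key=_sort_key, default=None)


deployed = ["0.0.1", "0.0.2", "0.1.0"]
print(resolve_version("0.0.2", deployed))
print(resolve_version("0.0.x", deployed))
print(resolve_version("latest", deployed))
```

Whatever the exact rule is, the same job ends up reachable under several different path prefixes, which is what a typical application with a single configured base path cannot handle.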
The only cure I see is not to do path routing at all. Instead, jobs could be routed by hostname.
That is: the hostname adder-0-0-2.pub.dev-racetrack-cluster.example.com, which is part of the HTTP request, could be parsed by PUB.
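As a rough illustration of what parsing the hostname could look like, here is a small Python sketch. It assumes the first label has the layout `<job-name>-<major>-<minor>-<patch>` (as in adder-0-0-2) and that the base domain is known up front; `parse_job_host` and the label layout are hypothetical, not an existing PUB API.

```python
import re
from typing import Optional, Tuple

# Assumed label layout: <job-name>-<major>-<minor>-<patch>, e.g. "adder-0-0-2".
# The job name itself may contain hyphens, so the last three numbers are the version.
_LABEL = re.compile(
    r"^(?P<name>[a-z0-9-]+?)-(?P<major>\d+)-(?P<minor>\d+)-(?P<patch>\d+)$"
)


def parse_job_host(host: str, base_domain: str) -> Optional[Tuple[str, str]]:
    """Extract (job name, version) from a Host header value, or None if it doesn't match."""
    hostname = host.split(":", 1)[0].lower().rstrip(".")  # drop an optional port
    suffix = "." + base_domain.lower()
    if not hostname.endswith(suffix):
        return None
    match = _LABEL.match(hostname[: -len(suffix)])
    if match is None:
        return None
    version = ".".join(match.group("major", "minor", "patch"))
    return match.group("name"), version


base = "pub.dev-racetrack-cluster.example.com"
print(parse_job_host("adder-0-0-2." + base, base))
print(parse_job_host("drupal-job-0-0-2." + base, base))
```

Dots can't be used inside a single DNS label, which is why the version is written with hyphens here. Aliases such as latest would need their own convention, which this sketch leaves out.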
I know this requires a lot of effort from the infrastructure providers (k8s ingress, wildcard HTTPS certificates), and local testing might become more complex with the addition of a local DNS server. However, Racetrack would become wide open to nearly any kind of application.
(This is not urgent; it's rather a nice-to-have feature.)
Path rewriting breaks front-end apps
Here's a detailed explanation of why path rewriting causes the problem. The problem is common to many front-end apps, like Drupal, WordPress or pgAdmin.
- A job is hosted by Pub at the base URL /pub/job/drupal-job/0.0.2/ (to distinguish it from other jobs).
- The target application (e.g. Drupal) expects requests at the root path, so the docker proxy job rewrites the URL and trims off the prefix, transforming /pub/job/drupal-job/0.0.2/index.html into /index.html, and passes the request to the target container.
- The target front-end app doesn't know it's being proxied and generates absolute paths in its HTML content: <link rel="stylesheet" href="/assets/style.css" />
- The HTML page goes back to the user's browser, which tries to load the CSS file, so it makes another request to Pub for /assets/style.css
- Pub rejects that request with 404 Not Found, as it's not a valid job URL.
Note that while the issue occurs in almost any kind of front-end app, there are exceptions (like Sphinx or Hugo) that are "proxy-friendly" and work well thanks to using relative paths inside their HTML/JS content.
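The difference between absolute and relative paths can be reproduced in a few lines of Python. The sketch below resolves an href the way a browser would (using the standard `urllib.parse.urljoin`) and then applies the prefix trimming described above. The helper names and the https scheme are assumptions made for illustration; this is not the actual Pub or proxy code.

```python
from typing import Optional
from urllib.parse import urljoin, urlsplit

JOB_PREFIX = "/pub/job/drupal-job/0.0.2"
# URL of the page as the browser sees it (scheme assumed for the example).
PAGE_URL = "https://racetrack.dev.example.com" + JOB_PREFIX + "/index.html"


def strip_job_prefix(path: str, prefix: str = JOB_PREFIX) -> Optional[str]:
    """Return the path forwarded to the target container, or None if it isn't a job URL."""
    if path == prefix or path.startswith(prefix + "/"):
        return path[len(prefix):] or "/"
    return None


def forwarded_path(href: str) -> Optional[str]:
    """Resolve an href found in the page like a browser does, then route it."""
    absolute_url = urljoin(PAGE_URL, href)
    return strip_job_prefix(urlsplit(absolute_url).path)


# Absolute path: resolves to /assets/style.css on the host, outside the job prefix.
print(forwarded_path("/assets/style.css"))  # None, i.e. 404 Not Found
# Relative path: stays under the job prefix and reaches the container.
print(forwarded_path("assets/style.css"))   # /assets/style.css
```

The relative href keeps the job prefix because the browser resolves it against the page's own URL, which is why "proxy-friendly" apps survive the rewriting.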
In general, to address this issue, we need to find a way of putting additional information into the HTTP request that allows Racetrack to identify the job and dispatch the request to the appropriate place. An HTTP request usually looks like this:
GET /pub/job/adder/0.0.3/docs HTTP/2
Host: racetrack.dev.example.com
accept-encoding: deflate, gzip, br, zstd
accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7
accept-language: en-US,en;q=0.9,pl;q=0.8,da;q=0.7
cache-control: max-age=0
cookie: racetrack_sessionid=***; X-Racetrack-Auth=***
sec-ch-ua: "Google Chrome";v="119", "Chromium";v="119", "Not?A_Brand";v="24"
sec-ch-ua-mobile: ?0
sec-ch-ua-platform: "Linux"
sec-fetch-dest: document
sec-fetch-mode: navigate
sec-fetch-site: same-origin
sec-fetch-user: ?1
upgrade-insecure-requests: 1
user-agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36
Possible options:
- Path: "/pub/job/adder/0.0.3/docs" - This is the current approach. It isn't perfect, though, as described above.
- Host name: "Host: adder-0-0-2.pub.dev-racetrack-cluster.example.com" - Has certificate difficulties, as described above.
- Host port: "Host: racetrack.dev.example.com:8565" - Each job would get a unique port number. It should work even with HTTPS, but the infrastructure target has to support opening new ports on demand (e.g. a new port in Kubernetes' Ingress Controller, plus a new port opened on Cloudflare and the firewalls). One drawback is that users don't know which job they're really looking at, as all they see is a random port number like 8565.
- Cookie: "cookie: job=adder:0.0.3" - Bad idea. A cookie persists between requests, but it is shared across browser tabs, so it won't allow opening more than one job in separate tabs.
- Custom header: "X-Job: adder:0.0.3" - Bad idea. The header is dropped as soon as the browser makes a new request.
- Additional query param: "/index.html?racetrack_job=adder:0.0.3" - It might break when the browser redirects you to another absolute URL, dropping this query param.
Currently, the best way I see is to go with hostname subdomains. This would be an optional feature for extraordinary apps, working only with the infrastructure targets that support it (local Docker doesn't even know about hostnames). Two steps would be required:
- Ask your Kubernetes operator to create a subdomain for you. Infrastructure targets would be responsible for issuing subdomains. This would be a lengthy step that has to be done outside of Racetrack.
- Claim the hostname in the job manifest - a separate field that instructs Racetrack to route requests to the job whenever this hostname occurs.
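A minimal Python sketch of how such optional hostname routing could sit next to the existing path routing is shown below. The `CLAIMED_HOSTS` table, the example hostname drupal.dev.example.com and the `route` function are hypothetical; in practice the table would be built from whatever manifest field ends up being introduced.

```python
from typing import Dict, Optional

# Hypothetical table built from job manifests: claimed hostname -> "name:version".
CLAIMED_HOSTS: Dict[str, str] = {
    "drupal.dev.example.com": "drupal-job:0.0.2",
}


def route(host: str, path: str) -> Optional[str]:
    """Pick the job for a request: a claimed hostname wins, else fall back to the path."""
    hostname = host.split(":", 1)[0].lower()  # ignore an optional port
    job = CLAIMED_HOSTS.get(hostname)
    if job is not None:
        return job
    # Path routing as it works today: /pub/job/<name>/<version>/...
    parts = path.split("/")
    if len(parts) >= 5 and parts[1:3] == ["pub", "job"] and parts[3] and parts[4]:
        return f"{parts[3]}:{parts[4]}"
    return None


print(route("drupal.dev.example.com", "/assets/style.css"))
print(route("racetrack.dev.example.com", "/pub/job/adder/0.0.3/docs"))
print(route("racetrack.dev.example.com", "/assets/style.css"))
```

With a claimed hostname, the request for /assets/style.css still reaches the right job, while jobs that don't claim a hostname keep being routed by path exactly as before.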