allow customizing success criteria for HTTP service checks
In Nomad service discovery checks, we consider HTTP health checks successful if they return 200 OK or any other status code below 400. We also ignore anything in the body. In https://github.com/hashicorp/nomad/issues/26900#issuecomment-3384301823 it was suggested that it might be useful for operators to accept a different set of codes as success, and that seems safe to allow job authors to do. This would only impact Nomad service discovery; the equivalent for Consul checks would have to be implemented independently by Consul, and then we could add those configuration values to the jobspec.
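To make the proposal concrete, here is a minimal sketch of the success rule, assuming the job author supplies an optional list of accepted status codes. This is not Nomad's actual implementation; the function names and the `accepted` parameter are hypothetical and only illustrate the current default (anything below 400 passes) next to a custom list.

```go
package main

import "fmt"

// defaultSuccess mirrors the behavior described above: any status code
// below 400 counts as a passing check.
func defaultSuccess(status int) bool {
	return status < 400
}

// statusSuccess reports whether status passes the check. accepted is a
// hypothetical job-author-supplied list of codes (not an existing jobspec
// field); when it is empty, the default rule applies.
func statusSuccess(status int, accepted []int) bool {
	if len(accepted) == 0 {
		return defaultSuccess(status)
	}
	for _, code := range accepted {
		if status == code {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(statusSuccess(302, nil))             // default rule: true
	fmt.Println(statusSuccess(302, []int{200, 204})) // custom list: false
	fmt.Println(statusSuccess(204, []int{200, 204})) // custom list: true
}
```

Falling back to the default when no list is given would keep existing jobs behaving exactly as they do today.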
It was also suggested that Nomad could read the HTTP response body (because there are terrible APIs that do things like return 200 OK with a JSON body {"message": "everything is on fire, thanks"}). This feels much riskier, as now we're making the Nomad client parse arbitrary response bodies. So we probably wouldn't want to try this kind of thing.
BTW: body parsing would be intentional by operator and not enabled by default
Under the Nomad security model there are typically at least two personas at play here: the Nomad Operator (who we usually refer to as Job Author) and the Nomad Administrator. The Nomad cluster admin would need to be in control of that configuration, and it would need to happen on the server rather than the client so that we're not placing workloads on clients that can't support it. I'm having trouble imagining a cluster admin setting this option to be enabled cluster-wide unless it was scoped by namespace (outside of single-user use cases), and that starts to stray into the Enterprise governance features. So there's a good bit of feature creep involved in that.
I would enable it on my cluster :D Anyhow, some apps return 200 as healthy but also return a specific message saying whether the app is initialized or not (initial setup). In my mind an app is unhealthy until you can fully use it (200 health check and initialized). Dunno, I just think it would be a good addition to Nomad.
Script health checks would be needed too (currently only supported by Consul for some reason).