rules_docker
rules_docker copied to clipboard
Add the ability to add healthcheck definition for containers
Original issue: https://github.com/bazelbuild/rules_docker/issues/770
Adds follow options for container_image:
- healthcheck_test
- healthcheck_interval
- healthcheck_timeout
- healthcheck_start_period
- healthcheck_retries
Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).
:memo: Please visit https://cla.developers.google.com/ to sign.
Once you've signed (or fixed any issues), please reply here with @googlebot I signed it!
and we'll verify it.
What to do if you already signed the CLA
Individual signers
- It's possible we don't have your GitHub username or you're using a different email address on your commit. Check your existing CLA data and verify that your email is set on your git commits.
Corporate signers
- Your company has a Point of Contact who decides which employees are authorized to participate. Ask your POC to be added to the group of authorized contributors. If you don't know who your Point of Contact is, direct the Google project maintainer to go/cla#troubleshoot (Public version).
- The email used to register you as an authorized contributor must be the email used for the Git commit. Check your existing CLA data and verify that your email is set on your git commits.
- The email used to register you as an authorized contributor must also be attached to your GitHub account.
ℹ️ Googlers: Go here for more info.
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: svagner
To complete the pull request process, please assign davegay after the PR has been reviewed.
You can assign the PR to them by writing /assign @davegay
in a comment when ready.
The full list of commands accepted by this bot can be found here.
Approvers can indicate their approval by writing /approve
in a comment
Approvers can cancel approval by writing /approve cancel
in a comment
@googlebot I signed it!
cc @xingao267 , @alex1545 as you were in the discussion in the original issue
Hiding whitespace changes definitely makes this diff easier to review: https://github.com/bazelbuild/rules_docker/pull/1742/files?w=1
I don't see a reason why this couldn't be merged. The only concern I have is that we're overloading healthcheck_test
to do multiple types of tests (CMD
, CMD-SHELL
, NONE
, etc). Ideally I'd gravitate towards separating these out but I'm not sure that would make sense in this case due to this being a docker-specific rule.
I'm leaning towards thinking this is fine to include but I'd like the opinion of another maintainer just as a sanity check.
I think it's cool that you put this together, but I'm a little wary of this as-is for a couple of reasons:
- It's a bit telling that I've never heard of this feature before today. As far as I can tell, it's not supported in kubernetes but was a feature added on by Docker Inc for swarm specifically. I don't think this feature is going to get much mileage, and it comes at a time when we're trying to simplify this repo rather than add complexity.
- Given a
Dockerfile
only has theHEALTHCHECK
command, why doesrules_docker
now need 5 new options to say the same thing? If there is no parsing library ingoogle/go-containerregistry
(upon which this repo depends) to parse that into the necessary config, then that library/function should probably there upstream before merging this. I don't think we should need more than a single newstring
orstring_list
attributehealthcheck
for this.
I don't see a reason why this couldn't be merged. The only concern I have is that we're overloading
healthcheck_test
to do multiple types of tests (CMD
,CMD-SHELL
,NONE
, etc). Ideally I'd gravitate towards separating these out but I'm not sure that would make sense in this case due to this being a docker-specific rule.I'm leaning towards thinking this is fine to include but I'd like the opinion of another maintainer just as a sanity check.
Yeah. I also thought that probably it's better to have healthcheck_cmd
and healthcheck_cmd_shell
options. But it will lead to more complications:(
I think it's cool that you put this together, but I'm a little wary of this as-is for a couple of reasons:
- It's a bit telling that I've never heard of this feature before today. As far as I can tell, it's not supported in kubernetes but was a feature added on by Docker Inc for swarm specifically. I don't think this feature is going to get much mileage, and it comes at a time when we're trying to simplify this repo rather than add complexity.
- Given a
Dockerfile
only has theHEALTHCHECK
command, why doesrules_docker
now need 5 new options to say the same thing? If there is no parsing library ingoogle/go-containerregistry
(upon which this repo depends) to parse that into the necessary config, then that library/function should probably there upstream before merging this. I don't think we should need more than a single newstring
orstring_list
attributehealthcheck
for this.
Thanks for the feedback. Let me explain my thoughts about it.
- These rules are called
rules_docker
so I believe it means that we should support at least all the main features of docker. I agree that kube doesn't use docker'sHEALTHCHECK
since it has it's ownlivenessProbe
but this plugin isn't dedicated to kube only is it? - I think having
healthcheck_test =" --interval = 5s --timeout = 10s --retries = 3 CMD curl -sS 127.0.0.1 "" will be a little bit confusing for users of these rules since interval / to / retries are optional parameters for health checking right? And unfortunately I don't follow your point about the parsing of this parameter in the
google / go-containerregistry`. Since it is a single directive there is not necessary to parse it. But I'm again worrying about moving complications of the definition for health checks from the rules code to users.
Let me know if my arguments make sense to you.
I don't see how --interval = 5s --timeout = 10s --retries = 3 CMD curl -sS 127.0.0.1
could possibly be confusing to any user wanting to use this, because that's the syntax Docker came up with and has clearly documented.
And it is definitely necessary to parse it, because the configuration for it is not a simple string. It is https://github.com/moby/moby/blob/46cdcd206c56172b95ba5c77b827a722dab426c5/api/types/container/config.go#L16-L35:
// HealthConfig holds configuration settings for the HEALTHCHECK feature.
type HealthConfig struct {
// Test is the test to perform to check that the container is healthy.
// An empty slice means to inherit the default.
// The options are:
// {} : inherit healthcheck
// {"NONE"} : disable healthcheck
// {"CMD", args...} : exec arguments directly
// {"CMD-SHELL", command} : run command with system's default shell
Test []string `json:",omitempty"`
// Zero means to inherit. Durations are expressed as integer nanoseconds.
Interval time.Duration `json:",omitempty"` // Interval is the time to wait between checks.
Timeout time.Duration `json:",omitempty"` // Timeout is the time to wait before considering the check to have hung.
StartPeriod time.Duration `json:",omitempty"` // The start period for the container to initialize before the retries starts to count down.
// Retries is the number of consecutive failures needed to consider a container as unhealthy.
// Zero means inherit.
Retries int `json:",omitempty"`
}
Part of the parser is here: https://github.com/moby/buildkit/blob/master/frontend/dockerfile/parser/line_parsers.go#L343-L369
But rules_docker has no dependency on moby library, it has a dependency on go-containerregistry, which has the same copy of the config: https://github.com/moby/buildkit/blob/master/frontend/dockerfile/parser/line_parsers.go#L343-L369.
Yes, as far as I know, there is no parsing logic in go-containerregistry to transform a string into that config. The parsing logic should be upstreamed there, in the same place that has the config. Once that logic is there, we could call parseHealthcheck
here and get the correct config.
Also, rules_docker should not strive to replicate every feature in docker or a Dockerfile. A noteable absence is the RUN
feature.
Ah, you meant about an unmarshaling line to the struct. Yeah. Makes sense. I'm not sure go-containerregistry
will approve this change though since it looks like they don't need such a directive there. I can try to prepare PR for go-containerregistry
for it but are we ready to add a health check directive here? I really just don't want to waste reviewers of go-containerregistry
time and my time if we don't want this feature at all.
Also, rules_docker should not strive to replicate every feature in docker or a Dockerfile. A noteable absence is the RUN feature.
If we can get a parsing function upstreamed and simplify this PR to add only a single healthcheck
attribute, I would support it.
@svagner will you continue this PR?
Hey @jonjohnsonjr what do you think about the proposed upstream change to go-containerregistry?
what do you think about the proposed upstream change to go-containerregistry?
That sounds reasonable to me, something like func ParseHealthConfig(s string) (*HealthConfig, error)
?
I'm happy to review a PR.
Any progress on this? Would be great to get this added!! If you want to use ecs with out elb/alb for internal services with service discovery the way they support health checks is through docker health check. Without this it becomes an issue.
@dgocoder I'm not sure if @svagner is still around to finish this work (by upstreaming the Go parts) or if it needs a new contributor to step up to finish.
hi, i'm going to take a crack at picking this up and upstreaming said changes. posting to shame myself into followthrough.
This Pull Request has been automatically marked as stale because it has not had any activity for 180 days. It will be closed if no further activity occurs in 30 days. Collaborators can add an assignee to keep this open indefinitely. Thanks for your contributions to rules_docker!
hey @pjjw, any luck? We would really love to see this merged. currently we need to have a base image created in a different build to include the healthcheck, quite far from the application definition... :/
no, have not plumbed this upstream yet, but this diff works for us as-is.
This Pull Request has been automatically marked as stale because it has not had any activity for 180 days. It will be closed if no further activity occurs in 30 days. Collaborators can add an assignee to keep this open indefinitely. Thanks for your contributions to rules_docker!
This PR was automatically closed because it went 30 days without a reply since it was labeled "Can Close?"