buildkit
buildkit copied to clipboard
PROPOSAL - Dockerfile Linting
Enhance the Dockerfile frontend to support evaluating a Dockerfile for common issues and violations of best practices. We believe making people aware of these issues as soon as possible will save a lot of time investigating and help them learn how to write better Dockerfiles.
We see many people struggling to create a well-structured Dockerfile that is maintainable and they commonly run into issues because they are not following the best practices for writing Dockerfiles, such as:
- Consistent casing for instructions, stage names, and flags
- The correct use for relative WORKDIR
- Using secrets in ARGs
- Use of undeclared ARG in FROM
- and many more
The Dockerfile frontend is a great place to support evaluating these rules because it understands the structure of the Dockerfile and inherited base images. This update would support evaluating rules during and catching any issues during CI, as well as support dry-run evaluation in the local environment.
This initial proposal would have two components:
- Defining an agreed list of rules and their severity
- Implementing the evaluation logic during a build and as a dry run without fully executing a build.
Here is a draft PR for the proposal
Please add your thoughts on the proposal, including suggestions or additional functionality you would like to see.
Initial Proposed Rules List
This is a list of initial Dockerfile rules.
1. stage_ordering
Stages should be correctly ordered.
2. format_empty_continuation_line_disallowed
Empty continuation line disallowed. Reduces readability and may lead to unexpected behavior.
3. instruction_casing
Instructions should be written in consistent casing. Improves readability
4. flag_casing
Flags should be written in lowercase. Improves readability
5. stage_casing
Stage names should written in lowercase. Improves readability
6. workdir_relative_path
Relative WORKDIR must be preceded by an absolute WORKDIR. A relative WORKDIR can have unexpected results if not first defining an absolute WORKDIR and base image changes.
7. arg_secrets_disallowed
Passing secrets via ARGs is not allowed It’s not recommended to use credentials in arguments because they are visible in the docker history.
8. arg_undeclared
Usage of undeclared ARG in FROM
9. arg_invalid_reference
Invalid ARG reference. All FROM with default ARG values should make valid references.
10. arg_invalid_reference_typo
Invalid ARG reference. Did you mean <ARG>?
Unused arg value passed that looks like a typo of defined arg
11. entrypoint_json_args_required
JSON arguments are required for CMD| ENTRYPOINT unless using the custom shell.
Shell form prevents OS signals from being correctly passed to the executables.
12. cmd_json_args_required
JSON arguments required for CMD
| ENTRYPOINT
, unless using custom shell.
Shell form prevents OS signals from being correctly passed to the executables.
13. multiple_heathcheck_disallowed
Multiple HEALTHCHECK instructions are disallowed. Avoid multiple definitions
14. multiple_cmd_disallowed
Multiple CMD instructions are disallowed. Avoid multiple definitions
15. multiple_entrypoint_disallowed
Multiple ENTRYPOINT instructions are disallowed. Avoid multiple definitions
16. stage_self_link_disallowed
Linking to current stage name is disallowed
17. stage_name_duplicate_disallowed
Duplicate stage name <STAGE NAME>. Stage names should be unique.
18. stage_reserve_name_disallowed
Reserved stage name "scratch," "context.”
19. from_platform_flag_const_disallowed
--platform flag should not be used with a constant value unless the stage name also contains os or arch
20. from_platform_default_disallowed
Unnecessary --platform=$TARGETPLATFORM flag, this is the default behavior.
As pointed out by @tianon https://github.com/docker/cli/pull/2743#issuecomment-2035815299 , we have 2 "deprecated-not-really-deprecated" Dockerfile features in https://github.com/docker/cli/blob/master/docs/deprecated.md that should also get a linter warning.
https://github.com/docker/cli/blob/v26.0.0/docs/deprecated.md#dockerfile-legacy-env-name-value-syntax https://github.com/docker/cli/blob/v26.0.0/docs/deprecated.md#maintainer-in-dockerfile
Encourage using build stages, and using scratch images for build target stages? Ref: #4834 regarding a requirement for designating build target stages within Dockerfile
- arg_invalid_reference_typo
Idea:
Detect if any declared vars (ARG or ENV) are too similarly named that it makes the code more difficult to follow
- cmd_json_args_required
Idea:
JSON arguments should be required for RUN when no variables are used in the RUN command.
Encourage using build stages, and using scratch images for build target stages?
This way too opinionated and app-specific. Nothing wrong in using FROM scratch
but also nothing wrong in using a base image, and in many cases using base image is the right or only possible/correct solution.
JSON arguments should be required for RUN when no variables are used in the RUN command.
Why?
Based on some content in https://github.com/moby/buildkit/issues/4834 , I'd propose a rule that detects if there is a unreachable stage that is not named and not used in any of the --from
flags by index. Not sure if there are many such Dockerfiles out there, but if there are we could detect them.
This way too opinionated and app-specific.
It's an encouragement, so IMHO it would be like an INFO level of importance. Not sure if you've considered having different priorities/severities/levels of findings, but many general purpose linting tools tend to have some sort of system for grouping findings by importance. If that were to be the case here, I'd consider it valid to have it at a low importance that people could easily ignore.
Why?
If a run command is short and takes no variables and doesn't depend on a shell in any way, why require the shell to be called up by Buildkit to run the command in the layer? Simple commands like useradd someuser
where the user is not a variable; those commands don't need to be called by a shell, so why call it through the shell? It's simple enough to change it from RUN useradd someuser
to RUN ["/usr/bin/useradd", "someuser"]
and save a shell invocation. This may also be important if there isn't a shell in the given layer (nor mounted in for use) where the RUN
command is being used.
@colinhemmings
Can you modify this to reflect the rules in v0.14 and create a new issue with any carry-over?
@daghack and/or @colinhemmings to create new issues for rules that were not implemented (follow-on for the https://github.com/moby/buildkit/issues/4823#issuecomment-2152891007 comment)