The VOLUME instruction documentation is inaccurate/incomplete
Is this a docs issue?
- [x] My issue is about the documentation content or website
Type of issue
Information is incorrect
Description
The documentation on the VOLUME command indicates that it only takes one argument. However, almost all usages I've seen show it taking two arguments. The documentation needs to show this. Furthermore, the linked documentation to https://docs.docker.com/engine/storage/volumes/ only shows command-line or compose syntax. I cannot find where it maps this syntax to the Dockerfile VOLUME syntax.
Location
https://docs.docker.com/reference/dockerfile/
Suggestion
The documentation needs to show the typical usage of VOLUME with at least two arguments (a host path and a container mount point). If there is a mapping to the --mount command-line argument, this needs to be explicit.
> I cannot find where it maps this syntax to the Dockerfile VOLUME syntax.
It doesn't?
The VOLUME instruction defines a path in the container to attach an anonymous volume to.
I've never seen bind mounts defined in a Dockerfile via VOLUME, only when using other instructions that have mount options (which apply only during the image build). Could you please share a reference link to where you're seeing the usage you're speaking of?
Many official containers are known to use VOLUME, but it's mostly legacy in purpose and often causes various drawbacks. You'll find I've raised issues on official images, such as those for several databases (along with this central one), as I'm quite against usage of VOLUME and see it as an anti-pattern for images. Containers should opt in explicitly to persistence, given the caveats that previous link cites.
Perhaps you've had a misunderstanding of the feature? You do not need to declare VOLUME to attach volumes to a container. You can choose from anonymous, named, and bind mount volumes to use with your containers. These are configured via the CLI or compose.yaml; VOLUME has no relevance/association to the --volume or --mount CLI options, but you can provide an explicit mount over the same path declared by VOLUME, and your explicit mount will take precedence.
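A minimal sketch of that precedence (the image and volume names here are made up for illustration):

```dockerfile
# Dockerfile: VOLUME only declares a container path; no host path can be given here
FROM alpine
VOLUME /data
```

```console
# No mount given: Docker creates a fresh anonymous volume for /data
docker run --rm example-image

# An explicit mount over the same path takes precedence over the VOLUME declaration
docker run --rm -v example-named-volume:/data example-image
```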
What you appear to be expecting is that an image could define a bind mount path from the container to an explicit location on the host filesystem? That would be a security problem. Users run containers to have isolation, not to have containers mess unexpectedly with the host filesystem.
What does happen, though, is that when a new container instance is created, VOLUME creates a new anonymous volume and copies any data at that container mount path to the anonymous volume storage on the host system. The exception is Docker Compose, which has extra logic to preserve the same anonymous volume (bound to the service rather than a specific image/container); that can be rather surprising when troubleshooting an image otherwise assumed to be immutable.
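You can observe that directly; for example with the official postgres image, which declares a VOLUME (container names and the password are placeholders):

```console
# Each plain run of an image that declares a VOLUME creates a new anonymous volume
docker run -d --name test1 -e POSTGRES_PASSWORD=example postgres
docker volume ls   # a new volume with a long random hash as its name

docker run -d --name test2 -e POSTGRES_PASSWORD=example postgres
docker volume ls   # a second anonymous volume appears
```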
> What you appear to be expecting is that an image could define a bind mount path from the container to an explicit location on the host filesystem? That would be a security problem. Users run containers to have isolation, not to have containers mess unexpectedly with the host filesystem.
Yes, I think I've completely misunderstood containers. They were sold to me as lightweight disposable virtual machines that run off the host's kernel, designed to contain programs and configurations in isolation from the host operating system. It was sold to me that I could program in how I wanted to configure them, so I wouldn't have to re-run the same configuration scripts each time I started a new container (in contrast to, say, installing a full virtual machine and having to run through the entire installation and configuration each time I made a new one). Even better still, the build process would be cached and reused so, if I modified a line in my configuration, it would only rebuild the sections of my image that my change affected.
Specifically, I'm trying to set up a WordPress development environment without installing and configuring Apache, MariaDB, and WordPress on my host system. I thought I could set up a container to run when I want to work on WordPress and shut it off when I don't. I also thought I could configure a container to mount and share a folder from the host so, when I was ready to deploy my WordPress site, I could copy the wp-content folder from my host to a production server.
However, I'm gathering from your response and the volumes of other information I've read that I'm using the wrong tool for the job. Hacks I've found that try to coerce containers into making such a development environment, which prompted my question, use `-v "$PWD/html":/var/www/html` switches to mount local folders into the container.
If I understand you correctly, you're saying that's both poor practice and unsupported in the VOLUME instruction. If so, I suggest clarifying the documentation on VOLUME to make it clear that there's no analogue to the -v command-line switch, so newcomers like me aren't wasting hours trying to figure out how to create that functionality in a Dockerfile.
> However, I'm gathering from your response and the volumes of other information I've read that I'm using the wrong tool for the job.
I don't think so.
- An image can provide the bulk of the software and config you want to run as a container.
- There is a boundary to the host, such as volumes for persistence; these are distinct from the container instanced from an image. Such configuration is best deferred to a tool like Docker Compose, which uses YAML to specify which image to use, volumes to attach, ports to publish (and to which networks on the system; defaults to all), etc.
The latter is container agnostic, whereas the image encodes everything else specific to the image. You'll still need to bridge that configuration to the host system when you, as a user, want to persist data at a specific location on that host system (otherwise use named or anonymous volumes), and likewise for the port mapping (since more than one image might run a web service on port 80, and only one service at a time can listen on a given host port, such flexibility is needed). Consider the image agnostic/generic for sharing, whereas the other container config matters at runtime and varies by the user's needs, which should not matter for the image to run correctly. A rough compose.yaml sketch for your WordPress case follows below.
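Something along these lines (service names, image tags, ports, and credentials are all placeholders to adjust):

```yaml
# compose.yaml - a sketch, not a hardened config
services:
  wordpress:
    image: wordpress:latest
    ports:
      - "8080:80"   # host port 8080 -> container port 80
    volumes:
      # bind mount so wp-content lives on the host, ready to copy to production
      - ./wp-content:/var/www/html/wp-content
    environment:
      WORDPRESS_DB_HOST: db
      WORDPRESS_DB_NAME: wordpress
      WORDPRESS_DB_USER: wordpress
      WORDPRESS_DB_PASSWORD: example

  db:
    image: mariadb:latest
    environment:
      MARIADB_DATABASE: wordpress
      MARIADB_USER: wordpress
      MARIADB_PASSWORD: example
      MARIADB_ROOT_PASSWORD: example
    volumes:
      - db-data:/var/lib/mysql   # named volume for database persistence

volumes:
  db-data:
```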
> If I understand you correctly, you're saying that's both poor practice and unsupported in the VOLUME instruction.
As I've said, I'm not aware of anyone using VOLUME to map a container path to a host filesystem path. You've said you've seen such usage numerous times, but are unable to provide any references to it?
I've not attempted to do such myself, but to my knowledge such syntax does not exist for VOLUME.
You create a container to run with Docker CLI / Docker Compose and that volume mapping is provided via CLI or compose.yaml.
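For example, the mapping you were after is expressed at container creation (the image tag here is illustrative):

```console
# The bind mount is given when the container is created, not in the Dockerfile
docker run -d -p 8080:80 -v "$PWD/html":/var/www/html php:apache
```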
> If so, I suggest clarifying the documentation on VOLUME to make it clear that there's no analogue to the -v command-line switch, so newcomers like me aren't wasting hours trying to figure out how to create that functionality in a Dockerfile.
I'm a tad confused. You want explicit documentation about the absence of support, rather than inferring that there is no support when no explicit documentation claims there to be? That would not be very practical.
I'd first question how you came about the misunderstanding; perhaps you were engaging with an AI tool that suggested such? It's not uncommon for those to hallucinate and confidently insist that invalid syntax is valid.
Generally when someone is new to containers I direct them to using existing images and Docker Compose, and not to bother with custom images via a Dockerfile until they're familiar and comfortable with the basics. For example, using the official NodeJS or PHP images as-is with volume mounts to run software can work perfectly fine for development.
Custom images are not always necessary; they're just helpful for encapsulating a full artifact for deployment. You can just as well have a common image that multiple containers are created from, with different volume mounts for persisting their state, and source from a git clone. A sketch of that as-is usage follows below.
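Something like this (the tag, paths, port, and entrypoint script are illustrative):

```console
# Run a dev server straight from the official image; no custom Dockerfile needed
docker run --rm -it \
  -v "$PWD":/usr/src/app \
  -w /usr/src/app \
  -p 3000:3000 \
  node:20 node server.js
```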
> They were sold to me as lightweight disposable virtual machines that run off the host's kernel, designed to contain programs and configurations in isolation from the host operating system.
>
> It was sold to me that I could program in how I wanted to configure them, so I wouldn't have to re-run the same configuration scripts each time I started a new container (in contrast to, say, installing a full virtual machine and having to run through the entire installation and configuration each time I made a new one).
That's basically what you do get.
You could achieve much the same with a VM; I often used XFS as my host filesystem for its reflinks feature, which allowed me to copy VM disk images cheaply: I'd get a full copy with only new modifications written to disk. With Ventoy I was able to boot into those on bare metal too, from a decent-quality USB SSD.
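For reference, a reflink copy on XFS looks like this (file names are illustrative):

```console
# Shares data blocks with the original; only new writes consume extra disk space
cp --reflink=always base-vm.qcow2 clone-vm.qcow2
```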
The convenience with Docker is similar to a package manager, someone else can do the packaging for you and if needed you can extend it from there with your own changes.
> Even better still, the build process would be cached and reused so, if I modified a line in my configuration, it would only rebuild sections of my image that my change affected.
Not entirely.
A Dockerfile does have a layer cache, but if a layer becomes invalidated, that invalidates all subsequent layers in that stage (assuming multi-stage). The cache is not smart enough to know whether your change actually affects subsequent layers (such as those created via RUN), so those layers are invalidated even when they seem unchanged/unaffected to you, and their cache isn't reusable. Ordering instructions from least to most frequently changed works with the cache rather than against it, as sketched below.
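(A generic Node example; file names and the entrypoint are illustrative.)

```dockerfile
FROM node:20
WORKDIR /usr/src/app

# Dependency manifests change rarely: this COPY and the install below
# stay cached until package.json or package-lock.json change
COPY package.json package-lock.json ./
RUN npm ci

# Source changes only invalidate the layers from here down
COPY . .
CMD ["node", "server.js"]
```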
A workaround for that in a Dockerfile is to leverage cache mounts explicitly, such as via RUN --mount .... In CI systems this is not part of the layer cache, so you may find such a cache is lost across runs. A cache mount can persist across layer invalidations/builds, or even be shared across multiple separate images being built. Its usefulness will depend on what you're doing in the Dockerfile. It is effectively a temporary mounted volume; there's no special Docker logic involved in its caching ability. Instead you use it as you would on a host system, pointing whatever commands are involved at cacheable directories of their own, which may also be content you'd not want to persist into the final image, only relevant during the build process.
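A sketch of that, following the common apt cache-mount pattern (the base image and package choice are just examples):

```dockerfile
# syntax=docker/dockerfile:1
FROM debian:bookworm

# Keep downloaded packages so the cache mounts below have something to reuse
RUN rm -f /etc/apt/apt.conf.d/docker-clean && \
    echo 'Binary::apt::APT::Keep-Downloaded-Packages "true";' > /etc/apt/apt.conf.d/keep-cache

# These cache directories persist across layer invalidations and rebuilds,
# and are not baked into the final image
RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
    --mount=type=cache,target=/var/lib/apt,sharing=locked \
    apt-get update && apt-get install -y --no-install-recommends curl
```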