Docker Proof-of-Concept
NOTE: This is incomplete work. It's a proof-of-concept that is going well, but the actual behavior and schema will change until it reaches master, so avoid using this except for testing purposes.
Feedback is welcome on how you expect this to work 🙂.
Implementation
Credits to @no0dles for the first PoC on #285 and his tool hammerkit.
In his implementation, @no0dles used the official Docker client library for Go. That sounds reasonable, but I decided to try a different approach: I'm calling `docker run ...` directly. There are a couple of reasons this may be interesting:
- It requires no additional dependencies to the tool.
- It allows us to add a `flags` prop to the YAML so the user can set specific Docker flags (`flags: [--cpus=1, --pull=always]`). This would be harder to implement with the Go client, which would potentially require a separate prop for each setting. The list of available flags is here. A rough sketch of what this could look like follows below.
- The idea of avoiding libraries makes the implementation a bit more generic, opening room to potentially add support for other container systems, like Podman, in the future without too much additional work.
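To illustrate, here is a rough sketch of how a task might declare such flags. This is purely hypothetical: the `container` and `flags` keys do not exist in the current schema, and the image name is just an example.

```yaml
version: '3'

tasks:
  build:
    # Hypothetical props: `container` and `flags` are not part of the current schema.
    container:
      image: golang:1.21
      flags: [--cpus=1, --pull=always]
    cmds:
      - go build -v ./...
```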
TODO
- [ ] Add props for highly common settings.
  - Mounting directories?
  - Exposing ports?
  - We don't need to cover everything, since the generic `flags` prop also makes this possible.
- [ ] Think about how to handle the `dir` prop and its relation to the `mounts` prop.
- [ ] What to do about `method`/`sources`/`generates`? Try to run them inside the container? Run outside? Forbid usage when the task is a container?
  - We could maybe forbid usage in the first release and figure out the best behavior later.
- [ ] Dynamic variables are still being computed outside the container. Is that OK?
- [ ] Make the implementation more generic (change to an interface) to allow Podman/others in the future?
- [ ] Add a global `container` prop, so it's possible to make all tasks run in a given container?
  - Not sure how useful that would be. It's probably better to wait for feedback before doing this.
- [ ] Write tests
- These tests should be behind a Go flag so they won't run by default (but should ideally run on CI).
- [ ] Write documentation
- Usage
- API
- JSON Schema
Will this support docker-compatibles? (nerdctl, podman, etc...)
This looks awesome @andreynering !
From my perspective (albeit limited), if an early version mounted a volume of the local path and allowed passing some flags, that would be a great start already.
If you want to allow `flags`, it might be interesting to inherit some of the behavior of docker-compose, which has a YAML format for declaratively running containers: https://docs.docker.com/get-started/08_using_compose/. Some will overlap with the existing task args (e.g. `command`), but there's a very long list of configs we'd get for free: https://docs.docker.com/compose/compose-file/#ports. I'm familiar with projects that use docker-compose as a bad Taskfile.yaml.
Will this support docker-compatibles? (nerdctl, podman, etc...)
@endigma I haven't had any contact with Podman yet. I mentioned it because some people have asked for it in the past.
Is Podman's CLI compatible with Docker's? The way I implemented it, Task is just generating a `docker run [flags] [image] [command]` command and running it. If we could just replace `docker` with `podman`, that would be awesome. If we need a different syntax, I still believe it wouldn't be hard to adapt, though.
EDIT: The website says:
Simply put: `alias docker=podman`.
So yeah, seems that it would just work. Needs to be tested, though.
https://podman.io/
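For illustration, something in this spirit can already be approximated today by keeping the container binary in a variable. This is just a sketch; the `CONTAINER_BIN` name is made up and not an existing feature.

```yaml
version: '3'

vars:
  # Override per invocation, e.g. `task build CONTAINER_BIN=podman`
  CONTAINER_BIN: docker

tasks:
  build:
    cmds:
      - '{{.CONTAINER_BIN}} run --rm -v $(pwd):/src -w /src golang:1.21 go build -v ./...'
```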
Hi @andreynering
Thanks for the credits. Looking forward to seeing support for docker in taskfile.
Maybe I can give you some advice/recommendations, since I had a lot of findings during the development of hammerkit.
Use mounts for sources, volumes for generates.
- Mounts are slow on mac/windows and require a lot of CPU. It does not perform well on large directories (for example node_modules)
- Volumes perform very well on mac/windows since they stay in the docker vm and are not constantly synced.
- Files for volumes can be imported/exported, but that's an additional step before/after.
Keep file permissions in mind
- Permissions for mounts/volumes on mac/windows are no problem, but on Linux they are.
- On Linux, all operations made in the container write their files with the uid/gid of the user inside the container. Lots of Docker images use the root user, so all files exposed through mounts/volumes will have those uid/gid permissions set. The execution user can be changed via CLI/API, but the image may then not behave correctly, since permissions inside the container are missing.
- My solution was to pass the current uid/gid into the container and, for each mount, use a "chown" command to adjust the permissions inside the container before running the actual task commands (a rough illustration follows below).
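As a hand-rolled illustration of the same idea (not hammerkit's actual implementation; image and paths are made up): pass the host uid/gid into the container and chown the mounted directory so the host user ends up owning the produced files.

```yaml
version: '3'

tasks:
  build:
    cmds:
      # Pass host uid/gid in, run the build, then fix ownership of the mount.
      - >
        docker run --rm -v $(pwd):/work -w /work
        -e HOST_UID=$(id -u) -e HOST_GID=$(id -g)
        golang:1.21
        sh -c 'go build -v ./... && chown -R "$HOST_UID:$HOST_GID" /work'
```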
Github Actions on windows with docker is difficult
- I was able to create GitHub Actions that support integration tests on macOS/Linux. macOS is quite slow, but it works. For Windows I was not able to create a stable working solution.
I think the approach using the CLI can work. The only downside will be getting information back from the CLI: status reporting on longer tasks needs to read from stdout, or some background task needs to inspect the container.
What you should also consider is whether you want to create multiple "docker run" commands for a task, or create one container per task and then use "docker exec" for the list of commands. The latter will be way faster and saves some resources and prep/post work (see the sketch below).
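A rough sketch of the "one container per task, docker exec per command" variant, written as something a user could do by hand today (the container name and image are arbitrary):

```yaml
version: '3'

tasks:
  build:
    cmds:
      # Start one long-lived container for the whole task...
      - docker run -d --name task-build -v $(pwd):/work -w /work golang:1.21 sleep infinity
      # ...make sure it is removed when the task finishes...
      - defer: docker rm -f task-build
      # ...and run each command in it with `docker exec`.
      - docker exec task-build go vet ./...
      - docker exec task-build go build -v ./...
```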
Thanks for the tips @no0dles!
These are issues I didn't think I would face. Good reminder to test on different OSes as well.
My current workflow consists of calling Task from inside Dagger's Go SDK. Maybe Task could leverage that to transparently execute tasks in Docker instead? It already gives you a way to interact with the Docker daemon using Go, so why reinvent the wheel here? Or do I misunderstand the goals of this PR?
I'm looking forward to this one landing.
I decided to take a different approach, thoughts? https://github.com/go-task/task/pull/1448
We use Task extensively with containers. Generally the approach we have is:
- set some vars to hold the container image name and version. These might default to values from the environment.
- define a task that:
  - has the run command: `docker run --rm ...`
  - maps the current directory into the container (at the expected place, typically the workdir)
  - sets any required ports and environment variables (using task vars to manage defaults)
  - and sets the command that should be run in the container (again using task vars to manage defaults)
- use additional tasks to manage the additional requirements, such as logging into a container registry, or even building a container on demand. The existing task dependency mechanism works fine here. A sketch of this pattern follows below.
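A minimal sketch of that pattern; the image, registry, port, and env values are placeholders, not the actual setup described above.

```yaml
version: '3'

vars:
  # Override per invocation, e.g. `task build IMAGE=other TAG=1.2.3`
  IMAGE: my-builder
  TAG: latest

tasks:
  login:
    cmds:
      - docker login registry.example.com

  build:
    deps: [login]
    cmds:
      # Map the current directory to the expected workdir, expose a port,
      # set env vars, and run the build command inside the container.
      - >
        docker run --rm
        -v $(pwd):/workdir -w /workdir
        -p 8080:8080 -e BUILD_ENV=dev
        {{.IMAGE}}:{{.TAG}} go build -v ./...
```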
Since it works already ... I wonder, is it really necessary to support docker?
Workflows around containers can be remarkably complex. When you try to formalise that within Task, it's possible to end up with a "sometimes useful, sometimes not" feature.
If I would extend the schema, it might be like this:
tasks:
  build_minimal: # the minimal case, use the default container runtime (top level taskfile)
    container:
      image: foo:bar
      workdir: /working # it might be possible to extract this from the image (where it is normally set)
    cmds:
      - go build -v -i main.go

  build_extended: # the extended/full case
    container:
      runtime: docker # explicitly set to override the default.
      image: foo:bar
      workdir: /working # use the existing dir, but interpret as a map a:b ??
      env:
        foo: bar # use the existing task env ??
      ports:
        - 5000:5000
      volumes: {} # additional volumes
      args: [] # additional args for the container runtime
    cmds:
      - go build -v -i main.go

# there are lots of runtimes, and lots of options, so you have to specify them exactly
container:
  default: docker # the default runtime
  runtimes:
    docker:
      cmd: docker # default runtime command
      args: --rm # default args
      # other container runtime settings ...
but as it is, it looks like this:
tasks:
  build:
    cmds:
      - docker run -it --rm -v$(pwd):/workdir IMAGE:TAG go build -v -i main.go
Well, there might be something to it. For the minimal case, it's a bit easier to work with. Perhaps the problem could be generalised even further ... a generic "runtime", which could be containers (docker, podman, kubernetes), but also things like ssh, terminal or other command interfaces.
The minimal case, presented here, with a container runtime defined elsewhere ... does look like it would be easier to work with.
This is an example taskfile that shows how we use docker. You can see that docker itself is not the bulk of the work ...
It's also possible to define the docker command as a variable and expand it in certain circumstances (a typical use case is running task inside a container, where you don't need the docker command):
CMD: '{{if .CI}}{{.CMD}}{{else}}{{.DOCKER}}{{.CMD}}{{end}}'
Also, related to #1568, in that file you can see how we solve the skip problem:
if test -n '{{.FUNCTION_PATCH}}' || test -n '{{.SIGNAL_PATCH}}' ; then
  docker run --rm -v $(pwd):/sim {{.IMAGE}}:{{.TAG}} patch-network \
    --network {{.NETWORK_FILE}} \
    {{.FUNCTION_PATCH}} \
    {{.SIGNAL_PATCH}} \
    {{.REMOVE_UNKNOWN}} \
  ; else echo "patch-network skipped"; fi
... but perhaps more interesting is how we have to deal with optional arguments. Improving that might also be interesting.
a generic "runtime", which could be containers (docker, podman, kubernetes), but also things like ssh, terminal or other command interfaces
@trulede I lean towards this instead of integrating with specific tools. I think Task works better if the integration with tools is left to the user. Opening the door to integrations with tools is a slippery slope. It wouldn't be long before there is a feature request for an integration with everyone's favourite container tool or ssh client etc, and this is a maintenance nightmare. I haven't actually used docker myself for years now. I work with k8s and podman and I don't fancy adding direct support for all of these.
However, we should absolutely make it as easy as possible to integrate with other tools. There is an open issue (#448) that tracks the kind of functionality you're describing. I think this fulfils the spirit of the initial proposal here and I would personally vote to close this in favour of an implementation of that issue.
For example:
version: 3

tasks:
  # Basic docker example
  docker-stuff:
    runtime: "docker exec my-container"
    cmds:
      - some-command-inside-my-container
      - ...

  # Slightly more interesting example using a variable in the `runtime` clause
  kubernetes-stuff:
    vars:
      POD_NAME: service-1
    runtime: "kubectl exec $(kubectl get pod -o name | grep {{.POD_NAME}})"
    cmds:
      - some-command-inside-my-pod
      - ...

  # Slightly more interesting example using a variable in the `runtime` clause
  ssh-stuff:
    vars:
      USER: pete
      SERVER: my-server
    runtime: "ssh {{.USER}}@{{.SERVER}}"
    cmds:
      - some-command-on-my-server
      - ...
@pd93 Our use of Task has been positive for 4 reasons: dependency management, variable handling, unmodified shell execution and a schema that allows us to dynamically generate (and then execute) Taskfiles.
What I have discovered is that the more complex operations are beyond the understanding of most of our users. To deal with that, I have wrapped most complex operations in a Taskfile. I think that works better than the suggested 'runtime' because it encapsulates all the complexity, rather than splitting it across several places (even in the same file, novice users generally can't visualise the abstractions behind task, and come up with "interesting" outcomes).
That makes me wonder: is there another way to achieve both of these outcomes, 'task wrapping' and 'runtime', and perhaps also wrap some other concepts to create a session-like lifecycle for those commands to run/execute in?
Could we think of it this way: the commands of this task, or even a subset of those commands, should run in that context, and the context itself is also a task (i.e. specified by the task name). The context task, itself, runs all its commands to set up the context ... then runs the provided commands ... and then runs its own defer commands to clean up/close the context. For a long-running context (i.e. container/telnet etc.) the mechanism mentioned here (#1082) could be used to keep a context running between commands.
When defining such a context task, it might then be necessary to indicate if such a context should be per invocation, or shared (similar to how 'run' determines how often a task runs), but generally it keeps the existing pattern of use.
tasks:
  build:
    context: builder
    cmds:
      - go build -v -i main.go
      - go build -v -i bar.go

  build_mixed:
    cmds:
      - touch somefile.go
      - cmd: go build -v -i main.go
        context: builder_session
      - task: some_task

  builder:
    cmds:
      - start_context
      - defer: stop_context

  builder_session:
    cmds:
      - start_context
      - final: stop_context
Generally I agree. These runtimes, and the usage around them, are typically complex and specialised. A generalised approach would likely be more successful than attempting to support a specific tool.
Closing in favor of #448.
I started this PR as a proof-of-concept, without being sure if it was a good idea or not. I ended up not having enough time to return to this PR, as there were always smaller things with higher priority.
Considering that #448 would allow a wider range of use-cases, it looks much wiser to focus on that proposal instead of this one.
Thanks everyone for the discussion! 🙂