kestra
Every script execution fails with DockerTaskRunner on Windows
Describe the issue
I am trying out Kestra, and every single script invocation fails. It seems the script itself is not accessible from the Docker runner container. Please see the logs.
The same problem occurs with scripts that access namespace files.
the flow in question
id: "script_in_venv"
namespace: "myteam"
tasks:
  - id: bash
    type: io.kestra.plugin.scripts.python.Commands
    inputFiles:
      main.py: |
        import requests
        from kestra import Kestra
        response = requests.get('https://google.com')
        print(response.status_code)
        Kestra.outputs({'status': response.status_code, 'text': response.text})
    beforeCommands:
      - python -m venv venv
      - . venv/bin/activate
      - pip install requests kestra > /dev/null
    commands:
      - python main.py
docker-compose.yml
volumes:
  postgres-data:
    driver: local
  kestra-data:
    driver: local
services:
  postgres:
    image: postgres
    volumes:
      - postgres-data:/var/lib/postgresql/data
    environment:
      POSTGRES_DB: kestra
      POSTGRES_USER: kestra
      POSTGRES_PASSWORD: k3str4
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -d $${POSTGRES_DB} -U $${POSTGRES_USER}"]
      interval: 30s
      timeout: 10s
      retries: 10
  kestra:
    image: kestra/kestra:v0.16.1-full
    pull_policy: always
    # Note that this is meant for development only. Refer to the documentation for production deployments of Kestra which runs without a root user.
    user: "root"
    command: server standalone --worker-thread=128
    volumes:
      - kestra-data:/app/storage
      - /var/run/docker.sock:/var/run/docker.sock
      - /tmp/kestra-wd:/tmp/kestra-wd
    environment:
      KESTRA_CONFIGURATION: |
        datasources:
          postgres:
            url: jdbc:postgresql://postgres:5432/kestra
            driverClassName: org.postgresql.Driver
            username: kestra
            password: k3str4
        kestra:
          plugins:
            configurations:
              - type: io.kestra.plugin.scripts.runner.docker.DockerTaskRunner
                values:
                  volume-enabled: true
          server:
            basic-auth:
              enabled: false
              username: "[email protected]" # it must be a valid email address
              password: kestra
          repository:
            type: postgres
          storage:
            type: local
            local:
              base-path: "/app/storage"
          queue:
            type: postgres
          tasks:
            tmp-dir:
              path: /tmp/kestra-wd/tmp
          url: http://localhost:8080/
    ports:
      - "8080:8080"
      - "8081:8081"
    depends_on:
      postgres:
        condition: service_started
Environment
- Kestra Version: v0.16.1-full
- Operating System (OS/Docker/Kubernetes): Docker on WSL 2
- Java Version (if you don't run kestra in Docker):
Interesting, I couldn't reproduce this on the latest version.
Can you try using kestra/kestra:latest-full:
docker run --pull=always --rm -it -p 28080:8080 --user=root -v /var/run/docker.sock:/var/run/docker.sock -v /tmp:/tmp kestra/kestra:latest-full server local
It could be some Docker issue. On Windows, it might be worth trying using Docker Desktop instead of WSL
Hi @anna-geller, thanks. I used the latest version with the same results. Then I tried 0.16.1, then the dev snapshot, every time with the same issue. By the way, I use Docker Desktop with WSL 2 and docker compose. So I will try your direct run command.
Sorry, but starting Kestra without the custom docker-compose config, using
docker run --pull=always --rm -it -p 28080:8080 --user=root -v /var/run/docker.sock:/var/run/docker.sock -v /tmp:/tmp kestra/kestra:latest-full server local
and running the test flow from above gives the same failing result.
I tried again and removed the /tmp mount completely, thinking it was a permissions issue. Without success. Now I'm going to deploy Kestra in a live environment and see if I can do it.
Thanks so much for reporting. My colleague could reproduce it using the same setup as you, so you didn't do anything wrong. We'll look at it in more detail and keep you updated on the issue.
@paulgrainger85 I think it's not WSL-related, since I deployed with the above settings in our company's Docker infra and had the same issues. Everything works well except script flows.
@gitmonster, is your infra Linux-based? Can you share how it's deployed (docker compose or anything else)?
Hey @tchiotludo, yes it's
- linux based on ubuntu-server:20.04
- swarm environment
- deployed with docker compose
- kestra version is pinned to 0.16.1
here is the config
services:
  ####################################################
  postgres:
    image: postgres:16.2
    deploy:
      mode: replicated
      replicas: 1
      placement:
        constraints: [ node.labels.name == ${HOST_NAME} ]
    volumes:
      - postgres-data:/var/lib/postgresql/data
    networks:
      - net
    environment:
      POSTGRES_DB: ${KESTRA_POSTGRES_DB}
      POSTGRES_USER: ${KESTRA_POSTGRES_USER}
      POSTGRES_PASSWORD: ${KESTRA_POSTGRES_PASSWORD}
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -d $${KESTRA_POSTGRES_DB} -U $${KESTRA_POSTGRES_USER}"]
      interval: 30s
      timeout: 10s
      retries: 10
  ####################################################
  kestra:
    image: ${IMAGE_NAME_KESTRA_WITH_CONFIG}
    deploy:
      mode: replicated
      replicas: 1
      placement:
        constraints: [ node.labels.name == ${HOST_NAME} ]
      labels:
        - traefik.enable=true
        - traefik.docker.network=public
        - traefik.http.routers.kestra.rule=Host(`${KESTRA_DOMAIN}`)
        - traefik.http.routers.kestra.entrypoints=websecure
        - traefik.http.routers.kestra.tls.certresolver=le
        - traefik.http.routers.kestra.service=kestra
        - traefik.http.services.kestra.loadbalancer.server.port=${KESTRA_PORT}
        - traefik.http.routers.kestra.middlewares=authelia@docker
        - traefik.constraint-label=traefik-public
    # Note that this is meant for development only. Refer to the documentation for production deployments of Kestra which runs without a root user.
    user: "root"
    command: server standalone --worker-thread=128
    networks:
      - public
      - base_net
      - net
    volumes:
      - kestra-data:/app/storage
      - /var/run/docker.sock:/var/run/docker.sock
      - /tmp:/tmp/kestra-wd
    environment:
      KESTRA_CONFIGURATION: |
        datasources:
          postgres:
            url: jdbc:postgresql://postgres:5432/${KESTRA_POSTGRES_DB}
            driverClassName: org.postgresql.Driver
            username: ${KESTRA_POSTGRES_USER}
            password: ${KESTRA_POSTGRES_PASSWORD}
        kestra:
          repository:
            type: postgres
          storage:
            type: local
            local:
              base-path: "/app/storage"
          queue:
            type: postgres
          url: ${KESTRA_PUBLIC_URL}
####################################################
networks:
  base_net:
    external: true
  public:
    external: true
  net:
    driver: overlay
####################################################
volumes:
  postgres-data:
    driver: local
  kestra-data:
    driver: local
@gitmonster: is the swarm running on multiple nodes?
yes, but as you can see, all kestra related deployment is constrained to a single host.
So it's not only on Windows as I was told.
Can you try with something very simple, like a Shell task that tries to write to a file?
id: old-shell
namespace: myteam
tasks:
  - id: shell
    type: io.kestra.plugin.scripts.shell.Commands
    commands:
      - echo "Hello" > hello.txt
      - cat hello.txt
No, it's not Windows-only. As you can see, I started out on ubuntu:20.04/windows/WSL2, then deployed to a pure Linux environment. This is working:
And if you call pwd, is the working directory correctly created?
id: old-shell
namespace: myteam
tasks:
  - id: shell
    type: io.kestra.plugin.scripts.shell.Commands
    commands:
      - pwd
yes.
I tried on Docker Swarm and it works, so it should be some permission issue.
I notice that you mount /tmp:/tmp/kestra-wd. Mounting /tmp can be a bit blunt, as there can be a lot of things trying to write to it.
Can you mount /tmp/kestra-wd:/tmp/kestra-wd instead, and create the dir prior to launching the stack to be sure it exists?
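The "create the dir prior to launching the stack" step could look like the sketch below. It assumes the host path /tmp/kestra-wd from the volume mapping and the tmp-dir path /tmp/kestra-wd/tmp from the Kestra configuration posted earlier; KESTRA_WD is a hypothetical variable introduced here only so the path can be overridden. On Swarm in particular, a bind mount whose source directory is missing tends to make the service fail to start rather than being created automatically, so creating it first removes one variable.

```shell
# Host-side working directory (assumption: same path as in the volume mapping above).
KESTRA_WD="${KESTRA_WD:-/tmp/kestra-wd}"

# Create the directory and the tmp/ subfolder referenced by kestra.tasks.tmp-dir.path,
# so the bind-mount source exists before the stack starts.
mkdir -p "$KESTRA_WD/tmp"

# Show ownership and permissions for a quick sanity check.
ls -ld "$KESTRA_WD" "$KESTRA_WD/tmp"

# Then start the stack as usual, for example:
#   docker compose up -d
```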
@loicmathieu I tried the /tmp/kestra-wd:/tmp/kestra-wd setting yesterday but couldn't deploy the service at all, because the /tmp/kestra-wd folder could not be created on the host.
Now I have redeployed successfully with /tmp/kestra-wd:/tmp/kestra-wd, but the
id: "script_in_venv"
namespace: "myteam"
tasks:
  - id: bash
    type: io.kestra.plugin.scripts.python.Commands
    inputFiles:
      main.py: |
        import requests
        from kestra import Kestra
        response = requests.get('https://google.com')
        print(response.status_code)
        Kestra.outputs({'status': response.status_code, 'text': response.text})
    beforeCommands:
      - python -m venv venv
      - . venv/bin/activate
      - pip install requests kestra > /dev/null
    commands:
      - python main.py
flow gives the same failing result:
with /tmp/kestra-wd existing on the host:
The /tmp/5aAT9X6DiSJDb26ugbZQ1v folder from the last invocation exists on the host, but /tmp/5aAT9X6DiSJDb26ugbZQ1v/main.py does not:
The same here. It runs nicely, but if you change the runner of the second task to DOCKER, the test.go file is missing in the workspace folder:
id: golangtest
namespace: myteam
description: Test golang scripts
inputs:
  - id: greeting
    type: STRING
    defaults: "kestra.io from inputs"
tasks:
  - id: retrieve_go_version
    type: io.kestra.plugin.scripts.shell.Commands
    docker:
      image: golang:alpine3.19
    runner: DOCKER
    commands:
      - go version
  - id: test_go_script_simple
    type: io.kestra.plugin.scripts.shell.Commands
    docker:
      image: golang:alpine3.19
    runner: PROCESS
    warningOnStdErr: false
    inputFiles:
      test.go: |
        package main
        import(
          "fmt"
          "github.com/fatih/color"
        )
        func main(){
          fmt.Printf("hello %s\n", "{{ inputs.greeting }}")
          color.Blue("Hey %s from golang", "kestra")
        }
    beforeCommands:
      - go mod init github.com/kestra/test
      - go mod tidy
    commands:
      - pwd
      - ls -la .
      - go run test.go
runner: PROCESS
runner: DOCKER
@gitmonster thanks for the additional feedback. The issue seems to be that the Docker runner is unable to create the input files in the temporary folder. Now that we have narrowed that down, I need to reproduce it on a Windows box to see if we can fix it.
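One thing that may be worth ruling out here (an assumption, not a confirmed diagnosis): Kestra talks to the host's Docker daemon through the mounted /var/run/docker.sock, so any bind mount it requests for a task container is resolved against the host filesystem, not against the Kestra container's. If the working directory is mounted at a different path inside the Kestra container than on the host, the task container may end up with a directory that lacks the input files. The helper below, same_path_mount, is a hypothetical name used only for illustration; it checks whether a host:container volume spec uses the same path on both sides.

```shell
# Returns 0 when the host path and the container path of a volume spec are identical.
# Accepts "host:container" or "host:container:options".
same_path_mount() {
  spec="$1"
  host="${spec%%:*}"        # everything before the first colon
  rest="${spec#*:}"         # everything after the first colon
  container="${rest%%:*}"   # drop trailing options such as ":ro"
  [ "$host" = "$container" ]
}

# Volume specs taken from the configurations posted in this thread.
for spec in \
  "/tmp:/tmp/kestra-wd" \
  "/tmp/kestra-wd:/tmp/kestra-wd" \
  "/mnt/storage-pool/appdata/kestra/tmp/kestra-wd:/tmp/kestra-wd"; do
  if same_path_mount "$spec"; then
    echo "same path on both sides: $spec"
  else
    echo "paths differ: $spec"
  fi
done
```

A spec that reports "paths differ" is not necessarily wrong, but it is a candidate to change when input files go missing only with the Docker runner.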
@loicmathieu all the tests I did lately were not on Windows but on an ubuntu:20.04 server, so this error is not Windows-related, and the title you gave to this issue is somewhat misleading.
@gitmonster I understood that, but unless you can give me the steps to setup the same env you use (with Docker Swarm if I understand it), the only easy way to reproduce is to use a Windows box. We cannot reproduce it in any of our Linux environments and I cannot reproduce it locally with Docker Swarm so this must be linked to your exact setup.
@loicmathieu understood, this is the config I used: https://github.com/kestra-io/kestra/issues/3590#issuecomment-2071712327. I will try a fresh local install on another machine with this config.
No luck. I started the default docker-compose.yml from the repo with a simple docker compose up and tested the flow. I'll give up on this.
Here is my Docker version:
@gitmonster on which OS exactly?
You originally reported it on Windows with WSL, and we are able to confirm it didn't work on Windows.
If it also occurs on a Linux machine, please run uname -a so we know the exact distribution and version on which it didn't work.
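To gather those details in one go, something like the following could be run on the affected machine. collect_env_info is a hypothetical helper name; the commands it wraps (uname -a, reading /etc/os-release, docker version) are standard, but /etc/os-release may be absent on some systems, which the sketch tolerates.

```shell
collect_env_info() {
  # Kernel name, release, and architecture.
  echo "kernel: $(uname -a)"

  # Distribution name, read in a subshell so the sourced variables do not leak.
  if [ -r /etc/os-release ]; then
    ( . /etc/os-release; echo "distro: ${PRETTY_NAME:-unknown}" )
  else
    echo "distro: unknown (/etc/os-release not readable)"
  fi

  # Docker client and server versions, if the CLI is installed and the daemon answers.
  if command -v docker >/dev/null 2>&1; then
    docker version --format 'docker: client={{.Client.Version}} server={{.Server.Version}}' 2>/dev/null \
      || echo "docker: CLI found, but the daemon did not answer"
  else
    echo "docker: CLI not found"
  fi
}

collect_env_info
```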
No, currently it's another win10/wsl2 machine:
Linux 5.15.146.1-microsoft-standard-WSL2 #1 SMP Thu Jan 11 04:09:03 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
But the same happens to me on a pure ubuntu:20.04:
Linux 5.4.0-166-generic #183-Ubuntu SMP Mon Oct 2 11:28:33 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Linux manager01 5.15.0-102-generic #112-Ubuntu SMP Tue Mar 5 16:50:32 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Same issue with docker swarm.
version: '3.8'
x-default-opts:
  &default-opts
  logging:
    options:
      max-size: "10m"
networks:
  traefik-proxy:
    external: true
services:
  app:
    image: kestra/kestra:latest-full
    # Note that this is meant for development only. Refer to the documentation for production deployments of Kestra which runs without a root user.
    networks:
      - traefik-proxy
    user: "root"
    command: server standalone --worker-thread=128
    volumes:
      - /mnt/storage-pool/appdata/kestra/kestra-data:/app/storage
      - /var/run/docker.sock:/var/run/docker.sock
      - /mnt/storage-pool/appdata/kestra/tmp/kestra-wd:/tmp/kestra-wd
    environment:
      KESTRA_CONFIGURATION: |
        datasources:
          postgres:
            url: jdbc:postgresql://data01.xxxx.net/kestra_db
            driverClassName: org.postgresql.Driver
            username: kestradb_user
            password: xxxxxxx
        kestra:
          repository:
            type: postgres
          storage:
            type: local
            local:
              base-path: "/app/storage"
          queue:
            type: postgres
          tasks:
            tmp-dir:
              path: /tmp/kestra-wd/tmp
          url: https://kestra.xxxxxx.net/
    deploy:
      mode: replicated
      replicas: 1
      placement:
        constraints:
          - node.role == worker
      # Container resources (replace with yours)
      resources:
        limits:
          cpus: '1.55'
          memory: 2G
        reservations:
          cpus: '1.35'
          memory: 512M
      labels:
        - "traefik.enable=true"
        - "traefik.http.routers.kestra.rule=Host(`kestra.xxxx.net`)"
        - "traefik.http.routers.kestra.service=kestra"
        - "traefik.http.routers.kestra.entrypoints=https"
        - "traefik.http.services.kestra.loadbalancer.server.port=8080"
        - "traefik.http.routers.kestra.tls=true"
        #- "traefik.http.services.kestra.loadbalancer.passhostheader=true"
        - "traefik.http.routers.kestra.middlewares=chain-authentik@file"
Hi @gitmonster
Can you try using a different image than the default one:
id: "script_in_venv"
namespace: "myteam"
tasks:
  - id: bash
    type: io.kestra.plugin.scripts.python.Commands
    docker:
      image: python
    inputFiles:
      main.py: |
        import requests
        from kestra import Kestra
        response = requests.get('https://google.com')
        print(response.status_code)
        Kestra.outputs({'status': response.status_code, 'text': response.text})
    beforeCommands:
      - python -m venv venv
      - . venv/bin/activate
      - pip install requests kestra > /dev/null
    commands:
      - python main.py
@loicmathieu here is the result:
Can you try:
services:
  app:
    # ...
    volumes:
      - /mnt/storage-pool/appdata/kestra/kestra-data:/app/storage
      - /var/run/docker.sock:/var/run/docker.sock
      - /mnt/storage-pool/appdata/kestra/tmp/kestra-wd:/mnt/storage-pool/appdata/kestra/tmp/kestra-wd
    environment:
      KESTRA_CONFIGURATION: |
        datasources:
          # ...
        kestra:
          # ...
          tasks:
            tmp-dir:
              path: /mnt/storage-pool/appdata/kestra/tmp/kestra-wd/tmp
          url: https://kestra.xxxxxx.net/
    deploy: