Replace EFS with S3

Open munishchouhan opened this issue 1 year ago • 20 comments

This PR will replace usage of the file system with S3.

munishchouhan avatar Jul 24 '24 23:07 munishchouhan

Getting this error while running the new buildkit image on k8s:

12:08PM WRN Error running in a new user namespace - fork/exec /usr/bin/fusion: invalid argument

12:08PM WRN cannot apply Nextflow profile :: unknown remote store for '' work prefix
/usr/bin/fusermount3: fuse device not found, try 'modprobe fuse' first
12:08PM FTL mount.go:251 > mounting filesystem error="fusermount exited with code 256\n"

munishchouhan avatar Aug 23 '24 12:08 munishchouhan

@fntlnz is the guru

pditommaso avatar Aug 23 '24 12:08 pditommaso

@munishchouhan something is off accessing the /dev/fuse device. How are you running this?

fntlnz avatar Aug 23 '24 12:08 fntlnz

Found, you need to add

--device /dev/fuse

https://github.com/seqeralabs/wave/blob/7751c8653b5daf77c149c1f5eae1d02d817278b9/src/main/groovy/io/seqera/wave/service/builder/DockerBuildStrategy.groovy#L105
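
For anyone hitting the same "fuse device not found" message, a quick check from inside the container can tell whether the device node is visible at all. This is only a minimal sketch and not part of Wave or Fusion: the function name check_fuse_device is hypothetical, and only the /dev/fuse path comes from this thread.

```shell
# Report whether a FUSE device node is usable from the current environment.
# ASSUMPTION: check_fuse_device is an illustrative helper, not an existing tool.
check_fuse_device() {
  local dev="${1:-/dev/fuse}"
  if [ ! -e "$dev" ]; then
    # Typical when the container was started without the device exposed
    echo "missing: $dev"
    return 1
  fi
  if [ ! -c "$dev" ]; then
    echo "not a character device: $dev"
    return 2
  fi
  if [ ! -r "$dev" ] || [ ! -w "$dev" ]; then
    echo "no read/write access: $dev"
    return 3
  fi
  echo "ok: $dev"
}
```

Running check_fuse_device with no argument inside the build container checks /dev/fuse. An "ok" result only shows that the node is present and accessible; the mount itself can still be refused for other reasons.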

fntlnz avatar Aug 23 '24 12:08 fntlnz

If this needs to be done in Kubernetes there are two possible solutions:

  • Start the container privileged and mount /dev/fuse inside of it
  • Use a device plugin, the one Nextflow uses is https://github.com/nextflow-io/k8s-fuse-plugin -> https://github.com/nextflow-io/nextflow/pull/4612
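
The device plugin option could look roughly like the sketch below, which prints a throwaway Pod manifest requesting one FUSE device. The resource name nextflow.io/fuse is my reading of the k8s-fuse-plugin README and should be verified there; the function name, pod name, and image are placeholders, and this is not how Wave builds its pod spec.

```shell
# Print a minimal Pod manifest that requests one FUSE device from the plugin.
# ASSUMPTIONS: the resource name nextflow.io/fuse is taken from the
# k8s-fuse-plugin README; fuse_pod_manifest, the pod name and the image
# are hypothetical placeholders.
fuse_pod_manifest() {
  local name="${1:-fuse-test}"
  local image="${2:-busybox}"
  cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: ${name}
spec:
  restartPolicy: Never
  containers:
    - name: main
      image: ${image}
      command: ["ls", "-l", "/dev/fuse"]
      resources:
        limits:
          nextflow.io/fuse: 1
EOF
}

# Usage (needs a cluster where the plugin DaemonSet is running):
#   fuse_pod_manifest fuse-test busybox | kubectl apply -f -
```

If the pod's log lists /dev/fuse, the plugin is handing the device to the container. Whether the process inside is then allowed to mount is a separate question.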

fntlnz avatar Aug 23 '24 12:08 fntlnz

Better to use the fuse plugin.

pditommaso avatar Aug 23 '24 13:08 pditommaso

I have started the DaemonSet:

(base) munish.chouhan@Munishs-MacBook-Pro ~ % kubectl get DaemonSets -n kube-system
NAME                           DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
fuse-device-plugin-daemonset   1         1         1       1            1           <none>          18m

And added the limit as per https://github.com/nextflow-io/k8s-fuse-plugin/blob/master/README.md#usage here: https://github.com/seqeralabs/wave/blob/88738ee06239d22ed17c3c7647d2fe6932f3b220/src/main/groovy/io/seqera/wave/service/k8s/K8sServiceImpl.groovy#L283

But still getting the same error

2:12PM WRN Error running in a new user namespace - fork/exec /usr/bin/fusion: invalid argument

2:12PM WRN cannot apply Nextflow profile :: unknown remote store for '' work prefix
/usr/bin/fusermount3: mount failed: Operation not permitted
2:12PM FTL mount.go:251 > mounting filesystem error="fusermount exited with code 256\n"
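
Note that the fusermount3 message changed between the two runs, from "fuse device not found" to "mount failed: Operation not permitted". A small sketch of how the two could be told apart when triaging logs follows. The labels and interpretations are assumptions rather than anything documented by Fusion: the first usually means the device node is not exposed to the container, the second that the node exists but the mount call was denied.

```shell
# Map a fusermount3 log line to a likely cause.
# ASSUMPTION: classify_fuse_error and its labels are illustrative only.
classify_fuse_error() {
  case "$1" in
    *"fuse device not found"*)
      # /dev/fuse is not visible inside the container
      echo "device-missing" ;;
    *"Operation not permitted"*)
      # device is present, but the process may not perform the mount
      echo "mount-denied" ;;
    *)
      echo "unknown" ;;
  esac
}
```

Applied to the logs in this thread, the first failure classifies as device-missing and the one after adding the plugin limit as mount-denied, which suggests the plugin did change something even though the build still fails.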

cc @jordeu

munishchouhan avatar Aug 23 '24 14:08 munishchouhan

From the invalid argument message I suspect that it's more related to an invalid Fusion command invocation. What is the full command line that is executed?

jordeu avatar Aug 27 '24 07:08 jordeu

or, can you share an example image that will be executed?

jordeu avatar Aug 27 '24 07:08 jordeu

> or, can you share an example image that will be executed?

I am using this example to test with wave-cli:

wave --conda-package bwa --wave-endpoint http://localhost:9090

munishchouhan avatar Aug 27 '24 07:08 munishchouhan

> From the invalid argument message I suspect that it's more related to an invalid Fusion command invocation. What is the full command line that is executed?

same command is working with docker, but i will debug further to check this

munishchouhan avatar Aug 27 '24 07:08 munishchouhan

> > From the invalid argument message I suspect that it's more related to an invalid Fusion command invocation. What is the full command line that is executed?

> same command is working with docker, but i will debug further to check this

But with docker, there is no need to clone the namespace, so Fusion behaves differently. If you can get the full command line that is executed I can check if there is something strange.

jordeu avatar Aug 27 '24 07:08 jordeu

> > > From the invalid argument message I suspect that it's more related to an invalid Fusion command invocation. What is the full command line that is executed?

> > same command is working with docker, but i will debug further to check this

> But with docker, there is no need to clone the namespace, so Fusion behaves differently. If you can get the full command line that is executed I can check if there is something strange.

do you want command line for docker or k8s?

munishchouhan avatar Aug 27 '24 07:08 munishchouhan

> do you want command line for docker or k8s?

It's better if it's the k8s, but they should be the same, so if it's easier for you, send me the docker one.

jordeu avatar Aug 27 '24 08:08 jordeu

Also, the environment variables are important. Send me the docker inspect and we can assume that it's the same on k8s.

jordeu avatar Aug 27 '24 08:08 jordeu

> It's better if it's the k8s, but they should be the same, so if it's easier for you, send me the docker one.

here is the docker command, I will work on getting you the k8s info

docker run --rm --privileged \
-e AWS_ACCESS_KEY_ID=<AWS_ACCESS_KEY_ID> \
-e AWS_SECRET_ACCESS_KEY=<AWS_SECRET_ACCESS_KEY> \
-e DOCKER_CONFIG=/fusion/s3/s3-bucket/workspace/9a3c69098bd07c17_1 \
--platform linux/amd64 cr.seqera.io/public/wave/buildkit:ef67f15426f36b72 \
buildctl-daemonless.sh build \
--frontend dockerfile.v0 \
--local dockerfile=/fusion/s3/s3-bucket/workspace/9a3c69098bd07c17_1 \
--opt filename=Containerfile \
--local context=/fusion/s3/s3-bucket/workspace/9a3c69098bd07c17_1/context \
--output type=image,name=docker.io/hrma017/dev:bwa--9a3c69098bd07c17,push=true,oci-mediatypes=true \
--opt platform=linux/amd64 \
--export-cache type=registry,image-manifest=true,ref=docker.io/hrma017/cache:9a3c69098bd07c17,mode=max,ignore-error=true,oci-mediatypes=true,compression=gzip,force-compression=false \
--import-cache type=registry,ref=docker.io/hrma017/cache:9a3c69098bd07c17

munishchouhan avatar Aug 27 '24 09:08 munishchouhan

> Also, the environment variables are important. Send me the docker inspect and we can assume that it's the same on k8s.

here is the inspect in file docker_inspect.txt

munishchouhan avatar Aug 27 '24 09:08 munishchouhan

What is this inspect file? In the inspect I see a command different than the one in the docker command line:

"Cmd": [
    "trivy",
    "--quiet",
    "image",
    "--timeout",
    "10m",
    "--format",
    "json",
    "--output",
    "/fusion/s3/s3-bucket/workspace/scan-9f342c61b284/report.json",
    "docker.io/hrma017/dev:bwa--9a3c69098bd07c17"
],

jordeu avatar Aug 27 '24 09:08 jordeu

@jordeu apologies, i shared the scan inspect, here is for the build docker_build_inspect.json

munishchouhan avatar Aug 27 '24 10:08 munishchouhan

hi @jordeu Any advice or pointers to move forward with k8s-fuse-plugin?

munishchouhan avatar Aug 29 '24 09:08 munishchouhan

closing in favour of https://github.com/seqeralabs/wave/pull/855

munishchouhan avatar Jun 10 '25 07:06 munishchouhan