tools icon indicating copy to clipboard operation
tools copied to clipboard

Issues/3466 Move away from Gitpod

Open JulianFlesch opened this issue 7 months ago • 13 comments

Description

Addresses the suggestions in issue 3466 to move away from gitpod and towards devcontainer for Github codespaces.

PR checklist

  • [x] This comment contains a description of changes (with reason)
  • [x] CHANGELOG.md is updated
  • [ ] If you've fixed a bug or added code that should be tested, add tests!
  • [ ] Documentation in docs is updated

JulianFlesch avatar May 14 '25 13:05 JulianFlesch

Note: This still uses (a newer) gitpod base image and thus also the gitpod user.

JulianFlesch avatar May 20 '25 10:05 JulianFlesch

Codecov Report

:white_check_mark: All modified and coverable lines are covered by tests. :white_check_mark: Project coverage is 77.74%. Comparing base (15aeedd) to head (285a2a1). :warning: Report is 174 commits behind head on dev.

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

:rocket: New features to boost your workflow:
  • :snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

codecov[bot] avatar May 20 '25 10:05 codecov[bot]

Changing this PR back to Draft to explore different base images:

This compares the Dockerfiles in nf_core/devcontainer/. Currently the PR is a work in progress set up to use the approach based on the devcontainer base image.

Size comparison

local name full base image size
nfcore/devcontainer-gitpod gitpod/workspace-base @ sha256:7f35e401405c38ebc1185dfc3d6c066f73a524e8b86641453ea4690cc5c24fb6 4.46GB
nfcore/devcontainer mcr.microsoft.com/ devcontainers/python : 3.11 3.34GB
nfcore/seqeracontainer community.wave.seqera.io/ library/nextflow_nf-test_apptainer_curl_pruned : eeb39fa4eeaed3f3 2.81GB

The micromamba base image used by seqera containers wins wrt size.

Docker comparison Apart from size the three approaches mainly differ in how docker is installed: In the seqeracontainer base image, docker-outside-of-docker is realized by manually installing via curl -sSL https://get.docker.com/ | sh, which requires mounting the host /var/run/docker.sock.

The gitpod base image takes care of installing docker for us and with the devcontainer python image I tried installing it as a feature. I wasn't able to make either of the latter docker-in-docker approaches actually work so far.

JulianFlesch avatar May 20 '25 18:05 JulianFlesch

Do you have an opinion on the base images and how we use docker @mashehu @mahesh-panchal ? Or should I just go with the smallest?

Also: Do we want / need an unprivileged user in the container or not?

JulianFlesch avatar May 23 '25 09:05 JulianFlesch

Ideally it should be as small as possible. I think the priority is to minimize how much maintenance we need to do our side so as long as the base images are going to be widely stable, then it's just a matter of what supports the stuff we need the best.

Does the base devcontainer image have anything special that the seqera base container doesn't? Do features still work well on the Seqera base container.

Ideally the user should be unprivileged. There should be no features we need that require privileged access, but I seem to recall it being a necessity for something. However we don't use the image to build images. It's mostly to manually run tests, and potentially run container images to investigate how to extract version strings or see the tool help text.

mahesh-panchal avatar May 23 '25 09:05 mahesh-panchal

Ideally it should be as small as possible. I think the priority is to minimize how much maintenance we need to do our side so as long as the base images are going to be widely stable, then it's just a matter of what supports the stuff we need the best.

The smallest one was build with Seqera containers, which uses the well supported micromamba image. (~280 GH Stars, 20 contributors, updates this week) But it's not much smaller than the devcontainers image ...

Ideally the user should be unprivileged. There should be no features we need that require privileged access, but I seem to recall it being a necessity for something. However we don't use the image to build images. It's mostly to manually run tests, and potentially run container images to investigate how to extract version strings or see the tool help text.

Probably it was for docker-in-docker, right? With the Seqera base image we would just need an unprivileged user and run docker-outside-of-docker using the host deamon. (Except being able to run Docker always means you're privileged 🙄 )

For your testing purposes would running docker-outside-of-docker defeat the purpose of having docker installed in the image in the first place @mahesh-panchal ?

JulianFlesch avatar May 23 '25 11:05 JulianFlesch

For your testing purposes would running docker-outside-of-docker defeat the purpose of having docker installed in the image in the first place @mahesh-panchal ?

The important thing is that it's seemless to the user. The user should be able to open Codespaces, or run the devcontainer locally and have access to docker. I have no idea of what the implications or running inside or outside the devcontainer image would be, or if they're important.

mahesh-panchal avatar May 23 '25 11:05 mahesh-panchal

Ok I will just refine the solution with devcontainers then and ping you for a review. For one I am not sure that docker-outside-of-docker will even work on codespaces and its definitely more work/responsibility for the user.

JulianFlesch avatar May 23 '25 11:05 JulianFlesch

Bit late to the party here, but note that we went through this process when switching to Devcontainers for the Nextflow Training. In the end we went all-in on the devcontainer spec and stopped thinking about base images entirely.

See the current Devcontainer spec: https://github.com/nextflow-io/training/blob/master/.devcontainer/codespaces-dev/devcontainer.json

  • Uses mcr.microsoft.com/devcontainers/base:ubuntu the simplest base Devcontainer image
  • Uses devcontainer features instead of Docker layers for stuff like Java, conda etc.
  • Uses local features (basically bash scripts) to install stuff
  • Bungs all of this into a pre-build container image for faster start-up times (workflow)

I'm not sure that we need anything in the devcontainer image that we haven't already sorted in the training image? We already have Nextflow, conda, nf-core/tools etc etc. The simplest approach would be to forget about image entirely and just use this devcontainer spec file with the training base image but mounting a different repo / workdir.

If the image being called "training" feels off then we could always move that. Feels more sensible to share effort on the setup there rather than reinventing the wheel though..

ewels avatar May 27 '25 09:05 ewels

@ewels Yep, devcontainer features are great! We are using those and docker-in-docker works like a charm on codespaces 🚀 (We are now using mcr.microsoft.com/devcontainers/python:3.11) I can add conda and nextflow as a feature as well, like it was done over at nextflow.io/testcontainers.

Just using the devcontainer from nextflow.io/training would not install the most recent version of tools after a merge, which is the purpose of this exercise, no?

But we can use their base image and reinstall tool with dev dependencies and --editable flag set?

JulianFlesch avatar May 27 '25 11:05 JulianFlesch

Our image for use with devcontainer is now based on ghcr.io/nextflow-io/training:latest to remove a lot of redundancy and hopefully maintenance effort. The setup.sh had to be added to make that base image work.

The devcontainer.Dockerfile sets up the image for use with the non-root user vscode, installs apptainer and most importantly copies the current state of the tools repository for installation.

If we don't need the unprivileged user (the tutorial image just uses root) and the repository can be mounted into the container, we could move those things to the setup.sh script and drop our own devcontainer image entirely.

JulianFlesch avatar May 28 '25 10:05 JulianFlesch

any update on this? I keep pointing people to codespaces and then the run into docker-in-docker issues...

mashehu avatar Jun 16 '25 12:06 mashehu

I was messing around using pixi in a container to install needed tools. Seems to work OK. I haven't got it fluid yet though.

https://github.com/mahesh-panchal/Nextflow_sandbox

Feel free to play around. Just FYI.

mahesh-panchal avatar Jun 26 '25 15:06 mahesh-panchal

Major Updates here:

  • After more discussion, we decided to move away from the vscode user and also use root. Most importantly because:
    • The vscode user already must already be privileged and can use sudo anyway. Also there are not really any security implications here
    • Running docker always needed extra steps
    • Setup becomes simpler and stable for different use cases
    • File permissions are handled for us anyways
    • ...
  • With this and by installing apptainer as feature (and fixing it in the parent image) we can entirely drop having our own Dockerfile and just rely on the parent image from nextflow-io/training
  • The docker-in-docker feature is also dropped in favor of docker-outside-of-docker from the parent container. This was interfering and causing headaches :(

I tested locally and on codespaces, but please try this out!

@mahesh-panchal With these changes I am trying to align our efforts with existing base images and build features on top, instead of having our own Dockerfile. What is your stance on that?

JulianFlesch avatar Jul 17 '25 08:07 JulianFlesch

ToDo is updating the workflows. @mashehu do we want to keep pushing our own image? (That would obviously remove redundancy e.g. in modules and the template) or is it acceptable to have people "build" the devcontainer (i.e. running the feature installs)?

JulianFlesch avatar Jul 17 '25 09:07 JulianFlesch

Nope, let's keep pushing our own

mashehu avatar Jul 17 '25 09:07 mashehu

I'm currently doing extensive testing to check that the Github workflows correctly build our devcontainers. This seems to go well for the most part, except I noticed the installation of tools in the parent image (ghcr.io/nextflow-io/training) interferes with our attempt to install specific versions.

These features cannot trivially be "undone" later afaik, so solve this I will drop this base image in favor of mcr.microsoft.com/devcontainers/base (ubuntu) but copy and adapt the local features they use for installing apptainer, conda, nextflow and docker from the nextflow training image. We can then have control over which tools version we want to install.

JulianFlesch avatar Jul 21 '25 13:07 JulianFlesch

Ready for review, if you have the time @mashehu @mahesh-panchal ! 🎉

Summary of latest changes:

  • A way to retry builds for the nf-Core/tools container was added due to nf-test installer download sometimes failing (issue)
  • Installation of nf-test in the devcontainer image was changed to conda, since conda is needed anyway (commit)
  • All images are now build for linux/amd64 and linux/arm64 architectures (this requires QEMU and buildx)
  • Most requirements are installed via (local) features now, with the exception of nf-core tools because it requires the current state of the repo
  • There are thus two devcontainer configs: everything in .devcontainer/build-devcontainer is for building an image with a specific version of tools and all features installed. The config ./devcontainer/devcontainer.json is used to run it in vscode / codespaces.

Failing Test

To make the failing test work, I assume something in template_features.yml needs to be changed?

JulianFlesch avatar Jul 22 '25 15:07 JulianFlesch

The new template file .devcontainer/setup.sh should be skipped by a template feature. We should remove the current gitpod feature and update the codespaces feature to add the new file, in nf_core/pipelines/create/template_features.yml

mirpedrol avatar Jul 28 '25 09:07 mirpedrol

@mirpedrol

The new template file .devcontainer/setup.sh should be skipped by a template feature. We should remove the current gitpod feature and update the codespaces feature to add the new file, in nf_core/pipelines/create/template_features.yml

I removed the gitpod feature from nf_core/pipelines/create/template_features.yml, but kept the new template file .devcontainer/setup.sh to make sure the jinja variable is rendered.

JulianFlesch avatar Aug 01 '25 13:08 JulianFlesch

Can you especially have a look at my changes to template_features.yaml (see d3a465b) if the new files need to be added anywhere else @mirpedrol ?

Also to fix the tests after these changes I updated the snapshots from file creation with --snapshot-update (see 6d8c9ba). Is this the right way to do this?

JulianFlesch avatar Aug 04 '25 14:08 JulianFlesch

Small comment regarding 3008aae: I removed the use of $localWorkspaceFolder (which was introduced to make docker with mounts from the host possible and easy), because it is not working in codespaces.

When running docker locally in the devcontainer (e.g. from vscode on a personal machine) this is now still possible with -v $LOCAL_WORKSPACE_FOLDER:$LOCAL_WORKSPACE_FOLDER but will require additional configuration when running nextflow pipelines or nf-test with the docker profile.

JulianFlesch avatar Aug 08 '25 12:08 JulianFlesch

Hi, I'm only just back from vacation. What should I test and review at the moment, and what has been tested?

mahesh-panchal avatar Aug 21 '25 10:08 mahesh-panchal

Welcome back and thanks for having a look @mahesh-panchal

This PR changed quite a bit, but this update still sums it up well: https://github.com/nf-core/tools/pull/3569#issuecomment-3103466521

The main things I could use feedback on:

  1. Are all required usecases covered by this devcontainer?
  2. ~~Do you agree with or have a strong opinion on the choice of base image? The latest change to using the miniconda image was due to different python version interfering in the final container, as discussed here~~ (EDIT: IMHO now certainly the most optimal way to install conda and the correct python version and all other requirements)
  3. ~~Do you have an opinion on the devcontainer features used and how they play with the Dockerfile? The decision to have a Dockerfile again was mainly to be able to copy the repo at the correct ref when running our Github workflows and install the intended state/version of tools.~~ (EDIT: Changed to minimal number of features)
  4. ~~Specifically the local nextflow feature. I will investigate this further, but I believe this adds quite some size to the container potentially due to changing in how we install nf-test (https://github.com/nf-core/tools/pull/3569/commits/09a1bb8e1a1253cd0d68573b751b79fcd5cddff8, https://github.com/nf-core/tools/pull/3569/commits/4c758baf53c4418697cf72542740ead8e746c900) which moved from using their install link to conda due to frequent download failures (https://github.com/nf-core/tools/issues/3684).~~ (EDIT: large size of nextflow/nf-test feature was adressed in most recent changes)
  5. Confirmation that the new build workflows (for dev and main) are functioning not just from my fork?

EDITED after 6c1596c changes

JulianFlesch avatar Aug 22 '25 14:08 JulianFlesch

Final changes to address the size of the resulting devcontainer

Problem 1: The features added a lot of size (roughly 300MB each. >1GB for the local nextflow feature!) Problem 2: Build dependencies / caches were not cleaned up well enough

Changes:

  • Nextflow, nf-test and apptainer is moved back to the Dockerfile
  • A two-stage build is used in the Dockerfile to improve the space usage.

Here are some numbers about the Dockerfile (.devcontainer/build-devcontainer/Dockerfile) based on manual builds:

  • Size of Dockerfile before improvements: 4.24GB
  • Size of Dockerfile with cleanups fce08cf: 3.86GB
  • Size of Dockerfile with nextflow/nf-test install 04c14dd: 5.54GB
  • Size of Dockerfile with nextflow/nf-test install and 2stage build 5fb202d: 4.62GB
  • Size of Dockerfile with nextflow/nf-test/apptainer install and 2stage build b4335b2: 4.82GB

And the resulting devcontainer (.devcontainer/build-devcontainer/devcontainer.json):

  • Devcontainer size before improvements: 7.82GB
  • Devcontainer size after 6c1596c: 5.21GB

  • Devcontainer size after 285a2a1: 4.11GB

JulianFlesch avatar Sep 01 '25 20:09 JulianFlesch

Good investigation. 4gb still seems a lot too me. But nothing we have to fix in this PR. But maybe make an issue about it.

mashehu avatar Sep 02 '25 06:09 mashehu

I got this error when opening this PR in Codespaces.


=================================================================================
2025-09-02 07:52:05.549Z: Configuration starting...
2025-09-02 07:52:05.560Z: Cloning...

=================================================================================
2025-09-02 07:52:06.549Z: Creating container...
2025-09-02 07:52:06.594Z: $ devcontainer up --id-label Type=codespaces --workspace-folder /var/lib/docker/codespacemount/workspace/nf-core-tools --mount type=bind,source=/.codespaces/agent/mount/cache,target=/vscode --user-data-folder /var/lib/docker/codespacemount/.persistedshare --container-data-folder .vscode-remote/data/Machine --container-system-data-folder /var/vscode-remote --log-level trace --log-format json --update-remote-user-uid-default never --mount-workspace-git-root false --omit-config-remote-env-from-metadata --skip-non-blocking-commands --skip-post-create --config "/var/lib/docker/codespacemount/workspace/nf-core-tools/.devcontainer/devcontainer.json" --override-config /root/.codespaces/shared/merged_devcontainer.json --default-user-env-probe loginInteractiveShell --container-session-data-folder /workspaces/.codespaces/.persistedshare/devcontainers-cli/cache --secrets-file /root/.codespaces/shared/user-secrets-envs.json
2025-09-02 07:52:06.726Z: @devcontainers/cli 0.80.0. Node.js v18.20.6. linux 6.8.0-1030-azure x64.
2025-09-02 07:52:09.211Z: Error: Command failed: docker inspect --type image nfcore/devcontainer:latest
2025-09-02 07:52:09.211Z: {"outcome":"error","message":"Command failed: docker inspect --type image nfcore/devcontainer:latest","description":"An error occurred setting up the container."}
2025-09-02 07:52:09.212Z:     at w6 (/.codespaces/agent/bin/node_modules/@devcontainers/cli/dist/spec-node/devContainersSpecCLI.js:467:1253)
2025-09-02 07:52:09.212Z:     at ax (/.codespaces/agent/bin/node_modules/@devcontainers/cli/dist/spec-node/devContainersSpecCLI.js:467:997)
2025-09-02 07:52:09.213Z:     at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
2025-09-02 07:52:09.213Z:     at async Y6 (/.codespaces/agent/bin/node_modules/@devcontainers/cli/dist/spec-node/devContainersSpecCLI.js:484:3842)
2025-09-02 07:52:09.215Z:     at async BC (/.codespaces/agent/bin/node_modules/@devcontainers/cli/dist/spec-node/devContainersSpecCLI.js:484:4957)
2025-09-02 07:52:09.215Z:     at async p7 (/.codespaces/agent/bin/node_modules/@devcontainers/cli/dist/spec-node/devContainersSpecCLI.js:665:202)
2025-09-02 07:52:09.215Z:     at async d7 (/.codespaces/agent/bin/node_modules/@devcontainers/cli/dist/spec-node/devContainersSpecCLI.js:664:14804)
2025-09-02 07:52:09.215Z:     at async /.codespaces/agent/bin/node_modules/@devcontainers/cli/dist/spec-node/devContainersSpecCLI.js:484:1188
2025-09-02 07:52:09.218Z: devcontainer process exited with exit code 1

====================================== ERROR ====================================
2025-09-02 07:52:09.221Z: Failed to create container.
=================================================================================
2025-09-02 07:52:09.221Z: Error: Command failed: docker inspect --type image nfcore/devcontainer:latest
2025-09-02 07:52:09.224Z: Error code: 1302 (UnifiedContainersErrorFatalCreatingContainer)

====================================== ERROR ====================================
2025-09-02 07:52:09.232Z: Container creation failed.
=================================================================================
2025-09-02 07:52:09.281Z: 

and it opened a recovery container instead.

mahesh-panchal avatar Sep 02 '25 07:09 mahesh-panchal

To test on the nf-core modules repo, which files are the important ones I should transfer to a branch?

mahesh-panchal avatar Sep 02 '25 08:09 mahesh-panchal

@mahesh-panchal The error is because the image is not yet pushed to Dockerhub. This will work once the first build action (after a pull to main) is run. For now, you have to run the devcontainer build command for .devcontainer/build-devcontainer/devcontainer.json manually:

$ devcontainer build --workspace-folder . --image-name nfcore/devcontainer:latest --config .devcontainer/build-devcontainer/devcontainer.json

The devcontainer .devcontainer/devcontainer.json can then be run locally. For testing in codespaces, you need to upload the manually built devcontainer to a registry and change the "image": declaration in the config .devcontainer/devcontainer.json.

EDIT: I built and pushed our base devcontainer image (.devcontainer/build-devcontainer/devcontainer.json) to a registry: fljulian/nfcore-devcontainer:latest. You can use this in .devcontainer/devcontainer.json for testing instead.

EDIT2: You can test this repo in codespaces here: https://github.com/JulianFlesch/nf-core-tools/tree/testing/issue-3266-gitpod based on the image mentioned above

JulianFlesch avatar Sep 02 '25 08:09 JulianFlesch

To test on the nf-core modules repo, which files are the important ones I should transfer to a branch?

The files .devcontainer/devcontainer.json and .devcontainer/setup.sh I would assume. Some testing was done for this issue (https://github.com/nf-core/modules/issues/8769) on an earlier version of this PR.

EDIT: Here is a draft PR: https://github.com/nf-core/modules/pull/8983

JulianFlesch avatar Sep 02 '25 08:09 JulianFlesch