bioconductor_docker

Singularity image for bioconductor 3.20

Open reshu23 opened this issue 1 year ago • 12 comments

Dear Team, I work on an offline cluster. The best option for me to use Bioconductor packages is to download a Singularity image containing the collection of Bioconductor packages and transfer it to my offline cluster. Docker is not supported on the cluster. Your instructions say: "You can find the Singularity containers collection on this link https://singularity-hub.org/collections/3955."

At that link, the Singularity container was last updated in 2021. Can you please suggest where I can download a Singularity image for the latest release of the Bioconductor collection? Thank you.

reshu23 avatar Nov 25 '24 05:11 reshu23

Hey @reshu23, thank you for contacting us. Unfortunately, we have not built Singularity images in a while: the person who started that project has since left the team, we have not seen consistent demand for Singularity images so far, and we do not have a good way to test images in a restricted environment. Things are a bit busy right now, but I can try to allocate some effort to get the Singularity build stack working in early 2025, especially if you are willing to test the resulting images to make sure they work as expected in a more restricted HPC environment like yours.

almahmoud avatar Nov 25 '24 19:11 almahmoud

@reshu23 It should be possible for you to build a Singularity container on your system, based on the public Dockerfiles. I did this some years ago. I will try to reconstruct the steps and will post them, probably next week.
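One way this could look, as a minimal sketch rather than the exact steps referred to above: on an internet-connected machine with Apptainer installed, convert the public Docker image into a single SIF file, then copy that file to the offline cluster. The image tag `RELEASE_3_20`, the output file name, and the helper name `build_bioc_sif` are assumptions for illustration.

```shell
# Sketch only: run on a machine with internet access and Apptainer installed.
# The default tag and output file name are assumptions, not values from this thread.
build_bioc_sif() {
    local tag="${1:-RELEASE_3_20}"
    local out="${2:-bioconductor_${tag}.sif}"
    # Convert the public Docker image into a single SIF file.
    apptainer build "$out" "docker://bioconductor/bioconductor_docker:${tag}"
}

# Example (uncomment to run), then transfer the .sif file to the cluster:
# build_bioc_sif RELEASE_3_20 bioconductor_3.20.sif
```

The resulting SIF file is an ordinary file, so it can be moved with whatever transfer mechanism the cluster allows.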

vjcitn avatar Nov 25 '24 20:11 vjcitn

Thank you very much. That will be really helpful. I am willing to test the Singularity images on my secure system once they are available.

reshu23 avatar Nov 26 '24 05:11 reshu23

Hi @reshu23. Sorry for the delayed response. I found some time yesterday and today, and built an image for 3.20, based on our rstudio 3.20 image, to test. It's publicly available via:

apptainer remote add --no-login SylabsCloud cloud.sylabs.io
apptainer remote use SylabsCloud
singularity pull --arch amd64 library://almahmoud/almahmoud/bioconductor:3.20

I would tremendously appreciate you testing in your environment if possible, in order to get as many data points as possible about whether/where it works!

almahmoud avatar Apr 01 '25 17:04 almahmoud

Thank you. Sure, I will test it.

reshu23 avatar Apr 02 '25 04:04 reshu23

Hi, I ran singularity exec bioconductor3.20 rstudio, but it is not working. I also tried the R terminal, but I cannot find any packages. Here are the screenshots. Thank you.

Image

Image

reshu23 avatar Apr 02 '25 14:04 reshu23

Hi @reshu23. rstudio is not an entrypoint, so unlike the convenience script, I don't think exec rstudio works in general. I believe it would be /init if you wanted to launch rstudio. I am glad you were able to get the container running with R. I don't think the expectation that all packages are pre-installed is correct; you still need to run BiocManager::install to get non-base packages.

almahmoud avatar Apr 02 '25 15:04 almahmoud

Thank you. I do not understand how BiocManager::install is supposed to work on an offline system. That is why I was expecting all packages to be inside the container. I tried it inside the container; please check the screenshot. Nothing progresses after this point.

Image

reshu23 avatar Apr 08 '25 04:04 reshu23

Another point: can you please give me an example of launching rstudio within the container? I am fairly sure I am not using the correct steps. Thank you. Here is the screenshot:

Image

reshu23 avatar Apr 08 '25 04:04 reshu23

Hello @reshu23 , thank you for the follow-up. Sorry for not re-reading the original post properly, I had not internalized the "offline" part.

I think there is a misunderstanding of what the docker/singularity containers represent. We do not offer any container that has all packages built in, as that would be a huge container, would require all CRAN dependencies also built-in, not just Bioconductor packages, and would overall be hard to maintain (essentially rebuilding a huge container any time any Bioconductor package or CRAN dependency gets updated).

I think it's doable from a technical perspective, but not something we can make available for general use and maintain long-term as a community feature request (or at least that is my opinion, open to pushback from other developers).

The containers as they are (both the docker image and the apptainer/singularity image derived from it) have all system dependencies pre-installed, so that binary installations of packages can happen without any system-level installation. This doesn't mean that all packages are available by default, but that the container can readily install pre-compiled packages and avoid the delays or compatibility errors that compilation can cause.

My recommendation, likely for your sysadmin or IT department, would be to mirror/clone our Bioconductor (and likely at least partially CRAN's) binary repository within the cluster, and make it available to all cluster users. This would mean that while users run within specific containers, they install packages from a centralized repository, maintained securely in the cluster. This could then be exposed at an internal endpoint behind the firewall/VPN, which could be used by all users of the cluster as a target repository for installations. One could even imagine a full shared (read-only) R library with all packages already uncompressed, if a shared filesystem is readily available. If this is of interest, we could potentially offer some support to facilitate a successful setup.
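As a rough illustration of the client side of such a setup (a sketch under stated assumptions, not a tested recipe): once an internal mirror exists, each user's R session can be pointed at it through an R profile. The helper name `write_offline_rprofile` and the example URLs are hypothetical. `BioC_mirror` and `repos` are standard R options. `BIOCONDUCTOR_ONLINE_VERSION_DIAGNOSIS` is, to my understanding, the option BiocManager documents for skipping its online version check, so verify it against the BiocManager documentation for your version. The mirror is also assumed to preserve the directory layout of the upstream repositories.

```shell
# Sketch only: writes an R profile that points installs at internal mirrors.
# The mirror URLs passed in are hypothetical placeholders for your cluster.
write_offline_rprofile() {
    local bioc_mirror="$1"               # e.g. https://bioc.cluster.example
    local cran_mirror="$2"               # e.g. https://cran.cluster.example
    local target="${3:-$HOME/.Rprofile}" # file to (over)write
    cat > "$target" <<EOF
# Point installs at the internal mirrors instead of the public internet.
options(
  BioC_mirror = "${bioc_mirror}",
  repos = c(CRAN = "${cran_mirror}"),
  # Assumption: skips BiocManager's online version check, which cannot succeed offline.
  BIOCONDUCTOR_ONLINE_VERSION_DIAGNOSIS = FALSE
)
EOF
}

# Example (uncomment to run; overwrites the target file):
# write_offline_rprofile https://bioc.cluster.example https://cran.cluster.example
```

With a profile like this in place, BiocManager::install would be expected to resolve packages against the internal endpoints rather than bioconductor.org.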

Regarding the rstudio init error, using /init as the entrypoint is indeed the correct way, so it seems to be correctly delegating to the s6 init layer. The remaining issue appears to be that you're running in an extremely restricted environment, where even the ephemeral storage mount used by the running container ends up on a read-only filesystem. RStudio needs to be able to write files to start up, so I believe the source of the error is your host filesystem.
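To narrow down a read-only filesystem problem like this, it may help to probe which directories are actually writable from inside the running container. The following is a minimal sketch; the helper name `check_writable` is hypothetical, and which directories RStudio needs on a given system is an assumption to verify.

```shell
# Sketch only: report whether a directory accepts new files.
check_writable() {
    local dir="$1" probe
    # Try to create (and then remove) a temporary probe file in the directory.
    if probe="$(mktemp "${dir}/.write_probe.XXXXXX" 2>/dev/null)"; then
        rm -f "$probe"
        echo "writable: $dir"
    else
        echo "NOT writable: $dir"
        return 1
    fi
}

# Example (uncomment to run inside the container):
# check_writable /tmp
# check_writable "$HOME"
```

If the container's own filesystem turns out to be the read-only part, the `--writable-tmpfs` flag of Apptainer/Singularity, which adds a temporary in-memory writable overlay, may be worth trying where the cluster permits it.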

I'm sorry I can't be of more help, but I hope this clarifies my best interpretation of the issues, hoping it helps guide you towards a solution.

almahmoud avatar Apr 08 '25 19:04 almahmoud

Thank you very much. Now, everything is clear to me.

reshu23 avatar Apr 09 '25 11:04 reshu23

OK to close this issue?

vjcitn avatar Oct 28 '25 09:10 vjcitn

Fwiw, we now have (alpha) versions of apptainer containers building, and initial testing has been done with apptainer run --fakeroot --writable-tmpfs oras://ghcr.io/bioconductor/bioconductor-apptainer:devel for the rstudio container or apptainer exec --fakeroot --writable-tmpfs oras://ghcr.io/bioconductor/r-ver-apptainer:devel R for the smaller R shell containers. I encourage anyone interested in singularity/apptainer to test the new containers and provide any feedback to improve them in a new issue!
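For an offline cluster, running directly from an oras:// URI is not possible, so one option is to pull the image to a local SIF file on a connected machine and transfer it with a checksum. This is a sketch under assumptions: the helper name `pull_for_offline` and the output file name are hypothetical, and `sha256sum` is assumed to be available on both machines.

```shell
# Sketch only: pull an image to a local SIF file and record its checksum.
pull_for_offline() {
    local uri="$1" out="$2"
    # Download the image into a single file that can be copied to the cluster.
    apptainer pull "$out" "$uri" || return 1
    (
        # Store the checksum with a relative file name so it verifies after transfer.
        cd "$(dirname "$out")" || exit 1
        sha256sum "$(basename "$out")" > "$(basename "$out").sha256"
    )
}

# Example (uncomment to run on the connected machine):
# pull_for_offline oras://ghcr.io/bioconductor/r-ver-apptainer:devel r-ver-devel.sif
# On the cluster, after copying both files into one directory:
# sha256sum -c r-ver-devel.sif.sha256
```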

almahmoud avatar Nov 22 '25 17:11 almahmoud