
Investigate usefulness of creating a scivision docker image

Open edwardchalstrey1 opened this issue 2 years ago • 3 comments

@acocac noticed, with a scivision plugin package he is creating, that because detectron2 is a requirement, he needed things like PyTorch and torchvision pre-installed for it to work correctly.
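To make the problem concrete, here is a minimal sketch of a pre-install check a plugin could run. The function name `missing_prerequisites` and the `PREREQUISITES` list are hypothetical, made up for illustration; only `torch` and `torchvision` come from the report above, and none of this is part of scivision.

```python
from importlib.util import find_spec


def missing_prerequisites(modules):
    """Return the top-level module names that cannot be found in this environment."""
    # find_spec locates a module without importing it, and returns None if it is absent.
    return [name for name in modules if find_spec(name) is None]


# Assumption: the prerequisites of a detectron2-based plugin, as named in the report above.
PREREQUISITES = ["torch", "torchvision"]

if __name__ == "__main__":
    missing = missing_prerequisites(PREREQUISITES)
    if missing:
        print("Install these first: " + ", ".join(missing))
    else:
        print("Prerequisites found; the plugin can be installed.")
```

This only checks that the modules are present. It says nothing about whether the installed versions or the CUDA build are the ones detectron2 needs.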

I wonder if, in the long run, we should also offer a scivision Docker image alongside the Python package. It would include the Python package, plus useful things like CUDA and PyTorch that are difficult to install.

@ots22 I would be interested to hear your opinion on this

edwardchalstrey1, Dec 10 '21 15:12

Can see that something like this might be useful - here are some thoughts!

  • Scivision models are separate from 'core' scivision, and can be completely arbitrary. A scivision image (or images) would have to make choices about which libraries to include, and their versions, which would necessarily not support every model!

  • The problem seems to be that we require a scivision model to be pip installable, but this might not be enough for system dependencies. Normally the model code would be responsible for describing its dependencies, but this responsibility is taken over by core scivision.

  • How would it fit with the scivision workflow we have in mind? It seems like we want scivision to let people quickly try out a number of models. Supporting working outside a container seems fairly crucial to me (even if we do offer one or more of them).

  • Related to the above: Are there any guarantees we can offer a user or a model author, if they use the image? Is there a way for a model to indicate that it is compatible with an image? (Something for the catalogue?)

  • There are many projects maintaining data science Docker images, and libraries often provide their own: we should make sure we build on these, to avoid taking on a big maintenance burden.

  • GPUs can still be hard, even with containers!

  • Consider Singularity containers as well/instead, for better HPC support?

  • Deploying the front-end or examples showcase could be a good reason to have one (and it's also clear what belongs inside)

ots22, Jan 13 '22 11:01

> Scivision models are separate from 'core' scivision, and can be completely arbitrary. A scivision image (or images) would have to make choices about which libraries to include, and their versions, which would necessarily not support every model!

Yesterday we discussed making scivision less model-agnostic, and instead offering an interface to a number of popular frameworks for model authors to build on. We could make sure these are supported by an image, at a minimum.

@miquelmassot

ots22, Jan 13 '22 11:01

TL;DR: A Docker image with a pip install instruction is not a big deal. If someone needs one, we can expect more background knowledge from them. Target novice users first with a very easy to use Python package that is pip-installable and interfaced to the most common AI frameworks and network topologies.

I would not reinvent the wheel. There are already many existing Docker images with PyTorch or TensorFlow (and GPU support) that can be used as a base image to install scivision into. These Docker images can (normally!) be easily converted to Singularity, and this would not force anyone to use Singularity if they're not familiar with it.

A well-developed Python package is, by itself, a self-contained declaration of requirements, and thus very easy to install via pip. From my point of view, that's what an entry-level user would do: open their terminal, pip install scivision and try running some AI / ML models.

The most popular AI models should be easy to use and easy to interface, and I would push onto their developers the role of actually packaging their software well. Take detectron2, for example: if scivision interfaces it, that's enough. I would expect a user to be able to pip install both scivision and detectron2 to get started. If a brand new AI package comes along, it's not scivision's job to package it, TBH.

If users require Docker images, Singularity, HPC... that's a completely different target and base knowledge. I would expect more background knowledge in Python and HPC from them.
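As a rough illustration of the base-image idea, here is a small sketch that generates a two-line Dockerfile which pip-installs scivision on top of an existing image. The helper `render_dockerfile` is hypothetical, and `pytorch/pytorch:latest` is only an assumed choice of base image; in practice you would pin a specific tag that matches your CUDA setup.

```python
def render_dockerfile(base_image, packages):
    """Build Dockerfile text that pip-installs `packages` on top of `base_image`."""
    lines = [
        f"FROM {base_image}",
        # --no-cache-dir keeps pip's download cache out of the image layer.
        "RUN pip install --no-cache-dir " + " ".join(packages),
    ]
    return "\n".join(lines) + "\n"


# Assumption: the PyTorch image from Docker Hub as the base; pin a specific tag in real use.
print(render_dockerfile("pytorch/pytorch:latest", ["scivision"]))
```

The point is how little the image adds beyond the pip install: the hard parts (CUDA, PyTorch) are inherited from the upstream image rather than maintained by scivision.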

miquelmassot, Jan 13 '22 20:01