[ENH] - Include hardware requirements in the conda project file
I'm thinking about what it would be like to run an app on Nebari, a batteries-included JupyterHub distribution designed to run on Kubernetes or Slurm. In that use case, hardware requirements would need to be specified alongside the dependencies in the conda environment for the apps to run. Thinking about where that might fit into the conda project file, I think hardware requirements could be handled similarly to the conda environments. Below is an example of the requirements I could see being useful for an ML-focused conda project.
```yaml
name: extended-conda-project

hardware_profiles:
  default:
    gpu:
      memory: 4Gi
    cpu: 2000m
    memory: 8Gi
    storage: 10Gi
  training:
    gpu:
      memory: 16Gi
    cpu: 4000m
    memory: 16Gi
    storage: 10Gi
  inference:
    gpu:
      memory: 4Gi
    cpu: 2000m
    memory: 8Gi

environments:
  default:
    - environment.yml
  training:
    - training-environment.yml
  validation:
    - validation-environment.yml

variables:
  FOO: bar

commands:
  # Setup command
  setup:
    cmd: python download_datasets.py
  # Regular commands
  train_model:
    cmd: python train.py
    environment: default
    hardware_profile: training
  inference:
    cmd: python inference.py
    environment: default
    hardware_profile: inference
```
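To make the intent concrete, here is a rough sketch (with the project file already loaded into a dict; the `resolve_profile` helper and the fallback-to-`default` behavior are purely illustrative assumptions, not anything conda-project does today) of how a deployment tool might look up the hardware profile for a command:

```python
# Hypothetical sketch: resolving a command's hardware profile from a
# parsed conda-project.yml. The dict mirrors the proposed file above.
project = {
    "hardware_profiles": {
        "default": {"cpu": "2000m", "memory": "8Gi"},
        "training": {"cpu": "4000m", "memory": "16Gi", "gpu": {"memory": "16Gi"}},
    },
    "commands": {
        "train_model": {"cmd": "python train.py", "hardware_profile": "training"},
        "inference": {"cmd": "python inference.py"},
    },
}

def resolve_profile(project, command):
    """Return the hardware profile for a command, falling back to 'default'."""
    spec = project["commands"][command]
    name = spec.get("hardware_profile", "default")
    return project["hardware_profiles"][name]

print(resolve_profile(project, "train_model"))  # the 'training' profile
print(resolve_profile(project, "inference"))    # falls back to 'default'
```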
Would conda-project consider supporting hardware requirements or is that out of scope?
That sounds very useful. Is there something you'd like conda-project itself to do about it? I imagine that Nebari would parse this info and take special action on or before `conda project run <cmd>`.
I had done something similar as an experiment with anaconda-project, where I was able to take the project yaml files (with hardware configurations as you have done), make a docker image, build a helm chart (from a template), and install/deploy it for a defined command. Would that also be useful?
> That sounds very useful. Is there something you'd like conda-project itself to do about it? I imagine that Nebari would parse this info and take special action on or before `conda project run <cmd>`
Yes, Nebari (or other tools) would be parsing it prior to `conda project run <cmd>`. I can't think of anything we'd need conda-project to do with the info. I guess it could run some validation against the current hardware and raise a warning if the requirements aren't met, but I can't think of much else at the moment. It could potentially be useful to pass the hardware profile info into an action somehow, but that's speculation more than something I need for my immediate use case.
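The validation idea could be as small as parsing the Kubernetes-style quantities and comparing them to the local machine. A minimal stdlib-only sketch (the `parse_memory`/`parse_cpu`/`check_profile` names are assumptions, not a proposed API):

```python
import os
import re
import warnings

def parse_memory(s):
    """Convert a Kubernetes-style quantity like '8Gi' to bytes."""
    units = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}
    m = re.fullmatch(r"(\d+)(Ki|Mi|Gi|Ti)?", s)
    return int(m.group(1)) * units.get(m.group(2), 1)

def parse_cpu(s):
    """Convert '2000m' (millicores) or '2' to a float core count."""
    return int(s[:-1]) / 1000 if s.endswith("m") else float(s)

def check_profile(profile):
    """Warn if the current machine appears not to meet the profile."""
    if parse_cpu(profile["cpu"]) > (os.cpu_count() or 0):
        warnings.warn("profile requests more CPUs than are available")
    # A memory check would go here; total RAM isn't exposed portably
    # by the standard library, so a dependency like psutil would help.

check_profile({"cpu": "2000m", "memory": "8Gi"})
```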
> I had done something similar as an experiment with anaconda-project, where I was able to take the project yaml files (with hardware configurations as you have done), make a docker image, build a helm chart (from a template), and install/deploy it for a defined command. Would that also be useful?
That reminds me of MLflow Projects. Off topic, but maybe passing parameters to the commands could be useful as well. Maybe that's how you could pass the hardware spec to a command if needed, like I suggested above.
As far as the docker containers go, I don't think the docker-image-building functionality would be needed in the Nebari use case, but it might be useful for others.
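One speculative way to pass the hardware spec into a command without re-parsing the project file would be to flatten the selected profile into environment variables before launching it. A sketch, where the `HARDWARE_*` variable names and the profile dict are purely illustrative:

```python
import os
import subprocess
import sys

# Hypothetical: flatten the selected hardware profile into environment
# variables so the launched command can read them directly.
profile = {"cpu": "4000m", "memory": "16Gi"}

env = dict(os.environ)
for key, value in profile.items():
    env[f"HARDWARE_{key.upper()}"] = value

# Stand-in for launching `conda project run train_model`; here the child
# process just prints one of the injected variables.
subprocess.run(
    [sys.executable, "-c", "import os; print(os.environ['HARDWARE_CPU'])"],
    env=env,
    check=True,
)
```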