Faisal Anees
- Warn users before submitting a large number of jobs - For experiments running a large number of parallel jobs, execute the jobs gradually, check for accuracy and terminate earlier...
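A minimal sketch of what this request could look like, assuming jobs are plain dictionaries and that a caller-supplied `run_batch` function returns one accuracy per job. The names `WARN_THRESHOLD`, `needs_confirmation`, `run_gradually`, and `run_batch` are hypothetical and are not part of hydra's API.

```python
from typing import Callable, List, Sequence

# Hypothetical limit above which the user should be warned; not a hydra setting.
WARN_THRESHOLD = 50


def needs_confirmation(num_jobs: int, threshold: int = WARN_THRESHOLD) -> bool:
    """Return True when the job count is large enough to warrant a warning."""
    return num_jobs > threshold


def run_gradually(
    jobs: Sequence[dict],
    run_batch: Callable[[Sequence[dict]], List[float]],
    batch_size: int = 4,
    min_accuracy: float = 0.5,
) -> List[float]:
    """Run jobs in small batches and stop early if a whole batch scores poorly.

    `run_batch` is a hypothetical callable that executes a batch of jobs
    and returns one accuracy value per job.
    """
    accuracies: List[float] = []
    for start in range(0, len(jobs), batch_size):
        batch = jobs[start:start + batch_size]
        batch_accuracies = run_batch(batch)
        accuracies.extend(batch_accuracies)
        # Terminate early when even the best job in this batch is below the bar.
        if batch_accuracies and max(batch_accuracies) < min_accuracy:
            break
    return accuracies
```

With this shape, the remaining jobs are never submitted once a batch fails the accuracy check, which is one possible reading of "terminate earlier"; other stopping rules (for example, per-job thresholds) would fit the same loop.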
Create a higher-level library for performing hydra tasks
Create Terraform scripts to launch AWS Batch infrastructure for training. Important components: 1. Separate queues and compute environments for CPU vs. GPU workloads 2. Use launch templates to attach additional...
After installing hydra via pip (`pip install hydra-ml==0.3.6`), running a training command in local mode (`hydra train -y run.yaml --cloud=local`) raises this error: ``` sh: /Users/faisalanees/.conda/envs/hydra/lib/python3.8/site-packages/hydra/cloud/../docker/local_execution.sh: No such file...
For AWS - make hydra submit jobs to separate queues depending on whether the workload is CPU or GPU
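The routing described here could be sketched as follows. This is an illustration under stated assumptions, not hydra's implementation: the queue names, the `JobSpec` class, and the `select_queue` function are all hypothetical, and the rule "any GPU request goes to the GPU queue" is an assumed policy.

```python
from dataclasses import dataclass

# Hypothetical AWS Batch queue names; real names depend on the deployment.
CPU_QUEUE = "hydra-cpu-queue"
GPU_QUEUE = "hydra-gpu-queue"


@dataclass
class JobSpec:
    """Hypothetical description of a job to be submitted."""
    name: str
    gpu_count: int = 0


def select_queue(job: JobSpec) -> str:
    """Pick the GPU queue when the job requests at least one GPU, else the CPU queue."""
    return GPU_QUEUE if job.gpu_count > 0 else CPU_QUEUE
```

The returned name would then be passed as the `jobQueue` argument when submitting through the AWS Batch `submit_job` call, so the CPU/GPU decision stays in one place.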