test-tube
SlurmCluster without hyperparameters
I'm attempting to train a PyTorch Lightning model on a SLURM cluster, and the PyTorch Lightning documentation recommends using the SlurmCluster class from this package to automate the submission of SLURM scripts. The examples all involve running a hyperparameter scan, but I would like to train just a single model. My attempt at doing so is as follows:
cluster = SlurmCluster()
[...] (set cluster.per_experiment_nb_cpus, cluster.job_time, etc.)
cluster.optimize_parallel_cluster_gpu(train, nb_trials=1, ...)
However, this fails with:
Traceback (most recent call last):
  File "train.py", line 67, in hydra_main
    train, nb_trials=1, job_name='pl-slurm', job_display_name='pl-slurm')
  File "/global/u2/s/schuya/.local/cori/pytorchv1.5.0-gpu/lib/python3.7/site-packages/test_tube/hpc.py", line 127, in optimize_parallel_cluster_gpu
    enable_auto_resubmit, on_gpu=True)
  File "/global/u2/s/schuya/.local/cori/pytorchv1.5.0-gpu/lib/python3.7/site-packages/test_tube/hpc.py", line 167, in __optimize_parallel_cluster_internal
    if self.is_from_slurm_object:
AttributeError: 'SlurmCluster' object has no attribute 'is_from_slurm_object'
Looking at the code, it seems that SlurmCluster.is_from_slurm_object was never set. This is because I did not pass in a hyperparam_optimizer, as I did not intend to perform a scan. What is the correct way to go about this?
I know this is pretty old, but I just stumbled across your question. The simplest solution is probably to pass a HyperOptArgumentParser that doesn't define any options to optimize. That said, it should be noted somewhere in the docs that test-tube needs an optimizer to be set.
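A minimal sketch of that workaround, assuming test-tube's HyperOptArgumentParser API (the `train` body, `log_path`, and the resource settings are placeholders; the SlurmCluster and optimize_parallel_cluster_gpu argument names come from test_tube's hpc module):

```python
from test_tube import HyperOptArgumentParser, SlurmCluster

def train(hparams, *args):
    # Build and fit the model here; hparams holds the parsed arguments.
    ...

if __name__ == '__main__':
    # A parser with no opt_list/opt_range calls, i.e. nothing to scan over.
    parser = HyperOptArgumentParser(strategy='grid_search')
    parser.add_argument('--log_path', default='./slurm_logs')  # placeholder
    hparams = parser.parse_args()

    # Passing the parsed namespace lets SlurmCluster set
    # is_from_slurm_object internally, avoiding the AttributeError above.
    cluster = SlurmCluster(
        hyperparam_optimizer=hparams,
        log_path=hparams.log_path,
    )
    cluster.per_experiment_nb_cpus = 8   # placeholder resources
    cluster.per_experiment_nb_gpus = 1
    cluster.job_time = '1:00:00'

    # With nothing to optimize, nb_trials=1 submits a single job.
    cluster.optimize_parallel_cluster_gpu(
        train, nb_trials=1, job_name='pl-slurm')
```

This only submits the job; the actual training still runs inside `train` on the allocated node, so it needs a SLURM environment to test end to end.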