
example/tensorflow_example.py raises an exception (directory creation) when nb_workers > 1

Open leocnj opened this issue 5 years ago • 1 comment

I ran tensorflow_example.py to test tuning across multiple GPUs. With nb_workers set to more than 1, I hit the exception below. As a result, I got only nb_trials - 1 tuning results instead of the expected nb_trials. Note that I am using Python 3.6.

Caught exception in worker thread [Errno 17] File exists: 'logs/multigpu/test_tube_data/dense_model/version_0'
Traceback (most recent call last):
  File "/home/lchen/.local/lib/python3.6/site-packages/test_tube/argparse_hopt.py", line 30, in optimize_parallel_gpu_private
    results = train_function(trial_params)
  File "test_tube_multigpu.py", line 22, in train
    autosave=False
  File "/home/lchen/.local/lib/python3.6/site-packages/test_tube/log.py", line 58, in __init__
    self.__init_cache_file_if_needed()
  File "/home/lchen/.local/lib/python3.6/site-packages/test_tube/log.py", line 121, in __init_cache_file_if_needed
    os.makedirs(exp_cache_file)
  File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/os.py", line 220, in makedirs
    mkdir(name, mode)

leocnj avatar Jul 18 '18 15:07 leocnj

Just tested on Python 2.7 and met the same issue if using more than one GPU.

leocnj avatar Jul 18 '18 15:07 leocnj
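
The traceback is consistent with several workers calling os.makedirs on the same version_0 path at nearly the same time, so that every worker after the first fails with Errno 17. That diagnosis is an assumption and has not been confirmed against the test_tube source. The sketch below shows one common way to tolerate such a race on both Python 2.7 and 3.6. The names ensure_dir and demo are hypothetical helpers for illustration and are not part of test_tube.

```python
import errno
import os
import tempfile
import threading


def ensure_dir(path):
    """Create path, ignoring the error raised when it already exists.

    Hypothetical helper, not test_tube API. Works on Python 2.7 and 3.x.
    """
    try:
        os.makedirs(path)
    except OSError as exc:
        # Another worker may have created the directory first; re-raise
        # anything that is not "already exists as a directory".
        if exc.errno != errno.EEXIST or not os.path.isdir(path):
            raise


def demo(nb_workers=4):
    """Have several threads create the same directory concurrently."""
    root = tempfile.mkdtemp()
    target = os.path.join(root, "test_tube_data", "dense_model", "version_0")
    errors = []

    def worker():
        try:
            ensure_dir(target)
        except Exception as exc:  # collected so the caller can inspect them
            errors.append(exc)

    threads = [threading.Thread(target=worker) for _ in range(nb_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return os.path.isdir(target), errors


if __name__ == "__main__":
    created, errors = demo()
    print(created, errors)
```

On Python 3.2 and later, os.makedirs(path, exist_ok=True) gives the same tolerance in a single call. Either form only prevents the crash: the workers would still share one version_0 directory, so whether each trial should get its own version number is a separate question.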