kaggle_ndsb2017

Training time for step_train_nodule_detector.py

Open wojiushishen opened this issue 7 years ago • 5 comments

Dear julian,

I ran your code on my GPU (Tesla K10) simulator, but it is very time-consuming: one epoch takes over 30 hours. How long does one epoch take for you? Thanks.

wojiushishen, Oct 17 '17 15:10

Hello, that is much longer than expected. What is a K10 simulator? Is it much slower than a "normal" card?

juliandewit, Oct 21 '17 07:10

Hi Julian

@juliandewit He ran it on Nvidia Tesla K10 GPUs, and it evidently takes hours to complete each epoch. @wojiushishen: why don't you try Nvidia GPU Cloud for high-performance computing? Sign up and run it there. Nvidia GPU Cloud has deep learning containers and massive performance for large networks.

https://www.nvidia.com/en-us/gpu-cloud/

sathyapatel, Dec 08 '17 13:12

Dear Julian

I have the same issue with step2_train_nodule_detector.py. I have a GTX 1080 and each epoch takes 80 hours! I commented out the line below in the code, but it made little difference.

config.gpu_options.per_process_gpu_memory_fraction = 0.5

The code does not fully utilize the GPU (utilization is about 40%), and preparing the inputs seems to take too much time.

Thanks in advance for any solution.

ahasanpour, Jan 05 '18 09:01

Change model.fit_generator(train_gen, len(train_files) / 1, ... to model.fit_generator(train_gen, len(train_files) // batch_size, ... The problem is caused by a Keras update: the second parameter of model.fit_generator changed from the number of training samples per epoch to the number of iterations (batches) per epoch. I hope this solves your problem.
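To illustrate why the old argument makes an epoch so long under Keras 2, here is a minimal sketch in plain Python. The helper names (steps_per_epoch, passes_per_epoch) and the sample counts are hypothetical and only for illustration; they are not taken from the repository.

```python
def steps_per_epoch(num_samples, batch_size):
    """Number of generator batches in one pass over the data (Keras 2 semantics)."""
    if num_samples <= 0 or batch_size <= 0:
        raise ValueError("num_samples and batch_size must be positive")
    # Floor division drops the final partial batch; keep at least one step.
    return max(1, num_samples // batch_size)


def passes_per_epoch(steps, num_samples, batch_size):
    """How many times the training set is traversed in one Keras 2 'epoch'."""
    return steps * batch_size / num_samples


# Hypothetical numbers, chosen only for illustration.
num_train_files = 100000
batch_size = 16

# Keras 1 style value (sample count) reused unchanged under Keras 2.
old_style = num_train_files
# Keras 2 style value (batch count).
new_style = steps_per_epoch(num_train_files, batch_size)

print(passes_per_epoch(old_style, num_train_files, batch_size))  # 16.0
print(passes_per_epoch(new_style, num_train_files, batch_size))  # 1.0

# Under Keras 2 the call would then look roughly like this (assumed, not run here):
# model.fit_generator(train_gen, steps_per_epoch=new_style, epochs=...)
```

With these assumed numbers, passing the sample count as the step count makes each "epoch" traverse the data batch_size times, which would be consistent with the very long epoch times reported above.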

wojiushishen, Jan 31 '18 20:01

I was on Keras 1.X. So if someone has a pull request then I can add it.

juliandewit, Feb 01 '18 11:02