Algorithm makes no progress
Hi!
I tried running the code you provided on the pretrained MobileNet model that you gave as an example. I believe the LUTs were computed correctly. I then tried applying the algorithm with 0.5 latency by running:
sh scripts/netadapt_mobilenet-0.5latency.sh
but after some time the algorithm stopped making progress at:
Launch a worker for block 13
['/usr/local/bin/python', 'worker.py', 'models/mobilenet/prune-by-latency/worker', 'models/mobilenet/prune-by-latency/master/iter_0_best_model.pth.tar', '13', 'LATENCY', '0.033316052734851845', '1', '500', '0', 'latency_lut/lut_mobilenet.pkl', 'data/', '3', '224', '224', 'mobilenet', '0.001']
Update job list: [{'iteration': 1, 'block': 1, 'gpu': 1}, {'iteration': 1, 'block': 2, 'gpu': 2}, {'iteration': 1, 'block': 3, 'gpu': 3}, {'iteration': 1, 'block': 4, 'gpu': 4}, {'iteration': 1, 'block': 5, 'gpu': 5}, {'iteration': 1, 'block': 6, 'gpu': 6}, {'iteration': 1, 'block': 13, 'gpu': 0}]
Update available gpu: []
Despite 3 days of computing, it made no further progress. Could you advise me on what might be wrong?
Regards, Piotrek
It looks like at least one of these workers crashed, so the master never heard back from it. You can check the logs of these workers to find out why they crashed.
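If it helps, here is a minimal sketch for scanning log files for signs of a crash. It is not part of the repository: the function name `find_crash_lines`, the marker strings, and the `*.txt` file pattern are all assumptions for illustration, and the worker folder is simply the one that appears in the launch command above (the logs may be stored elsewhere on your setup).

```python
from pathlib import Path

# Substrings that often show up when a Python process dies.
# This list is an assumption, not an exhaustive set.
ERROR_MARKERS = (
    "Traceback (most recent call last)",
    "RuntimeError",
    "out of memory",
)


def find_crash_lines(log_dir, pattern="*.txt"):
    """Return {file path: [(line number, text), ...]} for lines with an error marker."""
    root = Path(log_dir)
    hits = {}
    if not root.is_dir():
        return hits
    for path in sorted(root.rglob(pattern)):
        if not path.is_file():
            continue
        matches = []
        with open(path, errors="replace") as fh:
            for lineno, line in enumerate(fh, start=1):
                if any(marker in line for marker in ERROR_MARKERS):
                    matches.append((lineno, line.rstrip()))
        if matches:
            hits[str(path)] = matches
    return hits


if __name__ == "__main__":
    # Folder taken from the launch command above; the location and
    # extension of the log files are assumptions, adjust as needed.
    found = find_crash_lines("models/mobilenet/prune-by-latency/worker")
    for name, lines in found.items():
        print(name)
        for lineno, text in lines:
            print(f"  line {lineno}: {text}")
```

Any file it lists is a good place to start reading. A worker that ran out of GPU memory or failed to load the data would typically leave a traceback near the end of its log.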