He Ma

37 comments by He Ma

This problem was also mentioned in #32

The "ZMQError: Address in use" error happens when the previous run failed and the socket port it opened was not closed properly, causing a port conflict in the...
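One way to confirm that a stale process is still holding the port is to try binding to it before launching a new run. The snippet below is only a minimal sketch using Python's standard `socket` module; the helper name `port_in_use` and the default host are assumptions made for illustration and are not part of this project.

```python
import errno
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if binding to (host, port) fails with 'address in use'.

    Hypothetical helper for illustration only, not part of the project.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            # Binding fails with EADDRINUSE if another socket holds the port.
            probe.bind((host, port))
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return True
            raise  # Any other error is unexpected, so let it surface.
    # The probe socket is closed on leaving the 'with' block.
    return False
```

If the check returns True, the process left over from the failed run probably still owns the port and needs to be stopped before the next run.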

@aryanbhardwaj, your training cost looks okay so far. Are you training on ImageNet data? If you follow the preprocessing steps in this project, you will see 5004 batch files...

@aryanbhardwaj This preprocessing setup is for multi-GPU training. Specifically, a single GPU trains with batch_size=256, two GPUs train with batch_size=128 on each GPU, and 4 GPUs train with batch_size=64...
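The numbers above are consistent with keeping a total of 256 samples per step and splitting it evenly across GPUs. Assuming that is the rule, a minimal sketch looks like this (the function name and the `global_batch` parameter are hypothetical and do not come from the project):

```python
def per_gpu_batch_size(n_gpus, global_batch=256):
    """Split a fixed global batch evenly across GPUs.

    Assumes the global batch stays at 256, as the 256/128/64 figures suggest.
    """
    if n_gpus < 1:
        raise ValueError("n_gpus must be at least 1")
    if global_batch % n_gpus != 0:
        raise ValueError("global_batch must be divisible by n_gpus")
    return global_batch // n_gpus


for n in (1, 2, 4):
    print(n, "GPU(s):", per_gpu_batch_size(n), "samples per GPU")
```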

@aryanbhardwaj We benchmarked training speed on a GTX 1080 and a Tesla K80. On the GTX 1080, it takes 0.91h per epoch. On the Tesla K80, it takes 1.96h per epoch. In total, 60 epochs,...
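For a rough estimate of the full run, the per-epoch times can be multiplied by the epoch count. This is only back-of-the-envelope arithmetic and assumes every epoch takes the same time; the names below are illustrative.

```python
# Per-epoch times quoted above, in hours.
HOURS_PER_EPOCH = {"GTX 1080": 0.91, "Tesla K80": 1.96}
EPOCHS = 60


def total_hours(hours_per_epoch, epochs=EPOCHS):
    """Estimate total training time, assuming a constant time per epoch."""
    return hours_per_epoch * epochs


for gpu, hours in HOURS_PER_EPOCH.items():
    print(f"{gpu}: about {total_hours(hours):.1f} h for {EPOCHS} epochs")
```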

@aryanbhardwaj Yes, the data pipeline would be the first thing to check. Verify that your training data matches the training labels. The issue of the cost not decreasing could be due to a bad...

@aryanbhardwaj Interesting. I haven't tried that yet. But I imagine that would require the object to be in some ratio range with respect to the image size as the way...