Joshua Z. Zhang

257 comments by Joshua Z. Zhang

The original paper uses batch size = 32, so that's roughly 240 epochs equivalent.
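A hedged back-of-envelope check of that conversion (the iteration count and dataset size below are my assumptions, roughly matching the SSD VOC07+12 schedule, not numbers confirmed in this thread):

```python
# convert an iteration budget into equivalent epochs;
# all three constants here are assumptions for illustration
iterations = 120_000   # assumed total SGD iterations
batch_size = 32        # batch size from the paper
num_images = 16_551    # assumed VOC07+12 trainval size
epochs = iterations * batch_size / num_images
print(epochs)          # ~232, i.e. roughly 240 epochs
```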

There's no matplotlib involved during training, so I'm kind of confused. Can you post the full log please?

Can you simply run demo with your new model?

Normally we would rescale by batch size; however, in my experiments the behavior doesn't scale up when the batch size is changed. The division by len(ctx) is a hack around the fact...

In the MakeLoss layer, gradients are assigned inside each device, so the effective batch size is divided by len(ctx).
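A minimal sketch of what I mean, assuming the symbolic MXNet API (the loss symbol and hyper-parameters here are placeholders, not the repo's actual training code):

```python
import mxnet as mx

ctx = [mx.gpu(0), mx.gpu(1)]           # data-parallel devices
batch_size = 32                        # global batch size

# placeholder loss, just for illustration
pred = mx.sym.Variable("pred")
label = mx.sym.Variable("label")
loss = mx.sym.MakeLoss(mx.sym.square(pred - label), grad_scale=1.0)

# MakeLoss emits gradients on each device separately, so every device
# effectively sees batch_size / len(ctx) samples; normalize accordingly
opt = mx.optimizer.SGD(
    learning_rate=0.004,
    rescale_grad=1.0 / (batch_size / len(ctx)),
)
```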

The original arXiv paper reports 72.1, which is almost identical. However, I checked and the authors have updated some code, specifically modifying some filter sizes and training hyper-parameters...

Just to reuse a temporary buffer without malloc'ing new space.
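It's the usual scratch-buffer pattern; a hypothetical NumPy illustration (the names here are mine, not from the code):

```python
import numpy as np

# allocate the scratch space once, up front
scratch = np.empty((32, 1024), dtype=np.float32)

def scaled_copy(x):
    # write into the preallocated buffer instead of
    # allocating a new array on every call
    out = scratch[:x.shape[0]]
    np.multiply(x, 2.0, out=out)
    return out
```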

@xioryu Do you have time to write the results out to files and use the official Matlab code to verify them?
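If it helps, a rough sketch of dumping detections in the layout the official VOCdevkit eval expects (one file per class, each line `image_id score xmin ymin xmax ymax`); the function name and file naming are my assumptions, not code from this repo:

```python
import os

def dump_voc_results(dets_by_class, out_dir="results"):
    """dets_by_class: {class_name: [(image_id, score, x1, y1, x2, y2), ...]}"""
    os.makedirs(out_dir, exist_ok=True)
    for cls, dets in dets_by_class.items():
        path = os.path.join(out_dir, "comp4_det_test_%s.txt" % cls)
        with open(path, "w") as f:
            for img_id, score, x1, y1, x2, y2 in dets:
                f.write("%s %.6f %.1f %.1f %.1f %.1f\n"
                        % (img_id, score, x1, y1, x2, y2))
```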

I am going to retrain some of the models to be consistent with recent updates. Don't worry.