jixing0415
> Why did you use relu6 instead of relu after the scale layer in the blocks? I think the paper used relu. How many batches did you use during training? Can you release your train.prototxt and...
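For reference, the MobileNetV3 paper defines its hard-sigmoid and h-swish activations in terms of ReLU6, which is one place ReLU6 commonly appears next to a scale layer. Below is a minimal NumPy sketch of those formulas as given in the paper; it is background only and makes no claim about how this repository's prototxt is actually wired.

```python
import numpy as np

def relu6(x):
    # ReLU6: clamp activations to the range [0, 6]
    return np.minimum(np.maximum(x, 0.0), 6.0)

def hard_sigmoid(x):
    # MobileNetV3's piecewise-linear sigmoid approximation: ReLU6(x + 3) / 6
    return relu6(x + 3.0) / 6.0

def hard_swish(x):
    # h-swish(x) = x * ReLU6(x + 3) / 6
    return x * hard_sigmoid(x)
```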
> I notice that you don't add Squeeze-and-Excite in some blocks, which is different from the paper. Why not add it?

I tried to implement the network architecture as described in...
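As a rough illustration of what a Squeeze-and-Excite module computes (global average pooling, a two-layer fully connected bottleneck, and a channel-wise gate with hard-sigmoid, as described in the MobileNetV3 paper), here is a NumPy sketch. The weight names are placeholders, and it does not reflect the layers in this repo's prototxt.

```python
import numpy as np

def hard_sigmoid(x):
    # ReLU6(x + 3) / 6, the gating nonlinearity MobileNetV3 uses in its SE blocks
    return np.minimum(np.maximum(x + 3.0, 0.0), 6.0) / 6.0

def squeeze_excite(x, w_reduce, b_reduce, w_expand, b_expand):
    # x: feature map in NCHW layout; w_*/b_* are placeholder FC weights and biases
    n, c, h, w = x.shape
    squeezed = x.mean(axis=(2, 3))                              # global average pool -> (n, c)
    reduced = np.maximum(squeezed @ w_reduce + b_reduce, 0.0)   # FC reduce + ReLU
    gate = hard_sigmoid(reduced @ w_expand + b_expand)          # FC expand + hard-sigmoid -> (n, c)
    return x * gate[:, :, None, None]                           # rescale each channel
```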
> Did you reproduce the paper's accuracy?

Not yet. I only got 65% accuracy on v3_small and 70% accuracy on v3_large. I think there may be some training strategies that...