Devin Yang comments

Results: 22 comments of Devin Yang

how to train ghostnet?

Hi @iamhankai, thanks for sharing this good work. I successfully trained GhostNet 1.3x to `75.78/92.77 top1/top5`, which is almost what your paper reports. Details are [here](https://github.com/PistonY/ModelZoo.pytorch). But I use the same training setting...

Could you please tell us some details about GhostNet-BCD settings?

Also, you do not specify `dw_kernel_size` for GhostNet in Table 7 of the paper. Are you using `dw_kernel_size=3` by default?

Could you please tell us some details about GhostNet-BCD settings?

Hi @iamhankai, thanks for the reply and for open-sourcing this. But I am still confused about how to get TinyGhostNet-X (B/C/D/E) from TinyGhostNet-A. Are you using the same parameters as EfficientNet?

Does LambdaLayer need BatchNorm and activation after it?

Hi @lucidrains, thanks for the reply. I tested them all, and I think you're right. When `bn+relu` is applied, the val accuracy doesn't grow. This is my final [implementation](https://gist.github.com/PistonY/ad33ab9e3d5f9a6a38345eb184e68cb4). Now I'm training `LambdaResnet50`, it's...

Does LambdaLayer need BatchNorm and activation after it?

@lucidrains Unfortunately, the best top-1 I got was only 76.1 on the val set (79.2 on the train set). I'd better wait for the authors to release their code.

How large memory is required for the experiment

I tried out `LambdaResnet50` with a batch size of 64; it costs about 9-10 GB of GPU memory in FP32 precision, which is much more than `Resnet50`.

Merge AMD-HIP port

@sriharikarnam Has this made any progress? Is this work still going on?

Has the project been deprecated?

Same question

influence of the batch size and the number of GPUs

Recently I have been using [distributed training](https://github.com/PistonY/ModelZoo.pytorch/blob/master/scripts/distribute_train_script.py) more often. You need to make sure each single GPU has the same batch size as mine; you should then get the same result, but it may take more time...

influence of the batch size and the number of GPUs

No need to change it, I think. The paper most likely means the batch size on one device; normally the batch size given in a paper is just what a single device holds. Take care of the difference...
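
To make the per-device versus total distinction concrete, here is a minimal sketch of the arithmetic, assuming plain data parallelism where every GPU processes its own batch each step. The function names and the example numbers are hypothetical and are not taken from any paper or training script mentioned here.

```python
def global_batch_size(per_device_batch: int, num_devices: int) -> int:
    """Total samples consumed per optimizer step under data parallelism."""
    if per_device_batch <= 0 or num_devices <= 0:
        raise ValueError("batch size and device count must be positive")
    return per_device_batch * num_devices


def per_device_batch_size(global_batch: int, num_devices: int) -> int:
    """Batch each device must hold to reach a given total batch size."""
    if global_batch <= 0 or num_devices <= 0:
        raise ValueError("batch size and device count must be positive")
    if global_batch % num_devices != 0:
        raise ValueError("global batch must divide evenly across devices")
    return global_batch // num_devices


# Hypothetical numbers: 64 samples per GPU on 8 GPUs.
total = global_batch_size(64, 8)
# Reproducing that total on 4 GPUs needs a larger per-GPU batch.
needed = per_device_batch_size(total, 4)
```

If a reported batch size is per device, running on fewer GPUs with the same per-GPU batch changes the total batch per step, which is the difference to watch for when comparing results.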