
Is width scaling possible with current code?

Open johnypark opened this issue 2 years ago • 2 comments

Hi Sebastian,

First of all, thank you very much for sharing this amazing code. It is easy to use and works great.

I am interested in using resnet-rs in my research, but with width scaling in the paper, as my dataset is quite large and I currently do not have resources to run it for 350 epochs.

I read through the code but have not yet found any arguments for width scaling...

Would you have some advice for me on how width scaling could be done with your code?

Thank you, John

johnypark avatar Apr 27 '22 09:04 johnypark

Hi John, thank you for the kind words.

Width scaling isn't possible out of the box with this repository. I also can't seem to find it in the tensorflow tpu implementation that this repository is based on.

It is, however, present in an alternative tensorflow model garden implementation:

depth_multiplier: (...) This argument is also referred to as width_multiplier in (https://arxiv.org/abs/2103.07579).

From what I see, self._depth_multiplier is used to scale the filters of the STEM block and the block groups.

Currently, this repository behaves as if the depth multiplier were always equal to 1.0. Changing this multiplier will make it impossible to use the pretrained weights, since the number of filters will change. Are you ok with that?
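To illustrate, a minimal sketch of what such a width multiplier typically does to the per-stage filter counts. This is not code from this repository or from the model garden; the function name and the divisor-rounding scheme (common in EfficientNet-style implementations) are assumptions for illustration:

```python
def scale_filters(filters: int, width_multiplier: float, divisor: int = 8) -> int:
    """Illustrative sketch: scale a filter count by a width multiplier,
    rounding to the nearest multiple of `divisor` (a common convention,
    not necessarily what the model garden implementation does)."""
    scaled = filters * width_multiplier
    rounded = max(divisor, int(scaled + divisor / 2) // divisor * divisor)
    return rounded

# ResNet's standard block-group widths, halved:
base_filters = [64, 128, 256, 512]
print([scale_filters(f, 0.5) for f in base_filters])  # [32, 64, 128, 256]
```

A multiplier below 1.0 shrinks every convolution's channel count, which is exactly why pretrained weights (saved with the original filter counts) stop being loadable.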

sebastian-sz avatar Apr 27 '22 13:04 sebastian-sz

Hi Sebastian,

Thank you for the link to the model garden. I guess the original authors didn't save the weights from their width-scaled models... kind of a bummer. Well, that's fair: since ResNet was born to take advantage of deep layers, it makes more sense to train it for many epochs. Maybe I can pull off superconvergence with a width-scaled model trained from scratch. Anyhow, thank you.

johnypark avatar May 02 '22 13:05 johnypark