resnet-rs-keras
Is width scaling possible with current code?
Hi Sebastian,
First of all, thank you very much for sharing this amazing code. It is easy to use and works great.
I am interested in using resnet-rs in my research, but with width scaling in the paper, as my dataset is quite large and I currently do not have resources to run it for 350 epochs.
I read through the code but have not yet found any arguments for width scaling...
Would you have some advice for me on how width scaling could be done with your code?
Thank you, John
Hi John, thank you for the kind words.
Width scaling isn't possible out of the box with this repository. I also can't seem to find it in the TensorFlow TPU implementation that this repository is based on.
It is, however, present in an alternative TensorFlow Model Garden implementation:

> depth_multiplier: (...) This argument is also referred to as `width_multiplier` in (https://arxiv.org/abs/2103.07579).
From what I can see, `self._depth_multiplier` is used to scale the filters of the STEM block and each block group.
Currently, this repository behaves as if the depth multiplier were always equal to 1.0. Changing this multiplier will make it impossible to use the pretrained weights, since the number of filters will change - are you okay with that?
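To illustrate, here is a minimal sketch of how such a width multiplier typically scales filter counts. The names (`BASE_FILTERS`, `scale_filters`) and the rounding-to-a-multiple-of-8 convention are assumptions for illustration, not code from this repository or the Model Garden:

```python
# Hypothetical sketch: applying a width multiplier to per-block-group
# filter counts, in the spirit of depth_multiplier in the TF Model Garden
# ResNet-RS. Names and rounding rules here are illustrative assumptions.

BASE_FILTERS = [64, 128, 256, 512]  # typical ResNet block-group widths

def scale_filters(filters: int, width_multiplier: float, divisor: int = 8) -> int:
    """Scale a filter count and round to the nearest multiple of `divisor`.

    Rounding keeps channel counts hardware-friendly; many implementations
    use a similar scheme (e.g. EfficientNet's round_filters).
    """
    scaled = filters * width_multiplier
    rounded = int(scaled + divisor / 2) // divisor * divisor
    return max(divisor, rounded)

# A 0.5x-wide model would use half the filters in every block group:
widths = [scale_filters(f, 0.5) for f in BASE_FILTERS]
print(widths)  # [32, 64, 128, 256]
```

With a multiplier other than 1.0, every convolution's channel dimension changes, which is exactly why the pretrained checkpoints would no longer load.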
Hi Sebastian,
Thank you for the link to the Model Garden. I guess the original authors didn't save the weights from their width-scaled models... kind of a bummer. Well, it's fair: given that ResNet was born to take advantage of deep layers, it makes more sense to train it for many epochs. Maybe I can pull off superconvergence with a width-scaled model trained from scratch. Anyhow, thank you.