keras-contrib

Error in WideResidualNetwork implementation

Open jpizarrom opened this issue 5 years ago • 5 comments

I think the Wide ResNet implementation differs from the details in https://arxiv.org/abs/1605.07146

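I tried

from keras_contrib.applications.wide_resnet import WideResidualNetwork

model = WideResidualNetwork(depth=16, width=1, dropout_rate=0.0, weights=None)

and got the model shown under "current" below, but I think it should be the one shown under "correct".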
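For context, here is a minimal sketch of what depth=16, width=1 means under the paper's convention (depth = 6n + 4 for basic blocks, with group widths 16k, 32k and 64k). The helper name `wrn_layout` is hypothetical and is not part of keras_contrib or the reference repo.

```python
def wrn_layout(depth, width):
    """Return (blocks per group, channels per group) for a WRN-depth-width
    built from basic blocks, following the paper's depth = 6n + 4 rule."""
    if (depth - 4) % 6 != 0:
        raise ValueError("depth must satisfy depth = 6n + 4")
    n = (depth - 4) // 6                              # residual blocks in each of the 3 groups
    widths = [int(c * width) for c in (16, 32, 64)]   # base widths scaled by the widening factor k
    return n, widths


print(wrn_layout(16, 1))  # (2, [16, 32, 64])
```

So the model in question should have three groups of two basic blocks each, which makes the diagrams small enough to compare block by block.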

The add operation should sit between the convolutions of consecutive blocks, as you can see in https://github.com/szagoruyko/wide-residual-networks/blob/master/models/wide-resnet.lua#L70 and https://github.com/szagoruyko/wide-residual-networks/blob/master/pytorch/resnet.py#L41

The paper (https://arxiv.org/abs/1605.07146) describes this block as: "basic - with two consecutive 3 × 3 convolutions with batch normalization and ReLU preceding convolution".
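The ordering quoted above can be sketched symbolically, without any deep learning framework. This is only an illustration of the op order (BN-ReLU before each convolution, add after the second convolution); the function names are hypothetical, and taking the projection shortcut from the pre-activated input is an assumption based on my reading of the linked reference code.

```python
def bn_relu(x):
    # Batch normalization followed by ReLU, applied before a convolution
    return f"relu(bn({x}))"


def conv3x3(x):
    return f"conv3x3({x})"


def conv1x1(x):
    return f"conv1x1({x})"


def basic_block(x, in_ch, out_ch):
    """Symbolic pre-activation basic block: returns the expression it computes."""
    o1 = bn_relu(x)            # BN-ReLU precedes the first convolution
    y = conv3x3(o1)
    o2 = bn_relu(y)            # BN-ReLU precedes the second convolution
    z = conv3x3(o2)
    if in_ch != out_ch:
        # Assumption: the projection shortcut reads the pre-activated input
        shortcut = conv1x1(o1)
    else:
        shortcut = x           # identity shortcut
    return f"add({z}, {shortcut})"


print(basic_block("x", 16, 16))
print(basic_block("x", 16, 32))
```

In both cases the add receives the raw output of the second convolution, so no BN-ReLU sits between that convolution and the add.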

current (keras_contrib.applications.wide_resnet)

[image: model diagram]

correct

[image: model diagram]

jpizarrom, Apr 05 '19 15:04

I've plotted the model from the paper's repo (https://arxiv.org/abs/1605.07146):

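import torch
from torchviz import make_dot

# resnet.py is taken from the pytorch/ directory of the paper's repo
from resnet import resnet as pytorch_resnet

# depth=16, width=1, 10 classes; returns the forward function and its parameters
f, params = pytorch_resnet(16, 1, 10)
x = torch.rand(1, 3, 32, 32).cuda()  # dummy CIFAR-sized input batch
# mode=False runs the forward pass in evaluation mode
dot = make_dot(f(x, params, mode=False), params=dict(params))
dot.format = 'png'
dot.render('pytorch_resnet')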

I got:

[image: pytorch_resnet]

jpizarrom, Apr 05 '19 19:04

The first block of your correction has two consecutive batchnorm-relu blocks.

This codebase has also been here for nearly two years without significant updates. If you perceive an issue during training, can you plot the training curves of both the old model and your proposed correction to demonstrate the merit of an update?

titu1994, Apr 05 '19 19:04

I'm currently working on the model; I will fix my issue with the "two consecutive batchnorm-relu blocks".

I will try to plot both curves in the next few days.

Thanks

jpizarrom, Apr 05 '19 20:04

I've fixed it and updated the model diagram in the description.

Thanks

jpizarrom, Apr 05 '19 20:04

Sure. If, after running a comparison on CIFAR-10/100, we find that the older model performs significantly worse, then you could submit a PR with corrections.

Thanks for taking a look at this.

titu1994, Apr 05 '19 20:04