
Porting Inception-based architectures (Inception v4)

Open snakers4 opened this issue 6 years ago • 6 comments

Hi, once again thanks for the awesome work. It really helps shorten the PyTorch-to-production path (pytorch => keras => tf).

I was running my PyTorch model, which is based on the Inception v4 architecture, through the converter.

I encountered this error during the run:

ValueError: Unsuported padding size for convolution

I guess this is because Inception v4 uses non-square filters, and the padding for them is handled differently in Keras and PyTorch: Keras uses 'same' convolutions, while PyTorch uses explicit per-dimension padding values. Digging deeper, I confirmed that the conv layers with asymmetric padding in PyTorch are indeed the culprits.

Just compare these links:

  • https://github.com/Cadene/pretrained-models.pytorch/blob/master/pretrainedmodels/inceptionv4.py
  • https://github.com/kentsommer/keras-inceptionV4
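
To make the difference concrete, here is a minimal sketch (the channel counts and kernel sizes are purely illustrative, not taken from the model):

import keras
import torch.nn as nn

# PyTorch: a 1x7 convolution with explicit per-dimension padding (0, 3)
pt_conv = nn.Conv2d(64, 64, kernel_size=(1, 7), stride=1, padding=(0, 3))

# Keras usually expresses the same receptive field with padding='same',
# letting the framework compute the zero padding itself
k_conv = keras.layers.Conv2D(64, (1, 7), strides=1, padding='same')

# Alternatively, pad explicitly and keep the convolution 'valid':
# ZeroPadding2D accepts per-side values ((top, bottom), (left, right))
k_pad = keras.layers.ZeroPadding2D(padding=((0, 0), (3, 3)))
k_conv_valid = keras.layers.Conv2D(64, (1, 7), strides=1, padding='valid')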

I got around this with what is essentially a hack, but maybe you can give some advice / commit on how to do it properly?

In a nutshell, I did this. I can open a PR with this change if you want.

        if node.padding[0] != node.padding[1]:
            # this line was originally not commented out:
            # raise ValueError('Unsuported padding size for convolution')

            # quick fix for Inception architectures; for reference see
            # https://github.com/fchollet/keras/blob/master/keras/applications/inception_v3.py
            border_mode = 'same'
        else:
            # this branch originally ran unconditionally
            padding = node.padding[0]
            if padding > 0:
                padding_name = output_name + '_pad'
                padding_layer = keras.layers.ZeroPadding2D(
                    padding=node.padding,
                    name=padding_name
                )
                layers[padding_name] = padding_layer(layers[input_name])
                input_name = padding_name

            # this assignment was also applied unconditionally before
            border_mode = 'valid'

Anyway, which proper solution would you suggest for this edge case?

snakers4 · Dec 06 '17 05:12

After fixing the concat bug, I managed to convert my model. The whole model gives a reasonable discrepancy (softmax being the last layer), but I still suspect it is not good enough given the really small discrepancies you report for your models w/o softmax.

So, to check what is wrong, I tested just the Inception v4 encoder conversion with the above hack. It produced this:

Max error: 4.279047966003418

I used this inception implementation:

  • https://github.com/Cadene/pretrained-models.pytorch/blob/master/pretrainedmodels/inceptionv4.py

The encoder that I am using:

class InceptionEncoder(nn.Module):
    def __init__(self,
                 inception):
        
        super(InceptionEncoder, self).__init__()

        self.inception_extractor = InceptionExtractor(inception)    

    def forward(self, x):
        x,x2,x3,x4 = self.inception_extractor(x)
        out = torch.cat( (x,x2,x3,x4), dim=1)
        return out

class InceptionExtractor(nn.Module):
    def __init__(self,
                 inception):
        super(InceptionExtractor, self).__init__()
        self.stem = nn.Sequential(
            inception.features[0],
            inception.features[1],
            inception.features[2],
            inception.features[3],
            inception.features[4],
            inception.features[5],
        )   
        self.inception1 = inception.features[6]
        self.inception2 = inception.features[7]
        self.inception3 = inception.features[8]
        self.inception4 = inception.features[9]    
    def forward(self, x):
        x = self.stem(x)
        x = self.inception1(x)
        x2 = self.inception2(x)
        x3 = self.inception3(x2)
        x4 = self.inception4(x3)        
        return x,x2,x3,x4  

The params I am using to invoke the model, assuming the above Inception v4 implementation:

inception4 = inceptionv4(num_classes=1000, pretrained='imagenet')
model = InceptionEncoder(inception4)

So, I assume that my hack for Inception architectures is somewhat buggy / needs to be fixed somehow. Any help / advice appreciated.

snakers4 · Dec 06 '17 08:12

Hello, @snakers4.

I've tested Inception v4 today. It seems more or less accurate, but not every PyTorch parameter has a Keras/TF equivalent. In particular, the AvgPool2d parameter count_include_pad may produce a wrong result when padding is used.

I also tested your InceptionEncoder (average error ~2.91038e-10); you can check it out right there.
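
To illustrate the count_include_pad difference (a minimal sketch, not part of the converter): with count_include_pad=True, PyTorch counts the zero padding in the averaging divisor, while Keras/TF 'same' average pooling only averages over the real pixels.

import torch
import torch.nn as nn

x = torch.ones(1, 1, 4, 4)

# PyTorch default: padded zeros are counted in the divisor,
# so border outputs are pulled below 1.0
pool_incl = nn.AvgPool2d(kernel_size=3, stride=1, padding=1, count_include_pad=True)

# With count_include_pad=False only real pixels are averaged,
# which matches Keras/TF 'same' average pooling
pool_excl = nn.AvgPool2d(kernel_size=3, stride=1, padding=1, count_include_pad=False)

print(pool_incl(x)[0, 0, 0, 0].item())  # corner value: 4/9 ~= 0.444
print(pool_excl(x)[0, 0, 0, 0].item())  # corner value: 1.0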

gmalivenko · Dec 06 '17 20:12

Hi @nerox8664,

Many thanks for your effort and support! I pulled the latest version of your repo.

I did the following tests:

  • plain inceptionv4 w/o ImageNet pre-training (your tests/inception.py) - gave less than 0.0, which is ok because our model contains no softmax
  • my Inception encoder + no ImageNet pre-training (your gist) - Max error: 4.94765117764473e-10
  • my Inception encoder + ImageNet pre-training + my initialization - Max error: 0.00026, which is good enough given no softmax
  • my Inception encoder + ImageNet pre-training + my initialization + count_include_pad = False - Max error: 0.001 (one way to toggle this is sketched below)
  • my full trained model (ImageNet + my training + my initialization + other layers) - Max error: 0.19
  • my full trained model + count_include_pad = False - Max error: 0.18
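
For reference, one way to toggle this flag on an existing model (a hypothetical helper, just to show the idea; not code from the converter):

import torch.nn as nn

def disable_count_include_pad(model):
    # flip the flag on every AvgPool2d so its behaviour matches
    # Keras/TF 'same' average pooling (padding excluded from the divisor)
    for module in model.modules():
        if isinstance(module, nn.AvgPool2d):
            module.count_include_pad = False
    return model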

So now I can reasonably assume that the Inception base works just fine and there is no issue with the encoder part; only my full model produces worse results, and only after loading the trained weights.

Unfortunately, I cannot share the weights, but maybe you can give a hint based on the model architecture?

class IlgSimplifiedNormalized(nn.Module):
    def __init__(self,
                 inception,
                 num_classes=2,
                 num_skip1=256,
                 num_skip2=256,
                 num_skip3=256,
                 num_skip4=256,
                 num_filters=256,
                 num_fmap=7):

        super(IlgSimplifiedNormalized, self).__init__()

        self.inception_extractor = InceptionExtractor(inception)    
        self.inception_connectors = InceptionConnectors(num_skip1,
                                                         num_skip2,
                                                         num_skip3,
                                                         num_skip4,
                                                         num_filters,
                                                         num_fmap)
        self.classifier = nn.Sequential(
            nn.Linear(num_skip1+num_skip2+num_skip3+num_skip4, 1024),
            nn.ReLU(True),
            nn.Dropout(p=0.5),
            nn.Linear(1024, num_classes),
        )
    def forward(self, x):
        x,x2,x3,x4 = self.inception_extractor(x)
        out = self.inception_connectors(x,x2,x3,x4)
        out = self.classifier(out)          
        return out 

class InceptionConnectors(nn.Module):
    def __init__(self,
                num_skip1=256,
                num_skip2=256,
                num_skip3=256,
                num_skip4=256,
                num_filters=256,
                num_fmap=7):
        super(InceptionConnectors, self).__init__()

        self.inception_connector1 = InceptionConnector(num_skip=num_skip1,
                                                        num_filters=num_filters,
                                                        num_fmap=num_fmap)
        self.inception_connector2 = InceptionConnector(num_skip=num_skip2,
                                                        num_filters=num_filters,
                                                        num_fmap=num_fmap)        
        self.inception_connector3 = InceptionConnector(num_skip=num_skip3,
                                                        num_filters=num_filters,
                                                        num_fmap=num_fmap)
        self.inception_connector4 = InceptionConnector(num_skip=num_skip4,
                                                        num_filters=num_filters,
                                                        num_fmap=num_fmap)

    def forward(self,x,x2,x3,x4):
        x1_out = self.inception_connector1(x)
        x2_out = self.inception_connector2(x2)
        x3_out = self.inception_connector3(x3)
        x4_out = self.inception_connector4(x4)
        out = torch.cat((x1_out,x2_out,x3_out,x4_out), dim=1)
        return out

class InceptionConnector(nn.Module):
    def __init__(self,
                num_skip=256,
                num_filters=256,
                num_fmap=3):
        super(InceptionConnector, self).__init__()

        self.ae_block = nn.Sequential(
            nn.AvgPool2d(kernel_size=5, stride=3, padding=0),
            nn.Conv2d(384, num_filters, kernel_size=1, stride=1),
            nn.BatchNorm2d(num_filters),
            nn.ReLU(inplace=True)
        )
        self.fc = nn.Sequential(
            nn.Linear(num_filters*num_fmap*num_fmap, num_skip),
            nn.ReLU(inplace=True)
        )

    def forward(self, x):
        x = self.ae_block(x)
        x = x.view(x.size(0), -1)
        x = self.fc(x)
        return x 

class InceptionExtractor(nn.Module):
    def __init__(self,
                 inception):
        super(InceptionExtractor, self).__init__()
        self.stem = nn.Sequential(
            inception.features[0],
            inception.features[1],
            inception.features[2],
            inception.features[3],
            inception.features[4],
            inception.features[5],
        )

        self.inception1 = inception.features[6]
        self.inception2 = inception.features[7]
        self.inception3 = inception.features[8]
        self.inception4 = inception.features[9]

    def forward(self, x):
        x = self.stem(x)
        x = self.inception1(x)
        x2 = self.inception2(x)
        x3 = self.inception3(x2)
        x4 = self.inception4(x3)        
        return x,x2,x3,x4    

PS Do you have some kind of account for tips? I would be more than happy to donate a bit to provide further incentive to maintain and develop this repository.

aveysov · Dec 07 '17 07:12

Also, this may just be due to the fact that my model is a binary classifier, and for random data it is very confident in one class...

aveysov · Dec 07 '17 07:12

Hello, @aveysov.

How are you using the IlgSimplifiedNormalized class? As I can see, there is an nn.Dropout:

        self.classifier = nn.Sequential(
            nn.Linear(num_skip1+num_skip2+num_skip3+num_skip4, 1024),
            nn.ReLU(True),
            nn.Dropout(p=0.5),
            nn.Linear(1024, num_classes),
        )

Maybe the high error is caused by nn.Dropout(p=0.5)? Do you call your_model.eval() before conversion?

gmalivenko · Dec 07 '17 08:12

How are you using the IlgSimplifiedNormalized class?

Please see the snippet below. Inception4Cadene is just the original Inception v4 code referenced above; IlgSimplifiedEnsemble is just the model above.

import keras  # work around segfault
import sys
import numpy as np
import torch.nn.functional as F

import torch
import torch.nn as nn
from torch.autograd import Variable

sys.path.append('../pytorch2keras')
from converter import pytorch_to_keras

# import my models the same way as in training loop 
from Inception4Cadene import inceptionv4
from IlgSimplifiedEnsemble import IlgSimplifiedNormalized

if __name__ == '__main__':
    max_error = 0
    for i in range(2):
        inception4 = inceptionv4(num_classes=1000, pretrained='imagenet')
        
        model = IlgSimplifiedNormalized(inception4,
                                        num_classes=2,
                                        num_skip1=256,
                                        num_skip2=256,
                                        num_skip3=256,
                                        num_skip4=512,
                                        num_filters=256,
                                        num_fmap=11)        
        
        saved_weights = '../some_folder/some_weights.pth.tar'
        
        print("=> loading checkpoint '{}'".format(saved_weights))
        checkpoint = torch.load(saved_weights)
        model.load_state_dict(checkpoint['state_dict'])
        print("=> loaded checkpoint (epoch {})"
              .format(checkpoint['epoch']))        
        
        model.eval()
        
        input_np = np.random.uniform(0, 1, (4, 3, 299, 299))
        input_var = Variable(torch.FloatTensor(input_np))
        output = model(input_var)

        k_model = pytorch_to_keras((3, 299, 299,), output)

        pytorch_output = output.data.numpy()
        keras_output = k_model.predict(input_np)

        error = np.max(pytorch_output - keras_output)
        print(error)
        if max_error < error:
            max_error = error

    print('Max error: {0}'.format(max_error))

Do you call your_model.eval() before conversion?

Yes, of course I do; see the snippet above.

Maybe the high error is caused by nn.Dropout(p=0.5)?

Well, in this case (binary classification + a random image) that really may be the case, but AFAIK dropout is disabled when you call .eval().
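
A quick sanity check of that last point (just an illustrative sketch, nothing from the converter):

import torch
import torch.nn as nn

drop = nn.Dropout(p=0.5)
x = torch.ones(8)

drop.train()
print(drop(x))  # roughly half of the entries are zeroed, survivors are scaled by 1/(1-p) = 2.0

drop.eval()
print(drop(x))  # identity: all ones, so inference outputs are deterministic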

snakers4 · Dec 07 '17 08:12