Problems with batch normalization

Open peterneher opened this issue 7 years ago • 11 comments

Hi everyone.

I have a problem with brew.spatial_bn at test time. If I train and test in the same session, everything is fine, but if I simply try to load my pretrained model, I get the following error:

RuntimeError: [enforce fail at operator.cc:26] blob != nullptr. op SpatialBN: Encountered a non-existing input blob: myBlobName

I load and save my models using these methods: https://github.com/peterneher/peters-stuff/blob/master/Caffe2Scripts/classification_no_db_example.py#L104-L130

The batch norm is created in the following way:

myBlobName = brew.spatial_bn(
    m,
    brew.relu(
        m,
        brew.conv(m, 'data', 'conv_1_1', dim_in=1, dim_out=base_n_filters, kernel=kernel_size, pad=pad),
        'nonl_1_1'
    ),
    'myBlobName', dim_in=base_n_filters, epsilon=1e-3, momentum=0.1, is_test=is_test
)

Any ideas?

Cheers, Peter

peterneher avatar May 31 '17 15:05 peterneher

@peterneher I have the same problem. Have you solved it?

RiweiChen avatar Jul 05 '17 15:07 RiweiChen

Unfortunately not.

peterneher avatar Jul 08 '17 12:07 peterneher

@peterneher I used your code as a reference and am facing the same problem. Even though I changed the loading method to the one in https://caffe2.ai/docs/tutorial-loading-pre-trained-models.html, it still doesn't work. Have you found a solution yet?

steven5401 avatar Aug 04 '17 03:08 steven5401

@peterneher I used brew.group_conv() to implement MobileNet and encountered the same problem. When I tried to load the model exported to Android, the predictor didn't work.

RuntimeError: [enforce fail at operator.cc:30] blob != nullptr. op SpatialBN: Encountered a non-existing input blob: conv1_spatbn_rm

lhCheung1991 avatar Aug 11 '17 09:08 lhCheung1991

@lhCheung1991 please refer to this post: https://github.com/caffe2/caffe2/issues/884. I found the answer in the last comment.

steven5401 avatar Aug 11 '17 09:08 steven5401

In your save function, you should also save the extra parameters whose names end in _rm and _riv when writing init_net and predict_net. There is no need to change the loading function.

steven5401 avatar Aug 11 '17 09:08 steven5401

@steven5401 I have tried it, it works! Thanks a lot!

It seems that Caffe2 doesn't treat the SpatialBN input blobs *_rm and *_riv as part of model.params, which causes the problem.
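To make that failure mode concrete, here is a minimal sketch of the check that fails at load time. It assumes a net can be approximated as a list of (inputs, outputs) pairs per operator. The helper name `find_missing_inputs` and the toy net are hypothetical stand-ins, not part of the Caffe2 API; the blob names follow the ones in the error message above.

```python
def find_missing_inputs(ops, saved_params, external_inputs=("data",)):
    """Return op inputs that no earlier op produces and that were not saved."""
    available = set(saved_params) | set(external_inputs)
    missing = []
    for op_inputs, op_outputs in ops:
        for name in op_inputs:
            if name not in available and name not in missing:
                missing.append(name)
        # Outputs of this op become available to later ops.
        available.update(op_outputs)
    return missing


# Hypothetical stand-in for a predict net: (inputs, outputs) per operator.
ops = [
    (["data", "conv1_w", "conv1_b"], ["conv1"]),
    (["conv1", "conv1_spatbn_s", "conv1_spatbn_b",
      "conv1_spatbn_rm", "conv1_spatbn_riv"], ["conv1_spatbn"]),
]

# Only the learned weights were saved; the running statistics were not.
saved = ["conv1_w", "conv1_b", "conv1_spatbn_s", "conv1_spatbn_b"]

missing = find_missing_inputs(ops, saved)
print(missing)  # ['conv1_spatbn_rm', 'conv1_spatbn_riv']
```

Once the two running-statistics blobs are added to the saved list, the same check reports nothing missing.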

Is it a bug?!

lhCheung1991 avatar Aug 11 '17 09:08 lhCheung1991

@steven5401 Hi, after I train the resnet50, I don't know how to save it as pb and use it, could you pls share your code as an example?

zhouyongxiu avatar Jan 09 '18 14:01 zhouyongxiu

Following issue #884, I solved it with the following code:

from caffe2.python import workspace
from caffe2.python.predictor import mobile_exporter

def save_net(INIT_NET, PREDICT_NET, model):
    # SpatialBN's running mean (*_rm) and running inverse variance (*_riv)
    # are not in model.params, so add them before exporting.
    for blob in workspace.Blobs():
        name = str(blob)
        if name.endswith("_rm") or name.endswith("_riv"):
            model.params.append(name)

    # init_net carries the values of every blob listed in model.params.
    init_net, predict_net = mobile_exporter.Export(
        workspace, model.net, model.params
    )

    with open(PREDICT_NET, 'wb') as f:
        f.write(model.net._net.SerializeToString())
    with open(INIT_NET, 'wb') as f:
        f.write(init_net.SerializeToString())

If the model has no BN layers, just use

def save_net(INIT_NET, PREDICT_NET, model) :

    init_net, predict_net = mobile_exporter.Export(
        workspace, model.net, model.params
    )
    
    with open(PREDICT_NET, 'wb') as f:
        f.write(model.net._net.SerializeToString())
    with open(INIT_NET, 'wb') as f:
        f.write(init_net.SerializeToString())

BIGBALLON avatar Jun 27 '18 14:06 BIGBALLON

@steven5401 Hi, after I train the resnet50, I don't know how to save it as pb and use it, could you pls share your code as an example?

You may have a look at https://zhuanlan.zhihu.com/p/45067948 or https://github.com/CivilNet/Gemfield/blob/master/src/python/caffe2/resnet50_gemfield.py or

gemfield avatar Sep 29 '18 13:09 gemfield

Hi @BIGBALLON ,

Is it possible to use the BN-layer version of save_net as the general case (for networks both with and without BN layers)? Are there any issues when it is used on a network that has no BN layer?

CarlosYeverino avatar Oct 26 '18 23:10 CarlosYeverino
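
On the question above: judging only from the code posted in this thread, the BN-aware save_net differs from the plain one just by the suffix-matching loop, and that loop appends nothing when no blob name ends in _rm or _riv. A minimal sketch of the loop on plain name lists follows. The helper name `collect_bn_stat_blobs` and the blob names are hypothetical, and this has not been run against Caffe2 itself.

```python
def collect_bn_stat_blobs(blob_names, suffixes=("_rm", "_riv")):
    """Return the names that look like SpatialBN running statistics."""
    return [str(name) for name in blob_names if str(name).endswith(suffixes)]


# Hypothetical workspace contents for a net with one BN layer.
with_bn = ["data", "conv1_w", "conv1_b",
           "conv1_spatbn_s", "conv1_spatbn_b",
           "conv1_spatbn_rm", "conv1_spatbn_riv"]

# Hypothetical workspace contents for a net without BN.
without_bn = ["data", "conv1_w", "conv1_b", "fc_w", "fc_b"]

extra_with_bn = collect_bn_stat_blobs(with_bn)
extra_without_bn = collect_bn_stat_blobs(without_bn)

print(extra_with_bn)     # ['conv1_spatbn_rm', 'conv1_spatbn_riv']
print(extra_without_bn)  # []
```

Two caveats under these assumptions: any non-BN blob whose name happens to end in _rm or _riv would be picked up too, and calling the BN-aware save_net twice on the same model would append the same names to model.params twice.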