euclidean loss caffe2 python

Open oujieww opened this issue 6 years ago • 3 comments

I cannot find this function, so I use Sub(), Abs() and DotProduct() to compute the loss for N-D inputs:

diff = model.net.Sub([blob_out, 'label'], 'diff')
adiff = model.net.Abs([diff], 'adiff')
loss = model.net.DotProduct([adiff, adiff], 'loss')

But it does not work. Can anyone help me?

oujieww avatar Mar 07 '18 07:03 oujieww

If I understand you correctly, the following function should do the job:

https://caffe2.ai/docs/operators-catalogue.html#squaredl2distance

The following tutorial might also help:

https://github.com/caffe2/caffe2/blob/master/caffe2/python/tutorials/Toy_Regression.ipynb

I don't know what your blob_out and 'label' are, or which error message you got for your code. If you want to use your method, a bit more explanation would be necessary.

SteveCruz avatar Mar 07 '18 17:03 SteveCruz

@SteveCruz thanks for your information. I changed my code to:

model.net.Sub([blob_out, 'label'], 'diff')
model.net.Abs(['diff'], 'adiff')
model.net.DotProduct(['adiff', 'adiff'], 'dist')
model.net.AveragedLoss(['dist'], 'final_loss')

and it works, but I get a very big loss. If the loss stays this big, I want to try SquaredL2Distance.

I think it should be better than my own code. But in the docs, SquaredL2Distance is described as (X-Y)^2/2; I have looked at its code and think it is the same as Dot((X-Y), (X-Y)).

thank you very much!!!

oujieww avatar Mar 08 '18 08:03 oujieww
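
On the comparison above between the Sub/Abs/DotProduct chain and SquaredL2Distance: the following is a minimal plain-Python sketch (no Caffe2 required) of the two per-row formulas as they are described in this thread, assuming 2-D inputs where each row is one example. The helper names and the sample values are hypothetical and only illustrate the arithmetic; they are not Caffe2 APIs. Under these assumptions the two results differ only by the factor 1/2 from the documented (X-Y)^2/2 formula.

```python
# Plain-Python sketch of the two per-row distance formulas discussed above.
# All helper names are hypothetical; they only mirror the arithmetic.

def dot_of_abs_diff(x_row, y_row):
    """Sub -> Abs -> DotProduct on one row: sum(|x - y| * |x - y|)."""
    adiff = [abs(x - y) for x, y in zip(x_row, y_row)]
    return sum(a * a for a in adiff)

def squared_l2_distance(x_row, y_row):
    """Documented SquaredL2Distance formula on one row: sum((x - y)^2) / 2."""
    return sum((x - y) ** 2 for x, y in zip(x_row, y_row)) / 2.0

def averaged_loss(per_row_values):
    """AveragedLoss: mean of the per-example values."""
    return sum(per_row_values) / len(per_row_values)

# Made-up sample batch: 2 examples, 3 output values each.
predictions = [[1.0, 2.0, 3.0], [0.0, 0.0, 0.0]]
labels = [[1.0, 0.0, 1.0], [3.0, 4.0, 0.0]]

dot_rows = [dot_of_abs_diff(p, l) for p, l in zip(predictions, labels)]
sql2_rows = [squared_l2_distance(p, l) for p, l in zip(predictions, labels)]

dot_loss = averaged_loss(dot_rows)
sql2_loss = averaged_loss(sql2_rows)

print("DotProduct chain:", dot_rows, "->", dot_loss)
print("SquaredL2Distance:", sql2_rows, "->", sql2_loss)
```

Both variants sum the squared differences over all values in a row before averaging over the batch, so the loss grows with the number of output values and with the scale of the labels. That may be one reason for a large loss value, independent of which of the two variants is used.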

Hello @SteveCruz @oujieww ,

I would like to use the euclidean loss as the output of my CNN model. I am using the brew.db_input to feed the input layer. Should I proceed in the following way?

def add_input(self, model, batch_size, db, db_type, device_opts):
        with core.DeviceScope(device_opts):
            # load the data
            data_uint8, label = brew.db_input(
                model,
                blobs_out=["data_uint8", "label"],
                batch_size=batch_size,
                db=db,
                db_type=db_type,
            )
            # cast the data to float
            data = model.Cast(data_uint8, "data", to=core.DataType.FLOAT)

            # scale data from [0,255] down to [0,1]
            data = model.Scale(data, data, scale=float(1./256))

            # don't need the gradient for the backward pass
            data = model.StopGradient(data, data)

            dataset_size = int (lmdb.open(db).stat()['entries'])

            return data, label, dataset_size



data, label, train_dataset_size = self.add_input(train_model, batch_size=batch_size, db=os.path.join(self._data_dir_, 'train-nchw-lmdb'), db_type='lmdb', device_opts=device_opts)

predictions = self.create_model(train_model, data, label, device_opts=device_opts)

def create_model(self, model, data, label, device_opts):
    with core.DeviceScope(device_opts):

        data = data
        conv1_ = brew.conv(model, data, 'conv1_', dim_in=3, dim_out=96, kernel=11, stride=4)
        relu1_ = brew.relu(model, conv1_, conv1_)
        pool1_ = brew.max_pool(model, relu1_, 'pool1_', kernel=3, stride=2)
        conv2_ = brew.conv(model, pool1_, 'conv2_', dim_in=96, dim_out=256, kernel=5, stride=4)
        relu2_ = brew.relu(model, conv2_, conv2_)
        pool2_ = brew.max_pool(model, relu2_, 'pool2_', kernel=3, stride=2)
        conv3_ = brew.conv(model, pool2_, 'conv3_', dim_in=256, dim_out=384, kernel=3, stride=1)
        relu3_ = brew.relu(model, conv3_, conv3_)
        conv4_ = brew.conv(model, relu3_, 'conv4_', dim_in=384, dim_out=384, kernel=3, stride=1)
        relu4_ = brew.relu(model, conv4_, conv4_)
        conv5_ = brew.conv(model, relu4_, 'conv5_', dim_in=384, dim_out=256, kernel=3, stride=1)
        relu5_ = brew.relu(model, conv5_, conv5_)
        pool5_ = brew.max_pool(model, relu5_, 'pool5_', kernel=3, stride=2)
        fc5_ = brew.fc(model, pool5_, 'fc5_', dim_in=256 * 2 * 3, dim_out=4096)
        relu6_ = brew.relu(model, fc5_, fc5_)
        dropout6_ = brew.dropout(model, relu6_, 'dropout6_', ratio=0.5, is_test=False)
        fc6_ = brew.fc(model, dropout6_, 'fc6_', dim_in=4096, dim_out=4096)
        relu7_ = brew.relu(model, fc6_, fc6_)
        dropout7_ = brew.dropout(model, relu7_, 'dropout7_', ratio=0.5, is_test=False)
        fc7_ = brew.fc(model, dropout7_, 'fc7_', dim_in=4096, dim_out=256)
        relu8_ = brew.relu(model, fc7_, fc7_)
        dropout8_ = brew.dropout(model, relu8_, 'dropout8_', ratio=0.5, is_test=False)
        relu9_ = brew.relu(model, dropout8_, dropout8_)
        fc9_ = brew.fc(model, relu9_, 'fc9_', dim_in=256, dim_out=14)

        dist = model.net.SquaredL2Distance([label, fc9_], 'dist')
        predictions = dist.AveragedLoss([], ['predictions'])

        return predictions

CarlosYeverino avatar Nov 22 '18 23:11 CarlosYeverino