FCIS
What does the mask label look like?
The following code computes the mask loss, but I don't understand what the mask label looks like. I've read through the data-loading code and found that the mask labels are just binary masks containing only 0 and 1. Could you please help me make this clear?
import mxnet as mx
import numpy as np

class FCISMaskLossMetric(mx.metric.EvalMetric):
    def __init__(self, cfg):
        super(FCISMaskLossMetric, self).__init__('FCISMaskLoss')
        self.e2e = cfg.TRAIN.END2END
        self.pred, self.label = get_rcnn_names(cfg)
        self.cfg = cfg

    def update(self, labels, preds):
        # Despite its name, this output is used below as per-pixel
        # two-class probabilities (its log is taken further down).
        mask_loss = preds[self.pred.index('fcis_mask_loss')]
        if self.e2e:
            label = preds[self.pred.index('fcis_mask_label')]
        else:
            raise NotImplementedError
        mask_size = mask_loss.shape[2]
        # Flatten the label to one entry per pixel.
        label = label.asnumpy().astype('int32').reshape((-1))
        # Move the class axis last, then flatten to (num_pixels, 2).
        mask_loss = mx.nd.transpose(mask_loss.reshape((mask_loss.shape[0], mask_loss.shape[1], mask_size * mask_size)), axes=(0, 2, 1))
        mask_loss = mask_loss.reshape((label.shape[0], 2))
        mask_loss = mask_loss.asnumpy()
        # Pixels labelled -1 are excluded from the loss.
        keep_inds = np.where(label != -1)[0]
        label = label[keep_inds]
        # Pick the probability of the labelled class (0 or 1) for each pixel.
        cls = mask_loss[keep_inds, label]
        cls += 1e-14
        cls_loss = -1 * np.log(cls)
        cls_loss = np.sum(cls_loss)
        self.sum_metric += cls_loss
        self.num_inst += len(keep_inds)
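To make the label layout concrete, here is a minimal NumPy-only sketch of the same computation on toy data. It assumes what the metric code implies: the prediction holds two per-pixel class probabilities with shape (N, 2, H, W), and the label holds one integer per pixel, where 0 and 1 are the two mask classes and -1 marks pixels to skip. The function name `mask_loss_from_probs` and the toy arrays are hypothetical and not part of the FCIS repository.

```python
import numpy as np

def mask_loss_from_probs(probs, label):
    """Sum of per-pixel negative log-likelihoods, ignoring label == -1.

    probs: (N, 2, H, W) per-pixel class probabilities (assumed layout).
    label: (N, H, W) integers in {-1, 0, 1}.
    Returns (summed loss, number of pixels that contributed).
    """
    n, c, h, w = probs.shape
    flat_label = label.astype('int32').reshape(-1)
    # (N, 2, H*W) -> (N, H*W, 2) -> (N*H*W, 2), mirroring the metric above.
    flat_probs = probs.reshape(n, c, h * w).transpose(0, 2, 1).reshape(-1, c)
    keep = np.where(flat_label != -1)[0]
    picked = flat_probs[keep, flat_label[keep]] + 1e-14
    return -np.log(picked).sum(), len(keep)

# Toy input: the first ROI has a binary 0/1 mask, the second is all -1 (ignored).
label = np.array([[[1, 0], [0, 1]],
                  [[-1, -1], [-1, -1]]])
probs = np.full((2, 2, 2, 2), 0.5)  # uniform probabilities for both classes

loss_sum, count = mask_loss_from_probs(probs, label)
```

Under these assumptions only the four pixels of the first ROI contribute, so the label is a binary mask plus a -1 sentinel for anything excluded from the loss.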
I think a threshold is applied to obtain a binary mask, and the loss is then calculated against that mask.
You can read the code in ./lib/dataset/pascal_voc.py, where the images and masks are loaded, split by class/instance, and cached as .hkl files, and then an index database (sdsdb) is created.
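As a rough illustration of what splitting masks by instance can look like, the sketch below turns an instance-ID map into one binary 0/1 mask per instance. This is not the code in pascal_voc.py: the function `split_instances`, the assumption that ID 0 is background and 255 is a void label, and the toy array are all hypothetical.

```python
import numpy as np

def split_instances(inst_map, ignore_ids=(0, 255)):
    """Return {instance_id: binary mask} for every ID not in ignore_ids.

    inst_map: 2-D integer array where each pixel stores an instance ID.
    ignore_ids: IDs treated as background/void (an assumption of this sketch).
    """
    ids = [i for i in np.unique(inst_map) if i not in ignore_ids]
    # Each mask is 1 where the pixel belongs to the instance, 0 elsewhere.
    return {int(i): (inst_map == i).astype(np.uint8) for i in ids}

# Toy instance map with two instances (IDs 1 and 2) on a background of 0.
inst_map = np.array([[0, 1, 1],
                     [0, 2, 2],
                     [0, 0, 2]])

masks = split_instances(inst_map)
```

Each resulting mask contains only 0 and 1, which matches the binary masks described in the question.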