pytorch_tiramisu
                        Need help with adapting to different dataset
Hi @bfortuner ,
First of all thanks for the excellent implementation of the FCDenseNets.
I am trying to use your tiramisu implementation for a different dataset and could really use your help. In particular, I need insight into how this transform works:
import numpy as np
import torch

class LabelToLongTensor(object):
    def __call__(self, pic):
        if isinstance(pic, np.ndarray):
            # handle numpy array
            label = torch.from_numpy(pic).long()
        else:
            # handle PIL Image: raw bytes -> (H, W, 1) -> squeeze to (H, W)
            label = torch.ByteTensor(torch.ByteStorage.from_buffer(pic.tobytes()))
            label = label.view(pic.size[1], pic.size[0], 1)
            label = label.transpose(0, 1).transpose(0, 2).squeeze().contiguous().long()
        return label
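For a quick sanity check, the numpy branch above reduces to `torch.from_numpy(...).long()`; a minimal sketch with a hypothetical 2x2 label map of class ids shows that the result keeps the (H, W) shape with no channel dimension:

```python
import numpy as np
import torch

# Hypothetical 2x2 label map with class ids in 0..6, exercising the
# numpy branch of LabelToLongTensor.
pic = np.array([[0, 3], [6, 1]], dtype=np.uint8)
label = torch.from_numpy(pic).long()
print(label.shape, label.dtype)  # torch.Size([2, 2]) torch.int64
```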
This is making a 1x224x224 label tensor for a label image of size 224x224x3. Now I am unable to adapt this for my dataset. I have 7 classes and each label image is 224x224x3. Should my label tensor be 1x224x224 with each value between 0-6 or 1-7? The nll_loss2d expects the output to be 7x224x224 if I am correct.
Regarding the model:
In the tiramisu.py model there is a parameter for the number of classes. You can set this to your new N classes and it will create N channels in the final convolution. The model has a separate channel for each class and predicts per-pixel (depthwise) softmax probabilities, which you can train directly with cross entropy.
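To make the shape relationship concrete, here is a minimal sketch of just the final layer (not the repo's actual model; the feature depth of 16 is a hypothetical stand-in), showing how a 1x1 convolution with `n_classes` output channels yields one channel per class:

```python
import torch
import torch.nn as nn

n_classes = 7
# Hypothetical final layer: 16 input feature channels is an arbitrary choice
# here; the real model's last conv has whatever depth its dense blocks produce.
final_conv = nn.Conv2d(16, n_classes, kernel_size=1)

features = torch.randn(1, 16, 224, 224)
logits = final_conv(features)               # (1, 7, 224, 224): one channel per class
log_probs = nn.LogSoftmax(dim=1)(logits)    # per-pixel log-probabilities over classes
print(log_probs.shape)  # torch.Size([1, 7, 224, 224])
```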
Regarding the labels:
You need to provide an H x W label image with long/int values between 0 and N-1. LabelToLongTensor converts an image or numpy array into an (H, W) PyTorch tensor.
NLLLoss can handle 2d targets out of the box, so there is no need to flatten. http://pytorch.org/docs/master/nn.html#nllloss
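A small sketch of the 2d case: the loss takes (N, C, H, W) log-probabilities and an (N, H, W) target of class indices directly, with no flattening.

```python
import torch
import torch.nn as nn

n_classes, H, W = 7, 4, 4
# (N, C, H, W) per-pixel log-probabilities
log_probs = nn.LogSoftmax(dim=1)(torch.randn(2, n_classes, H, W))
# (N, H, W) target of class indices in 0..C-1
target = torch.randint(0, n_classes, (2, H, W))
loss = nn.NLLLoss()(log_probs, target)  # works directly on 2d targets
print(loss.item())
```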
Thanks for the prompt response. In my dataset the labels are in the form of RGB images. What would be the fastest way to encode them into H x W arrays with values between 0 and N-1 (or into long tensors of the required form)? I was doing it with numpy array functions but it was taking extremely long.
Here's what I was doing :
from PIL import Image
import numpy as np

Urban = [0, 255, 255]
Agricultural = [255, 255, 0]
Range = [255, 0, 255]
Forest = [0, 255, 0]
Water = [0, 0, 255]
Barren = [255, 255, 255]
Unknown = [0, 0, 0]
label_colours = np.array([Urban, Agricultural, Range, Forest, Water,
                          Barren, Unknown])

image = Image.open(img)
data = np.asarray(image, dtype="int32")

def index(pixel):
    # called once per (3,) RGB pixel by apply_along_axis, hence the slowness
    for idx, color in enumerate(label_colours):
        if (pixel == color).all():
            return idx
    return len(label_colours) - 1  # fall back to Unknown

labels = np.apply_along_axis(index, 2, data)
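The per-pixel Python loop is what makes this slow. One vectorized alternative (a sketch, reusing the palette above) is to do a single whole-image comparison per class, so numpy does N full-array operations instead of H*W Python calls:

```python
import numpy as np

# Palette copied from the colours defined earlier in the thread.
label_colours = np.array([
    [0, 255, 255],    # Urban
    [255, 255, 0],    # Agricultural
    [255, 0, 255],    # Range
    [0, 255, 0],      # Forest
    [0, 0, 255],      # Water
    [255, 255, 255],  # Barren
    [0, 0, 0],        # Unknown
])

def rgb_to_class(data):
    """Map an (H, W, 3) RGB array to an (H, W) array of class indices."""
    labels = np.zeros(data.shape[:2], dtype=np.int64)
    for idx, color in enumerate(label_colours):
        # one full-image comparison per class instead of one per pixel
        mask = (data == color).all(axis=-1)
        labels[mask] = idx
    return labels

# demo on a tiny 2x2 "image"
demo = np.array([[[0, 255, 255], [255, 255, 0]],
                 [[0, 0, 255], [0, 0, 0]]], dtype=np.int32)
print(rgb_to_class(demo))  # [[0 1]
                           #  [4 6]]
```

The resulting (H, W) int64 array can be handed straight to `torch.from_numpy(...)` and used as an NLLLoss target.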