crnn-pytorch
Output indicates "PAD" char for all columns
Thanks for the Code!
I am using the model on a custom dataset (the IAM dataset of handwritten text), so I wrote a custom Dataset class for train_loader that returns a 32 * 128 * 3 (H * W * D) image and a string. I also wrote a custom function that encodes each label as letter indices.
However, I notice that the output from the CRNN model is always a tensor full of zeroes (the PAD character in my letter dictionary).
Please help me out here! Thanks in advance!
`import os
from PIL import Image
from torch.utils.data import Dataset
from torchvision import transforms
class dataset(Dataset):
    def __init__(self, image_root, label_root, img_x, img_y):
        """Init function should not do any heavy lifting, but
        must initialize how many items are available in this data set.
        """
        self.images_path = image_root
        self.labels_path = label_root
        self.data_len = 0
        self.images = []
        # Read all label lines up front and close the file afterwards.
        with open(self.labels_path, "r") as f:
            self.labels = f.readlines()
        # Resize takes (height, width): img_x is the height, img_y the width.
        self.transform = transforms.Compose([
            transforms.Resize((img_x, img_y)),
            transforms.ToTensor(),
            transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])
        # A file named <a>-<b>-... is expected at <image_root><a>/<a>-<b>/<file>,
        # so image_root must end with a path separator.
        # Note: os.walk order is not guaranteed to match the label file order.
        for root, dirs, files in os.walk(self.images_path):
            for file in files:
                if file.endswith('.png'):
                    self.data_len += 1
                    temp = file.split("-")
                    self.images.append(self.images_path + temp[0] + '/' + temp[0] + "-" + temp[1] + "/" + file)
    def __len__(self):
        """Return the number of samples in the dataset."""
        return self.data_len
    def __getitem__(self, idx):
        """Return the (image, label) pair requested by idx.
        The PyTorch DataLoader class will use this method to make an iterable for
        our training or validation loop.
        """
        img = Image.open(self.images[idx]).convert('RGB')
        img = self.transform(img)
        # Strip only the trailing newline from the label line.
        return img, self.labels[idx].rstrip("\n")`
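One thing that may be worth checking in a dataset class like the one above: `os.walk` does not promise any particular file order, so `images[idx]` and `labels[idx]` only refer to the same sample if both lists happen to be ordered the same way. The following is a minimal sketch of a deterministic pairing. It assumes (this is not confirmed in the thread) that the label file has one label per line, ordered by sorted image path; the helper names `collect_png_paths` and `pair_images_with_labels` are hypothetical.

```python
import os


def collect_png_paths(image_root):
    """Return every .png path under image_root in a stable, sorted order."""
    paths = []
    for root, _dirs, files in os.walk(image_root):
        for name in files:
            if name.endswith(".png"):
                # Use the directory reported by os.walk instead of rebuilding it.
                paths.append(os.path.join(root, name))
    return sorted(paths)


def pair_images_with_labels(image_root, label_file):
    """Pair sorted image paths with label lines (ASSUMED to be in the same order)."""
    paths = collect_png_paths(image_root)
    with open(label_file, "r") as f:
        labels = [line.rstrip("\n") for line in f]
    if len(paths) != len(labels):
        # A count mismatch means the index-based pairing cannot be right.
        raise ValueError(
            "found %d images but %d labels" % (len(paths), len(labels))
        )
    return list(zip(paths, labels))
```

Printing a handful of the returned pairs and opening the images by hand is a cheap way to confirm that the labels really belong to the images before training.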
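`import torch
def word_rep(word, letter2index, max_out_chars, device = 'cpu'):
    """Encode word as a tensor of letter indices, truncated or padded to max_out_chars."""
    # Integer dtype, since the entries are indices into the letter dictionary.
    rep = torch.zeros(max_out_chars, dtype=torch.long).to(device)
    # The word does not fit together with a PAD: keep the first max_out_chars letters.
    if max_out_chars < len(word) + 1:
        for i in range(max_out_chars):
            rep[i] = letter2index[word[i]]
        return rep, max_out_chars
    for letter_index, letter in enumerate(word):
        rep[letter_index] = letter2index[letter]
    # pad_char is expected to be defined at module level.
    # Index by len(word) so that an empty word does not hit an undefined letter_index.
    rep[len(word)] = letter2index[pad_char]
    return rep, len(word)`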
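A possible reading of the all-zero output, offered as a guess and not a diagnosis: CRNN models are usually trained with a CTC loss, and `torch.nn.CTCLoss` uses index 0 as the blank by default. If the PAD character also sits at index 0 of the letter dictionary, a prediction of all zeroes is an all-blank prediction. That is a common state early in CTC training, and it can persist if the labels or label lengths fed to the loss are off. The sketch below uses plain Python lists to show an index encoding that keeps 0 free for the blank, plus a greedy CTC decode. The names `build_letter2index`, `encode_word`, and `greedy_ctc_decode` are hypothetical and are not part of the repository.

```python
BLANK = 0  # assumption: index 0 is reserved for the CTC blank


def build_letter2index(alphabet):
    """Map each character to 1..N, leaving index 0 free for the blank."""
    return {ch: i + 1 for i, ch in enumerate(alphabet)}


def encode_word(word, letter2index, max_out_chars):
    """Return (indices padded with 0 up to max_out_chars, unpadded length)."""
    word = word[:max_out_chars]  # truncate words that are too long
    indices = [letter2index[ch] for ch in word]
    length = len(indices)
    indices += [BLANK] * (max_out_chars - length)
    return indices, length


def greedy_ctc_decode(frame_indices, index2letter, blank=BLANK):
    """Collapse consecutive repeats, then drop blanks."""
    out = []
    prev = blank
    for idx in frame_indices:
        if idx != blank and idx != prev:
            out.append(index2letter[idx])
        prev = idx
    return "".join(out)


letter2index = build_letter2index("ab")
index2letter = {i: ch for ch, i in letter2index.items()}
targets, length = encode_word("ab", letter2index, 4)  # ([1, 2, 0, 0], 2)
decoded = greedy_ctc_decode([0, 1, 1, 0, 1, 2], index2letter)  # "aab"
all_blank = greedy_ctc_decode([0, 0, 0, 0], index2letter)  # ""
```

With this layout, the unpadded length returned by `encode_word` is what a CTC loss would take as the target length, so the padding zeroes are never treated as real characters.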
Hi, I have already written a demo.py for you to test with and a train.py for you to train with.
Hey @holmeyoung, I am using the train.py you've written, but I want to train the model on a custom dataset, so I made the changes above to the dataset class. It runs as expected, but the training isn't giving useful results.
Hi, I know you are training on a custom dataset, but you'd be better off converting your data to lmdb and training from that, since my code is built around it. There is a create_dataset.py in the tool folder.
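For readers who want a rough idea of what converting image/label pairs to lmdb involves, here is a minimal sketch. It is not the repository's create_dataset.py, whose exact interface is not shown in this thread. The key layout (`image-%09d`, `label-%09d`, `num-samples`) is an assumption borrowed from common CRNN data tooling, and `build_cache` and `write_cache` are hypothetical names. Writing requires the third-party `lmdb` package.

```python
def build_cache(samples):
    """Build the key/value pairs for a list of (image_bytes, label) samples.

    ASSUMED key layout: image-000000001, label-000000001, ..., num-samples.
    """
    cache = {}
    for i, (image_bytes, label) in enumerate(samples, start=1):
        cache[("image-%09d" % i).encode("utf-8")] = image_bytes
        cache[("label-%09d" % i).encode("utf-8")] = label.encode("utf-8")
    cache[b"num-samples"] = str(len(samples)).encode("utf-8")
    return cache


def write_cache(output_path, cache, map_size=1 << 30):
    """Write the cache into an lmdb environment at output_path."""
    import lmdb  # third-party; imported here so build_cache works without it

    env = lmdb.open(output_path, map_size=map_size)
    with env.begin(write=True) as txn:
        for key, value in cache.items():
            txn.put(key, value)
    env.close()
```

Here `image_bytes` would be the raw contents of an image file read in binary mode. Before training, it is worth checking the actual create_dataset.py in the tool folder for the keys and arguments it really uses.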