crnn.pytorch
Why self.alphabet = alphabet + '-'?
Hi friend, I have a question about crnn.pytorch/utils.py, line 25. Why is it written like this?
`self.alphabet = alphabet + '-'  # for -1 index`
I really could not understand why '-' is added at the tail of the alphabet. Looking forward to hearing from you.
@GitHubGS In text recognition, we use '-' to represent characters that are not in our alphabet. It is just like an object detection task, where we use num + 1 classes and the extra class represents the background.
@IEEE-FELLOW But self.alphabet is only used in the decode function:
```python
if raw:
    return ''.join([self.alphabet[i - 1] for i in t])
else:
    char_list = []
    for i in range(length):
        if t[i] != 0 and (not (i > 0 and t[i - 1] == t[i])):
            char_list.append(self.alphabet[t[i] - 1])
```
What is the maximum value of t[i] - 1? I think the maximum value of t[i] - 1 is len(alphabet), because the maximum value of t[i] is len(alphabet) + 1.
@GitHubGS When t[i] == 0, self.alphabet[0 - 1] is the last element of self.alphabet, which is the appended '-'. It marks a separator between characters. The added '-' only matters in decode(), and it won't appear in training.
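To make the negative-index point concrete, here is a minimal sketch modeled on the snippet quoted earlier in this thread. `TinyConverter`, the three-letter alphabet, and the sample sequence `t` are made up for illustration and are not the repository's actual class or data. It assumes index 0 is the separator and the real characters are numbered from 1.

```python
# Hypothetical stand-in for the converter discussed in this thread;
# only the decode logic quoted in the thread is reproduced here.
class TinyConverter:
    def __init__(self, alphabet):
        # Append '-' so that index -1 (i.e. 0 - 1) resolves to it.
        self.alphabet = alphabet + '-'

    def decode(self, t, raw=False):
        if raw:
            # Every index is shifted by 1; index 0 becomes -1, the trailing '-'.
            return ''.join(self.alphabet[i - 1] for i in t)
        chars = []
        for i in range(len(t)):
            # Skip separators (0) and immediate repeats of the previous index.
            if t[i] != 0 and not (i > 0 and t[i - 1] == t[i]):
                chars.append(self.alphabet[t[i] - 1])
        return ''.join(chars)


conv = TinyConverter('abc')      # assumed alphabet: 'a' -> 1, 'b' -> 2, 'c' -> 3
t = [1, 1, 0, 1, 2, 0, 3, 3]     # assumed per-frame output indices
raw_text = conv.decode(t, raw=True)
text = conv.decode(t)
print(raw_text)  # aa-ab-cc
print(text)      # aabc
```

In raw mode the '-' shows up wherever the index is 0. In the non-raw branch, zeros are skipped before indexing, so the appended '-' is never looked up there.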