
Why self.alphabet = alphabet + '-'?

Open gbolin opened this issue 5 years ago • 3 comments

Hi friend, I have a question about crnn.pytorch/utils.py, line 25. Why is it coded like this?

`self.alphabet = alphabet + '-'  # for -1 index`

I really cannot understand why '-' is appended to the end of the alphabet. Looking forward to hearing from you.

gbolin, May 21 '19 07:05

@GitHubGS In text recognition, we use '-' to represent characters that are not in our alphabet, just as in an object detection task we use num + 1 classes, where the extra 1 represents the background.

IEEE-FELLOW, May 28 '19 04:05

@IEEE-FELLOW But self.alphabet is only used in the decode function:

```
if raw:
    return ''.join([self.alphabet[i - 1] for i in t])
else:
    char_list = []
    for i in range(length):
        if t[i] != 0 and (not (i > 0 and t[i - 1] == t[i])):
            char_list.append(self.alphabet[t[i] - 1])
```

What is the maximum value of t[i] - 1? I think it is len(alphabet), because the maximum value of t[i] is len(alphabet) + 1.

gbolin, May 28 '19 05:05

@GitHubGS When t[i] == 0, self.alphabet[0 - 1] is the last character in the alphabet, which is the appended '-'. It acts as a separator between characters. The added '-' only matters in decode(), and it won't appear in training.
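
The indexing can be illustrated with a minimal sketch. It is not the repository's actual code: `LabelConverterSketch` is a hypothetical, simplified stand-in, and it assumes that label 0 is the blank and that labels 1..len(alphabet) map to the alphabet's characters. Under that assumption, `t[i] - 1` never exceeds `len(alphabet) - 1` for a real character, so the appended '-' is only reached through index -1.

```python
class LabelConverterSketch:
    """Hypothetical, simplified stand-in for the converter discussed above."""

    def __init__(self, alphabet):
        # Append '-' so that index -1 (reached when the label is 0) maps to it.
        self.alphabet = alphabet + '-'

    def decode(self, t, raw=False):
        if raw:
            # Label 0 -> self.alphabet[-1] == '-'; label k -> self.alphabet[k - 1].
            return ''.join(self.alphabet[i - 1] for i in t)
        chars = []
        for i in range(len(t)):
            # Skip blanks (label 0) and collapse consecutive repeated labels.
            if t[i] != 0 and not (i > 0 and t[i - 1] == t[i]):
                chars.append(self.alphabet[t[i] - 1])
        return ''.join(chars)


# Assumed label layout: 0 = blank, 1 = 'a', 2 = 'b', 3 = 'c'.
converter = LabelConverterSketch('abc')
labels = [1, 1, 0, 1, 2, 0, 0, 3]
raw_text = converter.decode(labels, raw=True)
clean_text = converter.decode(labels)
print(raw_text)    # aa-ab--c
print(clean_text)  # aabc
```

In raw mode every blank shows up as '-', which is why the extra character is needed there. In the non-raw branch blanks are skipped before indexing, so the '-' is never emitted.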

tabsun, Oct 17 '19 04:10