equilid
Outputs can contain ids larger than the size of rev_lang_vocab
When the output contains ids that are greater than or equal to the size of rev_lang_vocab, the lookup throws an index error on this line:
https://github.com/davidjurgens/equilid/blob/master/equilid/equilid.py#L667
As a result, the predictions list is empty, which leads to erroneous results.
On inspecting further, I find that the second axis of the output logits has size 40k. This means the output can contain indices between 0 and 39999, which is a problem when mapping them to labels, since rev_lang_vocab has fewer entries than that.
Further inspection shows that the default values of the char vocab size and the lang vocab size are both 40k. Is that expected?
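For illustration, here is a minimal sketch of a guarded lookup that avoids the index error. It assumes rev_lang_vocab is a plain Python list indexed by id. The function name `ids_to_labels`, the `"<unk>"` placeholder, and the toy vocabulary are hypothetical and not taken from equilid.py.

```python
def ids_to_labels(output_ids, rev_lang_vocab, unknown="<unk>"):
    """Map ids to labels, substituting a placeholder for out-of-range ids.

    Hypothetical helper: assumes rev_lang_vocab is a list indexed by id.
    """
    labels = []
    for i in output_ids:
        if 0 <= i < len(rev_lang_vocab):
            labels.append(rev_lang_vocab[i])
        else:
            # The id has no entry in the vocabulary; keep a placeholder
            # instead of raising IndexError and losing every prediction.
            labels.append(unknown)
    return labels


# Toy vocabulary, made up for this example.
rev_lang_vocab = ["_PAD", "_GO", "_EOS", "_UNK", "eng", "fra"]
print(ids_to_labels([4, 5, 39999], rev_lang_vocab))
```

This only keeps the predictions list from ending up empty; it does not explain why the model emits ids outside the vocabulary in the first place.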
https://github.com/davidjurgens/equilid/blob/master/equilid/equilid.py#L81
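One possible workaround, sketched below under assumptions, is to take the argmax only over the logit positions that have a label. The function `argmax_within_vocab` and the example values are hypothetical. Whether ignoring the remaining positions is appropriate depends on how the model was trained, which I have not verified.

```python
def argmax_within_vocab(logits, num_labels):
    """Return the highest-scoring id among the first num_labels positions.

    Hypothetical helper: logits is a flat list of scores for one output
    step, and num_labels would be len(rev_lang_vocab).
    """
    valid = logits[:num_labels]
    if not valid:
        raise ValueError("no valid label positions to choose from")
    # Positions at or beyond num_labels are ignored because they have
    # no entry in the label vocabulary.
    return max(range(len(valid)), key=valid.__getitem__)


# Made-up scores: position 3 has the highest score but no label.
logits = [0.1, 0.5, 0.2, 9.0]
best = argmax_within_vocab(logits, num_labels=3)
print(best)
```

If the lang vocab size default is meant to match the actual number of labels, setting it accordingly would presumably make this restriction unnecessary.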