ctcdecode
decode.cpp fatal error at vocab size assertion
Hello, I've been passing my arguments to decoder.decode as shown in test.py, with softmaxed probabilities of shape [batch_size, timesteps, classes] and the classes defined just like in the example. In my case I have 1232 classes including blank. The call fails with this fatal error:
[ctcdecode/src/ctc_beam_search_decoder.cpp:32] FATAL: "(probs_seq[i].size()) == (vocabulary.size())" check failed. The shape of probs_seq does not match with the shape of the vocabulary
My PyTorch version is 1.1.0, and for C++ libraries I have gcc 5.
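For anyone hitting the same assertion, the check in the error message can be reproduced on the Python side before calling the decoder. This is only a minimal sketch: check_vocab_matches is a hypothetical helper (not part of ctcdecode), and it assumes probs is indexable as [batch_size, timesteps, classes] and labels is the list passed to the decoder, blank included.

```python
def check_vocab_matches(probs, labels):
    """Raise ValueError if any frame's class count differs from len(labels).

    Hypothetical helper, not part of ctcdecode.
    probs:  nested sequence shaped [batch_size, timesteps, classes]
    labels: list of label strings, including the blank
    """
    expected = len(labels)
    for b, seq in enumerate(probs):
        for t, frame in enumerate(seq):
            if len(frame) != expected:
                raise ValueError(
                    "batch %d, timestep %d: %d classes in probs but %d labels"
                    % (b, t, len(frame), expected)
                )


# Tiny example: 3 labels (blank + 2 characters), 1 utterance, 2 timesteps.
labels = ["_", "a", "b"]
probs = [[[0.1, 0.6, 0.3], [0.8, 0.1, 0.1]]]
check_vocab_matches(probs, labels)  # passes silently, the sizes agree
```

With a PyTorch tensor the same comparison is simply probs.shape[-1] against len(labels); the loop above just reports which frame is off.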
Did you solve this? I'm having the exact same issue.
Did you solve this? I am having the exact same issue, probably with the exact same dataset.
Any updates?
Workaround: you could comment out the check in /ctcdecode/src/ctc_beam_search_decoder.cpp at lines 62-64 and then reinstall ctcdecode (pip install . from the repository root).
I don't know the implications, but since my vocab size is 32 and the probs' size is 34, it's not a huge concern for me. I would suggest this temporary workaround if your circumstances are similar to mine.
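A possible alternative that leaves the C++ check in place, offered as a sketch and not something tested in this thread: pad the label list with placeholder entries until its length equals the class dimension of the probabilities. pad_labels and the filler names are hypothetical, and the sketch assumes the extra classes sit at the end of the class dimension, which you would need to verify against your model's vocabulary.

```python
def pad_labels(labels, num_classes, filler="<unused{}>"):
    """Return a copy of labels extended with placeholders to num_classes entries.

    Hypothetical helper, not part of ctcdecode. Assumes the surplus
    classes are the last indices of the probability tensor.
    """
    missing = num_classes - len(labels)
    if missing < 0:
        raise ValueError(
            "labels has %d entries but probs has only %d classes"
            % (len(labels), num_classes)
        )
    return list(labels) + [filler.format(i) for i in range(missing)]


# Example with the sizes from the comment above: 32 labels, 34 classes.
labels = ["_"] + [chr(ord("a") + i) for i in range(26)] + list(" '.,?")
padded = pad_labels(labels, 34)
print(len(labels), len(padded))
```

The padded list would then be passed to the decoder in place of the original labels. If the placeholders show up in decoded output, the extra classes are not where this sketch assumes they are.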
Note: CTCDecode works fine with some models from Hugging Face, but not necessarily with all of them. I can't get it to work with my custom wav2vec2 model, but it works fine with m3hrdadfi/wav2vec2-large-xlsr-turkish.