bonito
Model output shape does not make sense.
I'm trying to make predictions on new data, but the output of my model does not make any sense:
For example, with a dummy model and data:
from bonito.util import load_symbol
import toml
import numpy as np
import torch
config_file = 'config/[email protected]'
configs = toml.load(config_file)
model = load_symbol(configs, "Model")(configs)
#inputs = np.load("inputs.npy")
inputs = torch.rand((50, 1, 5000))
output = model(inputs)
output.shape is torch.Size([1000, 50, 5120]). The third dimension should be 5, matching the label size for [email protected], and I'm not sure what is wrong.
Hi, as far as I understand it, the model doesn't return sequences but scores that need to be decoded. This should work:
scores = model(inputs)
seqs = model.decode_batch(scores)
Note that you may need to put the inputs on the same device as the model, e.g. cuda:0.
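A minimal sketch of the device handling, using plain PyTorch. The `Conv1d` layer here is only a hypothetical stand-in so the snippet runs without bonito; with the real model you would keep the `model = load_symbol(...)` line from the question and apply the same `.to(...)` calls.

```python
import torch

# Hypothetical stand-in for the bonito model; any torch.nn.Module behaves the same way here.
model = torch.nn.Conv1d(1, 8, kernel_size=5, stride=5)

# Use the GPU if one is available, otherwise fall back to the CPU.
device = "cuda:0" if torch.cuda.is_available() else "cpu"
model = model.to(device).eval()

inputs = torch.rand((50, 1, 5000))
# Move the inputs to wherever the model's parameters live.
inputs = inputs.to(next(model.parameters()).device)

# No gradients are needed for prediction.
with torch.no_grad():
    scores = model(inputs)

print(scores.shape, scores.device)
```

Reading the device from the model's parameters, rather than hard-coding cuda:0, keeps the snippet working on machines without a GPU.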
@lpryszcz is correct: CTC-CRF models (v3+) don't output a probability distribution over the alphabet.
More details: https://github.com/nanoporetech/bonito/issues/101#issuecomment-754611097
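For a rough sense of where the numbers in the reported shape might come from, here is a small arithmetic sketch. The values for `n_base`, `state_len` and `stride` are assumptions, not taken from this thread; check them against your own config file. Under those assumptions, the last dimension counts transition scores between k-mer states rather than per-base probabilities.

```python
# Assumed values (not stated in the thread): check them against your config.
n_base = 4       # A, C, G, T
state_len = 5    # length of the k-mer state used by the CRF
stride = 5       # downsampling factor of the encoder

# Each state can be followed by one of the bases or a blank.
n_states = n_base ** state_len
n_scores = n_states * (n_base + 1)

samples = 5000
batch = 50
time_steps = samples // stride

print((time_steps, batch, n_scores))
```

With these assumed values the tuple matches the (1000, 50, 5120) reported in the question, which is why the scores have to go through the decoding step before they look like sequences.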