question about 'bonito train'

Open Flower9618 opened this issue 3 years ago • 4 comments

Hello, I am trying to train a model on my own data, but training fails with "RuntimeError: target_lengths must be of size batch_size". A screenshot of the error is attached. Thank you very much!

Flower9618 Sep 07 '21 08:09

Can you load your training data and post the shapes?

>>> import os
>>> import numpy as np
>>> directory = "/data/training"
>>> chunks = np.load(os.path.join(directory, "chunks.npy"))
>>> targets = np.load(os.path.join(directory, "references.npy"))
>>> lengths = np.load(os.path.join(directory, "reference_lengths.npy"))
>>> chunks.shape, targets.shape, lengths.shape
((1000000, 3600), (1000000, 480), (1000000,))
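
The shapes above should agree on their first dimension, since each chunk needs one target row and one length. A minimal sketch of that check follows; the helper name `check_training_shapes` and the small synthetic arrays are assumptions made for illustration and are not part of bonito. Swap in the arrays loaded from your own directory.

```python
import numpy as np


def check_training_shapes(chunks, targets, lengths):
    """Return a list of problems; an empty list means the shapes agree.

    Hypothetical helper for illustration only, not a bonito function.
    """
    problems = []
    n = chunks.shape[0]
    # Every chunk needs exactly one row of target labels.
    if targets.shape[0] != n:
        problems.append(f"targets has {targets.shape[0]} rows, expected {n}")
    # lengths must be one-dimensional with one entry per chunk.
    if lengths.ndim != 1 or lengths.shape[0] != n:
        problems.append(f"lengths has shape {lengths.shape}, expected ({n},)")
    # No target length may exceed the padded target width.
    if targets.ndim == 2 and lengths.size and lengths.max() > targets.shape[1]:
        problems.append("a target length exceeds the padded target width")
    return problems


# Small synthetic arrays standing in for chunks.npy, references.npy
# and reference_lengths.npy (assumed sizes, chosen to mirror the example above).
chunks = np.zeros((8, 3600), dtype=np.float32)
targets = np.zeros((8, 480), dtype=np.int64)
lengths = np.full((8,), 100, dtype=np.int64)

print(check_training_shapes(chunks, targets, lengths))
```

If the list is empty, the saved arrays are consistent with each other, which suggests the mismatch arises later, when batches are assembled.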

iiSeymour Sep 08 '21 19:09

Thank you very much for your reply. To reproduce the error, I set the batch size to 4 and used '--multi-gpu'. Attached is the information about the data loaded by train_loader during training. By the way, I would like to know whether 'chunksize' affects training.

Flower9618 Sep 09 '21 00:09

Can you try without --multi-gpu? Yes, chunksize matters for training, and 100 samples is too short.

iiSeymour Sep 09 '21 09:09

Thank you very much. It works when I run without '--multi-gpu'. Why can't multiple GPUs be used?

Flower9618 Sep 09 '21 09:09