VALL-E-X
Inference: Batch size > 1
It appears that batch inference is not currently supported: if the batch size is anything other than 1, inference fails.
In `models/vallex.py`, `inference()` contains:

```python
assert y.shape[0] == 1, y.shape
```
This assertion disallows batch sizes other than 1 for the audio prompt `y`.
Further down:

```python
xy_pos = torch.concat([x, y_pos], dim=1)
```

This line assumes `x` and `y_pos` have the same batch size.
Here is the error you get when the batch size is 2 and `best_of` is 5:

```
RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 10 but got size 5 for tensor number 1 in the list.
```
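A minimal standalone sketch of what seems to be happening (the shapes below are my assumptions inferred from the error message: `x` appears to carry `batch_size * best_of = 10` sequences while `y_pos` carries only 5, so `torch.cat` along dim 1 fails because all other dimensions must match):

```python
import torch

# Hypothetical shapes reconstructed from the error, not the repo's actual values:
# (batch, seq_len, dim). x has batch 10, y_pos has batch 5.
x = torch.randn(10, 4, 8)
y_pos = torch.randn(5, 6, 8)

try:
    # Concatenating along dim 1 requires dims 0 and 2 to match; 10 != 5 in dim 0.
    xy_pos = torch.cat([x, y_pos], dim=1)
except RuntimeError as e:
    print(e)  # "Sizes of tensors must match except in dimension 1. ..."
```

If this reading is right, a fix would need `x` and `y_pos` to be expanded consistently (both to `batch_size * best_of`) before the concatenation, in addition to relaxing the `y.shape[0] == 1` assertion.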
Please advise.