ClariNet
Must the synthesis inputs be .npy files?
As far as I know, ClariNet is an end-to-end text-to-speech model, but this model accepts only .npy files as input. Can anyone synthesize speech from a sentence? Also, I wonder what ClariNet is for: can it do multi-speaker synthesis, or does it just make the synthesis results better?
Same question. The LJspeechDataset and DataLoader load only .npy data as inputs; they do not use the sentence text.
Same question here as well. Is there a way to use text as input?
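For anyone unsure what a .npy input is in practice, here is a minimal sketch of writing and reading one with NumPy. It is not this repo's preprocessing: the array shape (100 frames by 80 bins), the file name, and the helper names `save_features` and `load_features` are all assumptions made up for illustration, and the data is random rather than real acoustic features.

```python
import os
import tempfile

import numpy as np


def save_features(path, features):
    """Write a feature array to a .npy file (hypothetical helper)."""
    np.save(path, features)


def load_features(path):
    """Read a feature array back from a .npy file (hypothetical helper)."""
    return np.load(path)


# Assumed stand-in for precomputed features: 100 frames x 80 bins of random values.
dummy = np.random.rand(100, 80).astype(np.float32)

with tempfile.TemporaryDirectory() as tmp:
    # The file name is illustrative only.
    path = os.path.join(tmp, "example-feats.npy")
    save_features(path, dummy)
    loaded = load_features(path)

print(loaded.shape, loaded.dtype)
```

A .npy file stores only a numeric array, so a loader that reads just these files never sees the sentence text; the text would have to be turned into such an array by some earlier step.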