NPTEL2020-Indian-English-Speech-Dataset
Need ground-truth transcripts for the train, valid and test datasets
I have downloaded the files using the download script. However, apart from the audio itself, I cannot find the ground-truth transcripts for the audio files. The zipped directory contains only audio files. Am I missing some instructions or steps?
Hi @kafan1986
Thanks for your interest in this dataset. If I remember correctly, the dataset folder should consist of three primary folders: wav, txt and metadata. The wav folder contains the audio clips and the txt folder contains all the transcripts.
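One way to check an extracted copy against the layout described above is to compare the two folders directly. The following is only a rough sketch: the `wav` and `txt` folder names come from the description above, but the `.wav`/`.txt` extensions, the idea that a clip and its transcript share a base file name, and the function name `find_missing_transcripts` are assumptions made for illustration.

```python
from pathlib import Path


def find_missing_transcripts(dataset_root):
    """Return the sorted base names of audio clips that have no transcript.

    Assumes (not confirmed by the dataset docs) that wav/<name>.wav
    pairs with txt/<name>.txt.
    """
    root = Path(dataset_root)
    wav_dir = root / "wav"
    txt_dir = root / "txt"

    # Treat a missing folder as an empty one so the report still works.
    wav_stems = {p.stem for p in wav_dir.glob("*.wav")} if wav_dir.is_dir() else set()
    txt_stems = {p.stem for p in txt_dir.glob("*.txt")} if txt_dir.is_dir() else set()

    return sorted(wav_stems - txt_stems)
```

If the returned list contains every clip, the txt folder is probably missing altogether rather than a few files having been skipped.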
I think your download might be incomplete. Please check that every zip file was downloaded correctly, then extract them. Let us know if you still face issues after that.
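For anyone who wants to run that check mechanically, here is a minimal sketch using Python's standard `zipfile` module. It is not part of the dataset's download script; the function names `check_zip` and `check_directory` and the assumption that the archives sit in one folder with a `.zip` extension are hypothetical.

```python
import zipfile
from pathlib import Path


def check_zip(path):
    """Return "ok", or a short description of what is wrong with the archive."""
    path = Path(path)
    # is_zipfile() returns False when the zip signature cannot be found,
    # which is typical of a truncated download.
    if not zipfile.is_zipfile(path):
        return "not a valid zip (possibly truncated download)"
    try:
        with zipfile.ZipFile(path) as zf:
            # testzip() reads every member and returns the name of the
            # first one with a bad CRC or header, or None if all are fine.
            bad_member = zf.testzip()
    except zipfile.BadZipFile as exc:
        return f"corrupt archive: {exc}"
    if bad_member is not None:
        return f"corrupt member: {bad_member}"
    return "ok"


def check_directory(download_dir):
    """Map each *.zip file name in download_dir to its check_zip() result."""
    return {p.name: check_zip(p) for p in sorted(Path(download_dir).glob("*.zip"))}
```

Any archive that does not report "ok" is a candidate for re-downloading before extracting again.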
Hi @Prem-kumar27, could you please check whether the train data can currently be downloaded together with its texts? It seems that the train part does not include any texts.