ULCA-asr-dataset-corpus
Need information on provenance of this data.
Can you add information at the top of the README to explain how this repo has the rights to distribute this data under the CC Attribution license? Much of the material appears to be broadcast journalism, which is usually copyrighted.
I am from Oracle, and we would very much like to use this data to build speech recognition for Indian languages, but we must first verify that the data is appropriately sourced and licensed. If you could add a statement to the top of the README that explains how the data was gathered and how permission to distribute it was obtained, that would be extremely helpful.
Basically, our team will not be able to do anything with this data unless I can convince our legal reviewers that Open-Speech-EkStep has the right to distribute this data under that license. Thank you.
If you haven't received a response yet, try asking in the Vakyansh community on Gitter: https://gitter.im/Vakyansh/community?utm_source=share-link&utm_medium=link&utm_campaign=share-link#