python-string-similarity
Creating own dataset
Good night.
I want to create my own dataset with my own labels. Is that possible with this repository?
Thanks
Hi, yes, you can use your own dataset.
You need to extract features from the raw data. To do this, take a look at the vggish lib here.
You can also see how I did this on the fly.
Hope it will help.
Thanks!
But isn't it necessary to train a model with youtube 8M?
Short answer: yes. Two models are used here: vggish as the feature extractor and youtube8m as the classifier.
So if you want to use different labels, you need to extract features with vggish and then train the youtube8m model on those features.
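For anyone following along, here is a minimal sketch of how extracted features and custom labels might be packed for training. It assumes each clip has already been turned into quantized uint8 embeddings of shape (num_frames, 128), and that the training reader expects the context keys "id" and "labels" plus an "audio" feature list. Those key names, the 128-dimension layout, and the helper names build_label_map, make_sequence_example and write_tfrecord are assumptions for illustration, so check them against the reader in the training code you actually use.

```python
import numpy as np
import tensorflow as tf


def build_label_map(label_names):
    """Map each custom label name to an integer index (sorted for stability)."""
    return {name: idx for idx, name in enumerate(sorted(set(label_names)))}


def make_sequence_example(clip_id, label_ids, embeddings):
    """Pack one clip into a tf.train.SequenceExample.

    embeddings: array of shape (num_frames, 128) holding quantized uint8
    embeddings. The keys "id", "labels" and "audio" are assumptions here.
    """
    embeddings = np.asarray(embeddings, dtype=np.uint8)

    # Clip-level information: an identifier and the integer label indices.
    context = tf.train.Features(feature={
        "id": tf.train.Feature(
            bytes_list=tf.train.BytesList(value=[clip_id.encode("utf-8")])),
        "labels": tf.train.Feature(
            int64_list=tf.train.Int64List(value=list(label_ids))),
    })

    # Frame-level information: one bytes feature per embedding frame.
    frames = [
        tf.train.Feature(bytes_list=tf.train.BytesList(value=[frame.tobytes()]))
        for frame in embeddings
    ]
    feature_lists = tf.train.FeatureLists(feature_list={
        "audio": tf.train.FeatureList(feature=frames),
    })
    return tf.train.SequenceExample(context=context, feature_lists=feature_lists)


def write_tfrecord(path, examples):
    """Serialize SequenceExamples into a single .tfrecord file."""
    with tf.io.TFRecordWriter(path) as writer:
        for example in examples:
            writer.write(example.SerializeToString())
```

Usage would look roughly like this: build the label map once from all of your label names, create one example per clip with make_sequence_example, then pass the list to write_tfrecord. The classifier's number of output classes would also need to match the size of the label map.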
Hi,
I still have some questions about creating my own dataset.
The training script provided by youtube8m uses files with the .tfrecord extension. Do you know how to generate this format for audio?
Also, how do I add my custom labels when using the youtube8m training model? Many thanks.