voxceleb_enrichment_age_gender
The final dataframe does not contain VoxCeleb1 data?
Thank you for creating this dataset. When inspecting final_dataframe_extended.csv, I found that no VoxCeleb ID starts with id1xx, the prefix that indicates VoxCeleb1. All VoxCeleb IDs in that file start with id0xx, which is VoxCeleb2. A quick check of https://github.com/hechmik/voxceleb_enrichment_age_gender/blob/main/dataset/age-train.txt and https://github.com/hechmik/voxceleb_enrichment_age_gender/blob/main/dataset/age-test.txt shows the same thing.
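The check described above can be sketched roughly as follows. This is a minimal sketch, not the repository's own code: the helper names `read_ids` and `count_id_prefixes` are hypothetical, the column name in the usage comment is an assumption about the CSV header, and the sample IDs are illustrative values rather than rows from the dataset.

```python
import csv
from collections import Counter


def read_ids(csv_path, column):
    """Return the values of `column` from a CSV file as a list of strings."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        return [row[column] for row in csv.DictReader(f)]


def count_id_prefixes(ids, prefix_len=3):
    """Count IDs by their first `prefix_len` characters, e.g. 'id0' vs 'id1'."""
    return Counter(str(i).strip()[:prefix_len] for i in ids)


# Illustrative IDs only, not taken from the dataset files.
sample_ids = ["id00012", "id00015", "id10001"]
counts = count_id_prefixes(sample_ids)
print(counts)

# Against the real file (the column name "VoxCeleb_ID" is an assumption;
# replace it with whatever the CSV header actually uses):
# counts = count_id_prefixes(read_ids("final_dataframe_extended.csv", "VoxCeleb_ID"))
```

If the resulting counter has no `id1` key, the file contains no IDs with the VoxCeleb1 prefix.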
Is there a bug in the code that removes the VoxCeleb1 data? Or is there an overlap between VoxCeleb1 and VoxCeleb2 speakers, so that VoxCeleb1 samples are dropped when the dataframes are merged? Thank you.