voxceleb_enrichment_age_gender

The final dataframe does not contain VoxCeleb1 data?

Open gau-nernst opened this issue 9 months ago • 0 comments

Thank you for creating this dataset. When inspecting final_dataframe_extended.csv, I found that no VoxCeleb ID starts with id1xx, the prefix that indicates VoxCeleb1. All VoxCeleb IDs in that file start with id0xx, which is VoxCeleb2. A quick check of https://github.com/hechmik/voxceleb_enrichment_age_gender/blob/main/dataset/age-train.txt and https://github.com/hechmik/voxceleb_enrichment_age_gender/blob/main/dataset/age-test.txt shows the same thing.
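A check like this can be reproduced with a minimal sketch along the following lines. It counts speaker IDs by their first three characters, so a missing `id1` key means no VoxCeleb1-style IDs are present. The column name `VoxCeleb_ID` and both helper function names are assumptions made for illustration and may need adjusting to the actual CSV header.

```python
import csv
from collections import Counter
from typing import Iterable


def count_id_prefixes(rows: Iterable[dict], id_column: str = "VoxCeleb_ID") -> Counter:
    """Count speaker IDs by their first three characters (e.g. 'id0', 'id1').

    `id_column` is an assumed column name; change it to match the real header.
    Rows with a missing or empty ID are skipped.
    """
    return Counter(row[id_column][:3] for row in rows if row.get(id_column))


def count_prefixes_in_csv(path: str, id_column: str = "VoxCeleb_ID") -> Counter:
    """Read a CSV file with a header row and count ID prefixes in `id_column`."""
    with open(path, newline="", encoding="utf-8") as f:
        return count_id_prefixes(csv.DictReader(f), id_column)


if __name__ == "__main__":
    # Tiny in-memory example with made-up rows, not real dataset contents.
    sample = [
        {"VoxCeleb_ID": "id00012"},
        {"VoxCeleb_ID": "id00015"},
        {"VoxCeleb_ID": "id10001"},
    ]
    print(count_id_prefixes(sample))
```

To run it against the real file, call `count_prefixes_in_csv("final_dataframe_extended.csv")`. A `Counter` returns 0 for absent keys, so `counts["id1"] == 0` would confirm that no id1xx IDs are in the file.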

Is there a bug in the code that removes VoxCeleb1 data? Or is there an overlap between VoxCeleb1 and VoxCeleb2 speakers, so that VoxCeleb1 samples are dropped when the dataframes are merged? Thank you.

gau-nernst, Sep 20 '23 04:09