
Clarification Needed on Intra-dataset vs Cross-dataset Evaluation Metrics in Paper

Open chandlerbing65nm opened this issue 2 years ago • 1 comment

I have some questions regarding the evaluation metrics and results presented in Sections 4.4 and 4.5.

Intra-dataset Evaluation (Section 4.4)

The paper reports a very low EER of 0.19% on the WaveFake dataset using the RawNet2 model.

  • To confirm my understanding, was this evaluation performed with the model being trained and tested on the same WaveFake dataset?

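For context, EER is the operating point where the false-acceptance and false-rejection rates are equal, so a lower value means better detection. Below is a minimal sketch of how EER is typically computed from detector scores, assuming NumPy/scikit-learn and toy labels/scores for illustration; this is not the repository's actual evaluation code.

```python
import numpy as np
from sklearn.metrics import roc_curve

def compute_eer(labels, scores):
    """Equal Error Rate: the point where false-accept and false-reject rates are equal."""
    fpr, tpr, _ = roc_curve(labels, scores)  # labels: 1 = fake, 0 = real; scores: higher = more likely fake
    fnr = 1 - tpr
    idx = np.nanargmin(np.abs(fpr - fnr))    # threshold index where FPR and FNR cross
    return (fpr[idx] + fnr[idx]) / 2.0

# Toy example with made-up scores from a held-out test split
labels = np.array([0, 0, 1, 1, 1, 0])
scores = np.array([0.1, 0.4, 0.8, 0.9, 0.35, 0.2])
print(f"EER = {compute_eer(labels, scores) * 100:.2f}%")
```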

Cross-dataset Evaluation (Section 4.5)

On the other hand, the EER significantly increased to 26.95% when the model trained on the LibriSeVoc dataset was tested on the WaveFake dataset. This suggests poor generalization to unseen data.

  • Are there any ongoing efforts to improve this aspect of the model, perhaps through domain adaptation techniques or exposure to a more diverse set of vocoders during training?

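One common way to approximate "exposure to a more diverse set of vocoders" is a leave-one-vocoder-out protocol, where the model is trained on all but one vocoder and evaluated on the held-out one. A minimal sketch is below; the vocoder names are illustrative placeholders, not the exact training subsets used in the paper.

```python
# Hypothetical vocoder pool; names are illustrative, not the paper's exact splits.
VOCODERS = ["wavenet", "wavernn", "melgan", "parallel_wavegan", "waveglow", "diffwave"]

def leave_one_vocoder_out_splits(vocoders):
    """Yield (train_vocoders, held_out_vocoder) pairs so each vocoder is unseen exactly once."""
    for held_out in vocoders:
        train = [v for v in vocoders if v != held_out]
        yield train, held_out

for train, held_out in leave_one_vocoder_out_splits(VOCODERS):
    print(f"train on {train} -> test on unseen '{held_out}'")
```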

chandlerbing65nm avatar Oct 06 '23 03:10 chandlerbing65nm

Hi Chandler, thank you very much for the questions. For your first question: yes, we split WaveFake into separate train and test sets, so the model was trained and evaluated on disjoint partitions of the same dataset. For your second question, that is an excellent idea. We are currently working on it, and along the lines you suggest, we are trying to expose the model to a more diverse set of vocoders during training.

csun22 avatar Oct 06 '23 04:10 csun22