
Results for APCodec

Open redmist328 opened this issue 8 months ago • 2 comments

16 kHz, 2 kbps

parameter size:

encoder (including quantizer): 29 MB
decoder: 40 MB

exps/results.txt

Codec SUPERB application evaluation

Stage 1: Run speech emotion recognition. Acc: 74.93%

Stage 2: Run speaker related evaluation. Parsing the resyn_trial.txt for resyn wavs

Run speaker verification. EER: 3.02%

Stage 3: Run automatic speech recognition. WER: 4.74%

Stage 4: Run audio event classification. ACC: 55.25%

src/codec_metrics/exps/results.txt

Log results

File Name: crema_d.log Codec SUPERB objective metric evaluation on crema_d

Stage 1: Run SDR evaluation. SDR: mean score is: -2.618520825954788

Stage 2: Run Mel Spectrogram Loss. mel_loss: mean score is: 0.64869004

Stage 3: Run STOI. stoi: mean score is: 0.717766808809779

Stage 4: Run PESQ. pesq: mean score is: 1.5509950947761535

File Name: esc50.log Codec SUPERB objective metric evaluation on esc50

Stage 1: Run SDR evaluation. SDR: mean score is: -9.309950038168095

Stage 2: Run Mel Spectrogram Loss. mel_loss: mean score is: 2.002597

File Name: fluent_speech_commands.log Codec SUPERB objective metric evaluation on fluent_speech_commands

Stage 1: Run SDR evaluation. SDR: mean score is: 2.68255129531442

Stage 2: Run Mel Spectrogram Loss. mel_loss: mean score is: 0.87451327

Stage 3: Run STOI. stoi: mean score is: 0.8740794643709145

Stage 4: Run PESQ. pesq: mean score is: 2.1911674320697783

File Name: fsd50k.log Codec SUPERB objective metric evaluation on fsd50k

Stage 1: Run SDR evaluation. SDR: mean score is: -6.6539098549604345

Stage 2: Run Mel Spectrogram Loss. mel_loss: mean score is: 1.8435475

File Name: gunshot_triangulation.log Codec SUPERB objective metric evaluation on gunshot_triangulation

Stage 1: Run SDR evaluation. SDR: mean score is: -3.0264018525811536

Stage 2: Run Mel Spectrogram Loss. mel_loss: mean score is: 1.4431057

File Name: libri2Mix_test.log Codec SUPERB objective metric evaluation on libri2Mix_test

Stage 1: Run SDR evaluation. SDR: mean score is: -1.3850498167169416

Stage 2: Run Mel Spectrogram Loss. mel_loss: mean score is: 0.85650134

Stage 3: Run STOI. stoi: mean score is: 0.8534544293908012

Stage 4: Run PESQ. pesq: mean score is: 1.5768725705146789

File Name: librispeech.log Codec SUPERB objective metric evaluation on librispeech

Stage 1: Run SDR evaluation. SDR: mean score is: 2.5759249020219706

Stage 2: Run Mel Spectrogram Loss. mel_loss: mean score is: 0.8179612

Stage 3: Run STOI. stoi: mean score is: 0.8975456227011622

Stage 4: Run PESQ. pesq: mean score is: 2.2901515591144563

File Name: quesst.log Codec SUPERB objective metric evaluation on quesst

Stage 1: Run SDR evaluation. SDR: mean score is: -1.3464429268284184

Stage 2: Run Mel Spectrogram Loss. mel_loss: mean score is: 0.9656775

Stage 3: Run STOI. stoi: mean score is: 0.7968180258305204

Stage 4: Run PESQ. pesq: mean score is: 1.7317036986351013

File Name: snips_test_valid_subset.log Codec SUPERB objective metric evaluation on snips_test_valid_subset

Stage 1: Run SDR evaluation. SDR: mean score is: 4.364046016689939

Stage 2: Run Mel Spectrogram Loss. mel_loss: mean score is: 0.8910932

Stage 3: Run STOI. stoi: mean score is: 0.9133034388476792

Stage 4: Run PESQ. pesq: mean score is: 2.245469583272934

File Name: voxceleb1.log Codec SUPERB objective metric evaluation on voxceleb1

Stage 1: Run SDR evaluation. SDR: mean score is: 1.5015711204024194

Stage 2: Run Mel Spectrogram Loss. mel_loss: mean score is: 0.78175646

Stage 3: Run STOI. stoi: mean score is: 0.8577775240334691

Stage 4: Run PESQ. pesq: mean score is: 2.120602227449417

File Name: vox_lingua_top10.log Codec SUPERB objective metric evaluation on vox_lingua_top10

Stage 1: Run SDR evaluation. SDR: mean score is: -0.22438148479388495

Stage 2: Run Mel Spectrogram Loss. mel_loss: mean score is: 0.6584927

Stage 3: Run STOI. stoi: mean score is: 0.8339094697226409

Stage 4: Run PESQ. pesq: mean score is: 1.8127213382720948
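For anyone who wants to collect these numbers programmatically, here is a minimal sketch of pulling the per-metric means out of a log like the ones above. It assumes only the `<metric>: mean score is: <value>` wording visible in this issue; the function name `parse_scores` is hypothetical and not part of Codec-SUPERB.

```python
import re

# Matches fragments such as "SDR: mean score is: -2.618520825954788".
# The pattern is inferred from the log lines pasted in this issue.
SCORE_RE = re.compile(
    r"(\w+): mean score is: (-?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)"
)


def parse_scores(log_text: str) -> dict[str, float]:
    """Return {metric_name: mean_score} for every score found in log_text.

    Metric names are lower-cased so that "SDR" and "sdr" map to one key.
    """
    return {name.lower(): float(value) for name, value in SCORE_RE.findall(log_text)}


if __name__ == "__main__":
    sample = (
        "Stage 1: Run SDR evaluation. SDR: mean score is: -2.618520825954788\n"
        "Stage 2: Run Mel Spectrogram Loss. mel_loss: mean score is: 0.64869004\n"
    )
    print(parse_scores(sample))
```

Logs without STOI or PESQ stages (esc50, fsd50k, gunshot_triangulation here) simply yield a dictionary with fewer keys.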

Average SDR for speech datasets: 0.6937122850168396
Average Mel_Loss for speech datasets: 0.8118357137500001
Average STOI for speech datasets: 0.8430818479633708
Average PESQ for speech datasets: 1.9399604380130768
Average SDR for audio datasets: -6.330087248569893
Average Mel_Loss for audio datasets: 1.7630834000000002
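As a sanity check, the averages can be recomputed from the per-dataset means in the logs. The sketch below assumes the averages are unweighted means over datasets, and that the "speech" group is the eight datasets whose logs report STOI and PESQ while the "audio" group is esc50, fsd50k and gunshot_triangulation. That grouping is inferred from the logs in this issue, not taken from the Codec-SUPERB code.

```python
from statistics import fmean

# Per-dataset mean scores copied from the logs in this issue.
# Grouping into speech/audio is an assumption inferred from which logs
# include STOI and PESQ stages.
SPEECH = {
    "crema_d": {"sdr": -2.618520825954788, "mel_loss": 0.64869004,
                "stoi": 0.717766808809779, "pesq": 1.5509950947761535},
    "fluent_speech_commands": {"sdr": 2.68255129531442, "mel_loss": 0.87451327,
                               "stoi": 0.8740794643709145, "pesq": 2.1911674320697783},
    "libri2Mix_test": {"sdr": -1.3850498167169416, "mel_loss": 0.85650134,
                       "stoi": 0.8534544293908012, "pesq": 1.5768725705146789},
    "librispeech": {"sdr": 2.5759249020219706, "mel_loss": 0.8179612,
                    "stoi": 0.8975456227011622, "pesq": 2.2901515591144563},
    "quesst": {"sdr": -1.3464429268284184, "mel_loss": 0.9656775,
               "stoi": 0.7968180258305204, "pesq": 1.7317036986351013},
    "snips_test_valid_subset": {"sdr": 4.364046016689939, "mel_loss": 0.8910932,
                                "stoi": 0.9133034388476792, "pesq": 2.245469583272934},
    "voxceleb1": {"sdr": 1.5015711204024194, "mel_loss": 0.78175646,
                  "stoi": 0.8577775240334691, "pesq": 2.120602227449417},
    "vox_lingua_top10": {"sdr": -0.22438148479388495, "mel_loss": 0.6584927,
                         "stoi": 0.8339094697226409, "pesq": 1.8127213382720948},
}
AUDIO = {
    "esc50": {"sdr": -9.309950038168095, "mel_loss": 2.002597},
    "fsd50k": {"sdr": -6.6539098549604345, "mel_loss": 1.8435475},
    "gunshot_triangulation": {"sdr": -3.0264018525811536, "mel_loss": 1.4431057},
}


def average(group: dict, metric: str) -> float:
    """Unweighted mean of one metric across all datasets in a group."""
    return fmean(scores[metric] for scores in group.values())


if __name__ == "__main__":
    for metric in ("sdr", "mel_loss", "stoi", "pesq"):
        print(f"speech {metric}: {average(SPEECH, metric):.6f}")
    for metric in ("sdr", "mel_loss"):
        print(f"audio {metric}: {average(AUDIO, metric):.6f}")
```

Under those assumptions the recomputed values should agree with the averages reported in this issue up to floating-point rounding.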

redmist328, Jun 19 '24 15:06