Tracking the progress in end-to-end speech translation

End-to-End Speech Translation Progress

| Corpus | Direction | Target | Duration | License |
| --- | --- | --- | --- | --- |
| CoVoST 2 | {Fr, De, Es, Ca, It, Ru, Zh, Pt, Fa, Et, Mn, Nl, Tr, Ar, Sv, Lv, Sl, Ta, Ja, Id, Cy} -> En and En -> {De, Ca, Zh, Fa, Et, Mn, Tr, Ar, Sv, Lv, Sl, Ta, Ja, Id, Cy} | Text | 2880h | CC0 |
| CVSS | {Fr, De, Es, Ca, It, Ru, Zh, Pt, Fa, Et, Mn, Nl, Tr, Ar, Sv, Lv, Sl, Ta, Ja, Id, Cy} -> En | Text & Speech | 1900h | CC BY 4.0 |
| mTEDx | {Es, Fr, Pt, It, Ru, El} -> En, {Fr, Pt, It} -> Es, Es -> {Fr, It}, {Es, Fr} -> Pt | Text | 765h | CC BY-NC-ND 4.0 |
| CoVoST | {Fr, De, Nl, Ru, Es, It, Tr, Fa, Sv, Mn, Zh} -> En | Text | 700h | CC0 |
| MUST-C & MUST-Cinema | En -> {De, Es, Fr, It, Nl, Pt, Ro, Ru, Ar, Cs, Fa, Tr, Vi, Zh} | Text | 504h | CC BY-NC-ND 4.0 |
| How2 | En -> Pt | Text | 300h | YouTube & CC BY-SA 4.0 |
| Augmented LibriSpeech | En -> Fr | Text | 236h | CC BY 4.0 |
| Europarl-ST | {En, Fr, De, Es, It, Pt, Pl, Ro, Nl} -> {En, Fr, De, Es, It, Pt, Pl, Ro, Nl} | Text | 280h | CC BY-NC 4.0 |
| Kosp2e | Ko -> En | Text | 198h | Mixed CC |
| Fisher + Callhome | Es -> En | Text | 160h+20h | LDC |
| MaSS | parallel among En, Es, Eu, Fi, Fr, Hu, Ro and Ru | Text & Speech | 172h | Bible.is |
| LibriVoxDeEn | De -> En | Text | 110h | CC BY-NC-SA 4.0 |
| Prabhupadavani | parallel among En, Fr, De, Gu, Hi, Hu, Id, It, Lv, Lt, Ne, Fa, Pl, Pt, Ru, Sl, Sk, Es, Se, Ta, Te, Tr, Bg, Hr, Da and Nl | Text | 94h | |
| BSTC | Zh -> En | Text | 68h | |
| LibriS2S | De <-> En | Text & Speech | 52h/57h | CC BY-NC-SA 4.0 |

Changhan Wang ([email protected])