
Tedlium3 number of word pieces

Open AlexandderGorodetski opened this issue 2 years ago • 1 comments

Hello,

I see that the Tedlium3 recipe uses 500 word pieces. Did you try other values (e.g. 1000 or 2000)? Does the number of word pieces affect the decoding results?

Thanks, AlexG.

AlexandderGorodetski avatar Dec 11 '22 14:12 AlexandderGorodetski

We have tried 5000, 1000, and 500 on the librispeech dataset and found that 500 performs the best.

Tedlium3 uses the same config as librispeech. We have not tried any BPE vocab size other than 500 on tedlium3.

csukuangfj avatar Dec 12 '22 00:12 csukuangfj
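
For anyone who wants a feel for what the vocab size changes before running a full experiment, here is a minimal sketch of a toy BPE trainer in pure Python. It is not the recipe's tokenizer and says nothing about WER; the function names (`train_bpe`, `encode`, `avg_pieces`) and the tiny word list are made up for illustration. It only shows that a larger vocab means more merges and therefore fewer pieces per word.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with the joined symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def train_bpe(words, vocab_size):
    """Toy BPE: start from characters, merge the most frequent pair until
    the symbol vocabulary reaches `vocab_size` or nothing is left to merge."""
    corpus = Counter(tuple(w) for w in words)
    vocab = {ch for symbols in corpus for ch in symbols}
    merges = []
    while len(vocab) < vocab_size:
        pairs = Counter()
        for symbols, freq in corpus.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break  # every word is already a single symbol
        # Tie-break on the pair itself so the result is deterministic.
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        vocab.add(best[0] + best[1])
        merged = Counter()
        for symbols, freq in corpus.items():
            merged[merge_pair(symbols, best)] += freq
        corpus = merged
    return merges


def encode(word, merges):
    """Split a word into pieces by applying the learned merges in order."""
    symbols = tuple(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return symbols


def avg_pieces(words, merges):
    """Average number of pieces per word under the given merges."""
    return sum(len(encode(w, merges)) for w in words) / len(words)


if __name__ == "__main__":
    # Hypothetical toy data; a real comparison would use the training transcripts.
    words = ["low", "lower", "lowest", "newer", "newest", "wider"]
    for size in (10, 12, 14, 20):
        merges = train_bpe(words, size)
        print(size, len(merges), round(avg_pieces(words, merges), 2))
```

With a real corpus the same trade-off applies: fewer, longer pieces per utterance with a large vocab, but each piece is seen less often in training. Whether 500, 1000, or 2000 gives the best decoding results on tedlium3 would still need to be measured by retraining.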