QuartzNet-ASR-pytorch
Why use separable conv in C1 and C2 instead of a normal conv1d?
Thank you for sharing your great work.
I noticed you used sepconv_bn in C1 and C2 instead of conv_bn_act.
Is this on purpose? Does it give better results?
https://github.com/Kirili4ik/QuartzNet-ASR-pytorch/blob/ec6073ef76d1ce0419bc62065ec746cb12a63efc/model.py#L49
Hi, separable convolutions are a trick described in the QuartzNet paper. In short, they use fewer parameters while achieving roughly the same results, which makes the model smaller and faster for on-device inference.
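For reference, here is a minimal sketch of the idea (with made-up channel and kernel sizes, not the repo's exact sepconv_bn): a depthwise conv followed by a pointwise 1x1 conv replaces one full conv1d, which is where the parameter savings come from.

```python
import torch.nn as nn

# Example sizes only (not taken from the repo's config)
in_ch, out_ch, kernel = 256, 256, 33

# A regular conv1d mixes all channels with a wide kernel
full = nn.Conv1d(in_ch, out_ch, kernel, padding=kernel // 2)

# Depthwise-separable version: per-channel (depthwise) conv + 1x1 (pointwise) conv
separable = nn.Sequential(
    nn.Conv1d(in_ch, in_ch, kernel, padding=kernel // 2, groups=in_ch),  # depthwise
    nn.Conv1d(in_ch, out_ch, kernel_size=1),                             # pointwise
)

def n_params(m):
    return sum(p.numel() for p in m.parameters())

print(n_params(full))       # ~2.16M parameters
print(n_params(separable))  # ~0.07M parameters
```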
I see. 👍 I thought they only used separable conv in the B blocks. Thanks for the fast reply.
As far as I remember, the paper is somewhat unclear about which blocks use separable convolutions. But we tried to reproduce the paper fully, and the total number of parameters of the model is known. If I remember correctly, we used separable convolutions everywhere to match the parameter count reported in the paper, and it worked.
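In case it helps, this is the kind of check meant here (the import path and class name are placeholders, not necessarily what this repo uses): build the model and compare its parameter count against the figure the paper reports for the corresponding configuration.

```python
# Hypothetical check: count the model's parameters and compare with the paper.
# `QuartzNet` and `model` below are placeholder names for this repo's model class.
from model import QuartzNet  # assumed import path

model = QuartzNet()
total = sum(p.numel() for p in model.parameters())
print(f"total parameters: {total / 1e6:.2f}M")
```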