Huahuan Zheng
The original Conformer paper reports the number of parameters of each model. However, with the implementation in this repo, the parameter counts are slightly different: ``` Conformer small: 10.16 M Conformer...
Please have a look at https://github.com/webdataset/webdataset/blob/682b30ee484d719a954554654d2d6baa213f9371/webdataset/compat.py#L96-L108. When the input `urls` is a string like `data-{000..123}.tar`, it seems that wds appends both the nodesplitter and the workersplitter twice, so the yielded data is...
The **compact layout** in memory can be explained with this figure. The input to `rnnt_loss()` is of size `(N, T, U+1, V)=(3, 6, 6, V)` in a normal layout. Colored...
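As a rough illustration of why a compact layout saves memory, the sketch below counts rows in both layouts. It assumes that the compact layout stores only the valid `T_i x (U_i + 1)` positions of each sample back to back, dropping the padding. The per-sample lengths are hypothetical: the text only gives `N=3`, `T=6`, `U+1=6`.

```python
from itertools import accumulate

# Hypothetical per-sample lengths (assumed, not from the original figure).
frame_lengths = [6, 4, 3]  # T_i: number of frames in each sample
label_lengths = [5, 3, 2]  # U_i: number of labels in each sample

N = len(frame_lengths)
T_max = max(frame_lengths)        # 6, matches T in the normal layout
U1_max = max(label_lengths) + 1   # 6, matches U+1 in the normal layout

# Normal (padded) layout: every sample occupies T_max * (U_max + 1) rows of size V.
padded_rows = N * T_max * U1_max

# Assumed compact layout: each sample occupies only T_i * (U_i + 1) rows.
rows_per_sample = [t * (u + 1) for t, u in zip(frame_lengths, label_lengths)]
compact_rows = sum(rows_per_sample)

# Starting row of each sample inside the compact buffer.
sample_offsets = [0, *accumulate(rows_per_sample)][:-1]

print(padded_rows, compact_rows, sample_offsets)
```

With these assumed lengths, the padded layout holds 108 rows while the compact one holds 61; the actual saving depends on how much the lengths vary within a batch.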
The following warning messages are occasionally emitted during training: ``` ... WARNING: sample 10 [81, 25] has a forward/backward mismatch -0.000083 / -0.000083 ... WARNING: sample 11 [62, 28] has a...
Hi @kpu, I ran into a weird issue: training the n-gram model on a relatively small corpus was OK, but it raised a baddiscount error on a larger corpus. 1. Training the...
### 🐛 Describe the bug Directly loading `.mp3` audio with `torchaudio.sox_effects.apply_effects_file` fails: ```python import torchaudio file = "clips/common_voice_id_25649986.mp3" effects = [['speed', '0.9'], ['rate', '48000']] torchaudio.sox_effects.apply_effects_file(file, effects) # output: #...
In the current implementation, the warps along the T axis are computed in a fully serialized manner: https://github.com/1ytic/warp-rnnt/blob/edd5857cd9abf29f12ab3fbc153f78f21191d80b/core.cu#L112-L134 The for loop of each warp is executed one by one, which means the ith warp at...
In the current implementation, the emissions and the predictions each subtract their own maximum value. But consider this case: ``` emission[0, 0] = [0, -1000] prediction[0, 0] = [-1000, 0] -> #...
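To make the concern concrete, here is a minimal sketch in plain Python. It assumes the joint score is the element-wise sum of emission and prediction and that it is normalized with a log-sum-exp. The helper names `lse_separate_max` and `lse_joint_max` are hypothetical and not taken from the library. The sketch compares shifting each vector by its own maximum with shifting by the maximum of the sum.

```python
import math

emission = [0.0, -1000.0]
prediction = [-1000.0, 0.0]


def lse_separate_max(e, p):
    """Log-sum-exp of e + p, shifting e and p by their own maxima (assumed scheme)."""
    e_max, p_max = max(e), max(p)
    # exp(-1000) underflows to 0.0, so each product below becomes 0.0.
    total = sum(math.exp(ei - e_max) * math.exp(pi - p_max) for ei, pi in zip(e, p))
    return e_max + p_max + (math.log(total) if total > 0.0 else -math.inf)


def lse_joint_max(e, p):
    """Log-sum-exp of e + p, shifting by the maximum of the summed scores."""
    s = [ei + pi for ei, pi in zip(e, p)]
    s_max = max(s)
    return s_max + math.log(sum(math.exp(si - s_max) for si in s))


print(lse_separate_max(emission, prediction))  # underflows to -inf
print(lse_joint_max(emission, prediction))     # finite: -1000 + log(2)
```

Under these assumptions, the separate shift loses the normalizer entirely, while the joint shift keeps it finite.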
Say the original data is `A=[tensor([1., 2., 3.]), tensor([2.]), tensor([1., 2., 3., 4.])]`. For the convenience of some other operations, I concatenate the data into `A_cat = tensor([1., 2., 3., 2.,...
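The concatenated layout described here can be sketched with plain Python lists, assuming the per-sequence lengths are kept alongside the flat data so the original pieces can be recovered. The helper `unpack` is a hypothetical name. With PyTorch tensors, `torch.cat` and `torch.split` play the same roles.

```python
from itertools import accumulate

# Ragged data, mirroring the values of A in the text.
A = [[1.0, 2.0, 3.0], [2.0], [1.0, 2.0, 3.0, 4.0]]

# Flatten into one contiguous sequence and remember each length.
A_cat = [x for seq in A for x in seq]
lengths = [len(seq) for seq in A]

# offsets[i] is where sequence i starts in A_cat; the last entry is the total length.
offsets = [0, *accumulate(lengths)]


def unpack(flat, offsets):
    """Split the flat sequence back into the original ragged pieces."""
    return [flat[start:end] for start, end in zip(offsets, offsets[1:])]


print(A_cat)
print(unpack(A_cat, offsets))
```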