SincNet Is the offset-by-one error on purpose?

Open qwfy opened this issue 5 years ago • 0 comments

According to this line https://github.com/mravanelli/SincNet/blob/d64244991324f96d77add11dc86939a7a81ae14d/compute_d_vector.py#L215

When wlen = 200 sample points and wshift = 10 sample points (I'm aware that the 200 and the 10 refer to milliseconds in the paper), an audio signal of length 210 sample points produces a tensor whose first dimension is int((210 - 200) / 10) == 1. However, this signal can produce two examples: the range [0, 200) is the first one, and the range [10, 210) is the second one.

compute_d_vector.py discards the second one. Is this on purpose, or is it an offset-by-one error?
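To make the arithmetic concrete, here is a minimal sketch comparing the two ways of counting frames. The function names are hypothetical and are not taken from compute_d_vector.py; the first function only reproduces the expression quoted above, and the second assumes that every window fitting entirely inside the signal should count.

```python
def count_frames_quoted(n_samples, wlen, wshift):
    """Frame count using the expression quoted above: int((n - wlen) / wshift)."""
    return int((n_samples - wlen) / wshift)


def count_frames_inclusive(n_samples, wlen, wshift):
    """Frame count that includes the last window that still fits (assumed intent)."""
    if n_samples < wlen:
        return 0
    return (n_samples - wlen) // wshift + 1


def frame_ranges(n_samples, wlen, wshift):
    """Half-open [start, end) ranges of every full window in the signal."""
    return [(start, start + wlen)
            for start in range(0, n_samples - wlen + 1, wshift)]


print(count_frames_quoted(210, 200, 10))     # 1
print(count_frames_inclusive(210, 200, 10))  # 2
print(frame_ranges(210, 200, 10))            # [(0, 200), (10, 210)]
```

With a signal of exactly wlen samples, the quoted expression yields 0 frames, while the inclusive count yields 1.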

I'm asking because I observed that the "paper version" yields a slightly lower mean when different audio files in data_lists/TIMIT_test.scp are compared using cosine similarity.

        two examples version   paper version (one example)
mean    0.74994516             0.7498444
std     0.08081242             0.080853514
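For reference, a comparison of this kind could be sketched as below. This is only an illustration under assumptions: it takes one d-vector per audio file as a plain list of floats, compares every unordered pair, and reports the population standard deviation. The helper names are hypothetical, and the exact procedure behind the numbers above may differ.

```python
import math
from itertools import combinations
from statistics import mean, pstdev


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def pairwise_similarity_stats(vectors):
    """Mean and population std of cosine similarity over all unordered pairs."""
    sims = [cosine_similarity(a, b) for a, b in combinations(vectors, 2)]
    return mean(sims), pstdev(sims)


# Toy d-vectors (made up for illustration, not real SincNet output).
toy_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
print(pairwise_similarity_stats(toy_vectors))
```

Running the same statistics once on d-vectors computed with the extra frame and once without it would reproduce the kind of side-by-side comparison shown in the table.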

Aug 29 '19 15:08 qwfy
