nanosnap icon indicating copy to clipboard operation
nanosnap copied to clipboard

errors

Open dts350z opened this issue 4 years ago • 8 comments

In order to compile with Visual Studio 2019 test_audio.cc lines 13 and 42 needed to be changed from size_t to uint64_t

Then, when running the test:

... [124] = 0.227935 [125] = 0.427108 [126] = 0.818015 [127] = 0.860731 input len = 128 nframes = 128 Assertion failed: fft_size == (nframes - 1), file D:\Glenn\Downloads\nanosnap-master\nanosnap-master\src\fft.cc, line 162

dts350z avatar Oct 11 '20 19:10 dts350z

Assertion triggered is TODO

https://github.com/lighttransport/nanosnap/blob/371aad8390be249bc52f52669d9558d0cc0fd147/src/fft.cc#L161

So we need to implement fft_size != (nframes - 1) case or disable corresponding unit test for a while.

syoyo avatar Oct 12 '20 06:10 syoyo

@syoyo I think there might be another bug in the function stft(). Here is how rfft() is called:

bool ret = rfft(y_modulated.data(), /* frame len */nframes, /* num frames */frame_length, /* fft size */nframes, output);

It uses nframes for both fft_size and nframes. Therefore, stft() always fails because of the assertion.

Could you help me to fix this issue?

amantalion avatar Mar 03 '21 22:03 amantalion

@amantalion Thanks for the info! The code is grabbed from librosa. Need to verify corresponding librosa python code.

https://librosa.org/doc/0.8.0/generated/librosa.stft.html

We may need to add implementation for fft_size != (nframes-1) situation in rfft

syoyo avatar Mar 04 '21 06:03 syoyo

Thank you for your quick reply, @syoyo . Do you know how to add the support for that by any chance? I don't understand the algorithm completely myself, but I get the general idea.

amantalion avatar Mar 07 '21 03:03 amantalion

@amantalion You don't need to understand algorithm(I also didn't understand the code of numpy rfft completely)

What we did/you can do is

  • Convert numpy.rfft source code to C++: https://numpy.org/doc/stable/reference/generated/numpy.fft.rfft.html

Then check if the C++ code is correct by comparing numpy's result.

This can be done by following:

  • prepare test data using https://github.com/lighttransport/nanosnap/tree/master/tests/gen
  • add and run unit test https://github.com/lighttransport/nanosnap/tree/master/tests

Then test if stft work well.

syoyo avatar Mar 07 '21 06:03 syoyo

@syoyo Thank you so much for your quick reply. I will follow the steps that you describe to fix this issue.

amantalion avatar Mar 07 '21 23:03 amantalion

Also, the test succeeds if I delete the fft_size != (nframes-1) assertion from rfft. However, I notice that the reconstructed audio after STFT + ISTFT has some buzzing/distortions (sounds like a person is wheezing if I try to run this on human speech) that I don't get with np.rfft. I deleted the normalization by the sum of squares of the Hanning window and the distortions are still there.

Do you know by any chance what could cause distortions like that?

Thank you for all your pointers they are extremely helpful. Also, your library is the only one that works properly in C++ on the entire Github! So would like to thank you for all your efforts in implementing it.

amantalion avatar Mar 07 '21 23:03 amantalion

@amantalion Hmm, so we need to verify the correctness of C++ stft and istft implementation. We may work on it at some point, but no guaranteed ETA. If you can contribute to implement and validate stft, rfft, we can set a bounty program(compensation will be ~1K USD or equivalent BTC or ETH)

syoyo avatar Mar 08 '21 06:03 syoyo