nanosnap
nanosnap copied to clipboard
errors
In order to compile with Visual Studio 2019 test_audio.cc lines 13 and 42 needed to be changed from size_t to uint64_t
Then, when running the test:
... [124] = 0.227935 [125] = 0.427108 [126] = 0.818015 [127] = 0.860731 input len = 128 nframes = 128 Assertion failed: fft_size == (nframes - 1), file D:\Glenn\Downloads\nanosnap-master\nanosnap-master\src\fft.cc, line 162
Assertion triggered is TODO
https://github.com/lighttransport/nanosnap/blob/371aad8390be249bc52f52669d9558d0cc0fd147/src/fft.cc#L161
So we need to implement fft_size != (nframes - 1)
case or disable corresponding unit test for a while.
@syoyo I think there might be another bug in the function stft()
. Here is how rfft()
is called:
bool ret = rfft(y_modulated.data(), /* frame len */nframes, /* num frames */frame_length, /* fft size */nframes, output);
It uses nframes
for both fft_size
and nframes
. Therefore, stft()
always fails because of the assertion.
Could you help me to fix this issue?
@amantalion Thanks for the info! The code is grabbed from librosa. Need to verify corresponding librosa python code.
https://librosa.org/doc/0.8.0/generated/librosa.stft.html
We may need to add implementation for fft_size != (nframes-1)
situation in rfft
Thank you for your quick reply, @syoyo . Do you know how to add the support for that by any chance? I don't understand the algorithm completely myself, but I get the general idea.
@amantalion You don't need to understand algorithm(I also didn't understand the code of numpy rfft completely)
What we did/you can do is
- Convert numpy.rfft source code to C++: https://numpy.org/doc/stable/reference/generated/numpy.fft.rfft.html
Then check if the C++ code is correct by comparing numpy's result.
This can be done by following:
- prepare test data using https://github.com/lighttransport/nanosnap/tree/master/tests/gen
- add and run unit test https://github.com/lighttransport/nanosnap/tree/master/tests
Then test if stft
work well.
@syoyo Thank you so much for your quick reply. I will follow the steps that you describe to fix this issue.
Also, the test succeeds if I delete the fft_size != (nframes-1)
assertion from rfft
. However, I notice that the reconstructed audio after STFT + ISTFT
has some buzzing/distortions (sounds like a person is wheezing if I try to run this on human speech) that I don't get with np.rfft
. I deleted the normalization by the sum of squares of the Hanning window and the distortions are still there.
Do you know by any chance what could cause distortions like that?
Thank you for all your pointers they are extremely helpful. Also, your library is the only one that works properly in C++ on the entire Github! So would like to thank you for all your efforts in implementing it.
@amantalion Hmm, so we need to verify the correctness of C++ stft
and istft
implementation. We may work on it at some point, but no guaranteed ETA. If you can contribute to implement and validate stft
, rfft
, we can set a bounty program(compensation will be ~1K USD or equivalent BTC or ETH)