q icon indicating copy to clipboard operation
q copied to clipboard

Floating point assert in zero_crossing::peak_pulse()

Open christofmuc opened this issue 5 years ago • 8 comments

Hi, loving your work! I gave the pitchDetection a spin, and ran into a most strange issue that might have nothing to do with your code but might be CPU floating point bug.

The debug version asserts after a few hours in the std::max() function, in a piece of code that shouldn't assert. I am using Visual Studio 2017.

This is what I see:

SourceCode asserting with "invalid comperator":

template<class _Ty>
_Post_equal_to_(_Left < _Right ? _Right : _Left)
_NODISCARD constexpr const _Ty& (max)(const _Ty& _Left, const _Ty& _Right)
_NOEXCEPT_COND(_NOEXCEPT_OPER(_Left < _Right))
{	// return larger of _Left and _Right
    if (_Left < _Right)
    {
        _STL_ASSERT(!(_Right < _Left), "invalid comparator");
        return (_Right);
    }
return (_Left);
}

Float values displayed in Debugger:

_Left	5.04023665e-05	const float &
_Right	0.000307894428	const float &

32 bit Int cast of them: (int)&_Left 944989985 int (int)&_Right 966880484 int

Cleary there is no reason for the CPU to calculate _Left is smaller than _Right and _Right is smaller than left.

Have you encountered any of this? I post it here because this is where it happened, but it looks more like a compiler or CPU bug. The assembler looks good, it does nothing else than

00007FF7D063F5A5  mov         rax,qword ptr [_Left]  
00007FF7D063F5AA  mov         rcx,qword ptr [_Right]  
00007FF7D063F5AF  movss       xmm0,dword ptr [rax]  
00007FF7D063F5B3  comiss      xmm0,dword ptr [rcx]  
00007FF7D063F5B6  ja          std::max<float>+3Ah (07FF7D063F5BAh)  
00007FF7D063F5B8  jmp         std::max<float>+0A5h (07FF7D063F625h)  

Where I'd say the comiss can't be that bad.

CPU: i9-9900K

I tried googling this but didn't succeed.

christofmuc avatar Oct 06 '19 12:10 christofmuc

Oh, for sake of completeness: Visual Studio 15.9.7

christofmuc avatar Oct 06 '19 12:10 christofmuc

Thanks!

Hrmmm... that is interesting. Is it repeatable? Do you have a self-contained example that exhibits this behavior? Do you have the complete function call trace?

Also, would you have a way to try this using other compilers (VS2019? g++? Clang?).

djowel avatar Oct 08 '19 00:10 djowel

It hasn't happened again, so it is certainly not repeatable. I tried with an isolated code block and the values above, but of course that works as expected. The function trace I had, it was pretty obvious in the inside loop calling predict_frequency(). I will keep monitoring and provide more information should it hit again.

christofmuc avatar Oct 15 '19 07:10 christofmuc

I fear that it might be something else. Could you outline the steps you did that led you to this? What code exactly were you running for hours on end? Maybe I can run some analytics such as google sanitizers ASAN and friends on that.

djowel avatar Oct 15 '19 21:10 djowel

It's the JammerNetz client I am working on: https://github.com/christofmuc/JammerNetz

The code is in the Tuner.cpp file, to get the whole app up and running is a bit more involved even if I tried to create a thorough readme. I am running a long time test right now, last time it ran about 3 hours (we do internet jam sessions with that piece of software, and I added your lib to be able to tune my analog synthesizers during the session ;-))

christofmuc avatar Oct 15 '19 22:10 christofmuc

Interesting. Is there a way for you to run some analytics such as google sanitizers ASAN and friends? Also, if you can provide even a partial call trace (i.e. which function is calling which until the call to max, esp. the code in the Q-lib), that would be helpful.

djowel avatar Oct 16 '19 08:10 djowel

I don't know the ASAN tools, and probably don't have time right now though I'll take a note of these.

The system survived a 12 hour stress test over night without that happening again, so it has been a very rare incident so far.

The call trace is: std::max() zero_crossing::peak_pulse() predict_period() predict_frequency() Tuner::TunerImpl::get_pitch()

Thank you very much for your interest!

christofmuc avatar Oct 16 '19 09:10 christofmuc

Let's keep this open. At least for the time being. Perhaps there will be some more info coming. The error is indeed quite odd. I haven't encountered such a thing.

djowel avatar Oct 16 '19 23:10 djowel

Can confirm this did not happen again even after hours and hours of piping audio through it in online jams over the last years.

This must have been a Neutrino hitting the FPU or I don't know what :-)

Thanks again for this wonderful library!

christofmuc avatar Jun 03 '23 11:06 christofmuc

Can confirm this did not happen again even after hours and hours of piping audio through it in online jams over the last years.

This must have been a Neutrino hitting the FPU or I don't know what :-)

Thanks again for this wonderful library!

Wonderful! Feel free to close this now.

djowel avatar Jun 03 '23 13:06 djowel

It's closed!

christofmuc avatar Jun 03 '23 13:06 christofmuc