terminal icon indicating copy to clipboard operation
terminal copied to clipboard

INTEGER_DIVIDE_BY_ZERO c0000094 OpenConsole.exe!Microsoft::Console::VirtualTerminal::AdaptDispatch::_DoLineFeed

Open zadjii-msft opened this issue 1 year ago • 18 comments

MSFT:44310564

PILES of hits in v1.18.2822.0. That's not great.

My notes internally say that I thought #15298 would fix this, but that's clearly wrong. ~12% of our hits in the last week, and a huge number more (more that 20%) now that 1.18 is rolling out more broadly

vast majority of stacks:

  • 100.00% OpenConsole.exe!Microsoft::Console::VirtualTerminal::AdaptDispatch::_DoLineFeed
    • 92.20% OpenConsole.exe!Microsoft::Console::VirtualTerminal::AdaptDispatch::LineFeed
      • 100.00% OpenConsole.exe!Microsoft::Console::VirtualTerminal::OutputStateMachineEngine::ActionExecute
        • 100.00% OpenConsole.exe!Microsoft::Console::VirtualTerminal::StateMachine::_SafeExecute_`Microsoft::Console::VirtualTerminal::StateMachine::_ActionExecute'::`2'::_lambda_1_ _
          • 98.94% OpenConsole.exe!Microsoft::Console::VirtualTerminal::StateMachine::ProcessString
            • 100.00% OpenConsole.exe!DoWriteConsole
              • 100.00% OpenConsole.exe!WriteConsoleWImplHelper

zadjii-msft avatar Oct 19 '23 14:10 zadjii-msft

Blame points at L96 specifically:

https://github.com/microsoft/terminal/blob/32b62207037a18fd7720e29e67df4fe46684ff2c/src/buffer/out/textBuffer.cpp#L93-L98

Presumably, the buffer is empty? dump wasn't clear.

(that's the 1.18 branch. 1.19 re-wrote a lot of this.)

zadjii-msft avatar Oct 19 '23 14:10 zadjii-msft

Amazingly enough, this definitely is the same failure stack as #15245. Same bucket entirely.

I mean, that makes sense I guess - I fixed something else and assumed that would make this crash impossible, but here we are, in this crash again.

zadjii-msft avatar Oct 19 '23 14:10 zadjii-msft

An empty buffer would be somewhat surprising: https://github.com/microsoft/terminal/blob/32b62207037a18fd7720e29e67df4fe46684ff2c/src/buffer/out/textBuffer.cpp#L42-L44

lhecker avatar Oct 19 '23 15:10 lhecker

Oh I see, if it's like the thing that #15298 intended to address, then we're looking at an already deallocated text buffer, due to which the size is 0.

lhecker avatar Oct 19 '23 15:10 lhecker

From the dump in #16204:

  • _textBuffer in DoWriteConsole is 0x000001fdb0a09220
  • but by the time we get up in AdaptDispatch::LineFeed, the textBuffer is 0x000001fdb0a41ba0.
    • Hmm, no, that can't be right....
  • In OutputStateMachineEngine::ActionExecute, this->_dispatch->_api->_io->_textBuffer is 0x000001fdb0a09220
  • Similarly, in AdaptDispatch::LineFeed, this->_api->_io->_textBuffer is 0x000001fdb0a09220...

is there some way that we changed the text buffer between the top of AdaptDispatch::_DoLineFeed (starting L2331) and the textBuffer.GetRowByOffset(newPosition.y).Reset(eraseAttributes); call on L2392? It'd have to be in the _api.SetViewportPosition({ viewport.left, viewport.top + 1 });, right?

oh weird cause AdaptDispatch::_DoLineFeed's textBuffer._size is 120x230. That's a yikes.

zadjii-msft avatar Oct 26 '23 15:10 zadjii-msft

@zadjii-msft I'm not sure if you were there, but in a previous meeting I speculated that it happens when we swap the alt with the main buffer, since that's code I recently touched with the text-reflow PR. Specifically: https://github.com/microsoft/terminal/commit/74748394c17c168843b511dd837268445e5dfd6c#diff-f629ab416d62d00e938348c017c456e3e7b180348a417b875e1a7b37329787f9 But I don't think that code is incorrect, so it might be another change we recently did.

lhecker avatar Oct 26 '23 16:10 lhecker

Is there anyway to workaround this crash? My users are complaining that my program constantly crash on their machines (due to this bug).

lone-wolf-akela avatar Nov 12 '23 08:11 lone-wolf-akela

@lone-wolf-akela I suspect if your users start the Terminal first, then run your app from an already running Terminal, this shouldn't affect them.

However, we haven't ever really gotten a repro for this incarnation of this bug. Could you share your app with us? If you've got a solid repro, it'll make confirming that we've fixed this all that much easier ☺️

zadjii-msft avatar Nov 13 '23 14:11 zadjii-msft

@lone-wolf-akela I suspect if your users start the Terminal first, then run your app from an already running Terminal, this shouldn't affect them.

However, we haven't ever really gotten a repro for this incarnation of this bug. Could you share your app with us? If you've got a solid repro, it'll make confirming that we've fixed this all that much easier ☺️

In our case running it inside Terminal at all caused this. We ended up telling our users to change the default terminal in Windows. Far as reproduction our game already has a program that grants full code access. I could talk to my boss about getting someone on the team access or I could email a built copy of the server which is what currently crashes Terminal.

thetestgame avatar Nov 14 '23 06:11 thetestgame

@lone-wolf-akela I suspect if your users start the Terminal first, then run your app from an already running Terminal, this shouldn't affect them.

However, we haven't ever really gotten a repro for this incarnation of this bug. Could you share your app with us? If you've got a solid repro, it'll make confirming that we've fixed this all that much easier ☺️

https://drive.google.com/file/d/17yniBAh4kM5oxVTOTwM-LAUy0RaUpCuL/view?usp=drive_link Here is our code together with a compiled exe. Windows terminal will crash at the show9CCN function in patcher.cpp

lone-wolf-akela avatar Nov 14 '23 12:11 lone-wolf-akela

Hi Mike, I e-mailed you an example that I hope will make repro easy. Thanks for looking at it!

dragongrace avatar Nov 14 '23 21:11 dragongrace

I can repro this at will with start Terminal with a Powershell tab and run get-childitem. Crash.

mkoehlerstb avatar Nov 14 '23 23:11 mkoehlerstb

Looks like this is the same thing reported over in #16212 and #16319, which is also seemingly fixed in 1.19. Weird.

There definitely aren't any crash hits for this in 1.19, so that's good at least.

@lhecker The sample repro in 16212 doesn't have any alt buffer hijinks either.

I'm not sure what minimal commit we could use to fix this, however.

(I'd bet this is also secretly #5094 but that's a can of worms to solve)

zadjii-msft avatar Nov 17 '23 11:11 zadjii-msft

notes:

  • Started from main @ 376737e54.
  • git bisect bad 32b6220 ->
    Bisecting: a merge base must be tested
    [63644507dac3ffebff0232d5d0bba30e164096b6] Prevent flickering in nushell due to FTCS marks (#14677)
    
  • that was bad
  • The merge base 63644507dac3ffebff0232d5d0bba30e164096b6 is bad.
    This means the bug has been fixed between 63644507dac3ffebff0232d5d0bba30e164096b6 and [376737e54afc301ced37d5416bc836eadbc611d2].
    
  • starting again from 376737e54afc301ced37d5416bc836eadbc611d2
  • 63644507dac3ffebff0232d5d0bba30e164096b6

zadjii-msft avatar Nov 20 '23 17:11 zadjii-msft

literally a day of bisecting later:

612b00cd445470fa85d2f71dc94930f03ebe4823 is the first new commit

As I suspected, that doesn't really look like something we can atomically service

zadjii-msft avatar Nov 21 '23 20:11 zadjii-msft

Well, I guess "it was fixed by rewriting the text buffer" isn't particularly comforting 😆

Maybe there's a quick patch we can throw at it?

DHowett avatar Nov 21 '23 22:11 DHowett

const ROW& TextBuffer::GetRowByOffset(const til::CoordType index) const noexcept
{
    // Rows are stored circularly, so the index you ask for is offset by the start position and mod the total of rows.
    const auto offsetIndex = gsl::narrow_cast<size_t>(_firstRow + index) % _storage.size();
    return til::at(_storage, offsetIndex);
}

vs.

ROW& TextBuffer::GetRowByOffset(const til::CoordType index)
{
    // Rows are stored circularly, so the index you ask for is offset by the start position and mod the total of rows.
    auto offset = (_firstRow + index) % _height;

    // Support negative wrap around. This way an index of -1 will
    // wrap to _rowCount-1 and make implementing scrolling easier.
    if (offset < 0)
    {
        offset += _height;
    }

    // We add 1 to the row offset, because row "0" is the one returned by GetScratchpadRow().
    return _getRowByOffsetDirect(gsl::narrow_cast<size_t>(offset) + 1);
}

Could it be that the index parameter is negative? 🤔 (In C/C++ % on negative numbers returns negative numbers.)

lhecker avatar Nov 21 '23 23:11 lhecker

You know, you might not be wrong about the idea there's a quick patch we could do here. My theory was always that we were operating on an old buffer, that's already been discarded (hence the weird height=0). But the above commit wouldn't really fix that necessarily, but it does change the way the math gets done. Maybe the discarded buffer thing isn't actually all that breaking, and the new math is resilient to that case.

zadjii-msft avatar Dec 04 '23 15:12 zadjii-msft

For folks following this thread: Sorry we weren't able to figure out a patch for the 1.18 stable releases. At this point, 1.19 moved to the stable release, with the re-written text buffer code that should avoid this crash entirely. If you're still hitting this, I'd recommend updating to 1.19 ☺️

zadjii-msft avatar Feb 08 '24 11:02 zadjii-msft