STL icon indicating copy to clipboard operation
STL copied to clipboard

<fstream>: ifstream putback is still broken

Open fsb4000 opened this issue 5 years ago • 1 comments
trafficstars

Describe the bug The behaviour of std::basic_ifstream::putback is incorrect when putback is used over buffer boundaries. In particular, it appears that if one does a get() – peek() – putback() sequence, where the peek() causes a new buffer to be read from the file, the following putback() overwrites the first character in the buffer (i.e., the character that peek() returned) with the character you put back. Moreover, the pointers managing the buffer appear to be put into in an inconsistent state.

Command-line test case

C:\Temp>type main.cpp
#include <iostream>
#include <fstream>

int main()
{
  std::ofstream out("temp.bin", std::ios_base::binary);
  for (char i = 0; i < 16; i++)
  {
    out.put(i);
  }
  out.close();

  std::ifstream in("temp.bin", std::ios_base::binary);
  char buffer[8];
  in.rdbuf()->pubsetbuf(buffer, sizeof(buffer));

  for (char i = 0; i < 8; i++)
  {
    if (i != in.get())
    {
      std::cout << "incorrect character" << std::endl;
    }
  }
  std::cout << "pos: " << in.tellg() << std::endl;
  std::cout << "peek: " << in.peek() << std::endl;
  std::cout << "pos: " << in.tellg() << std::endl;
  in.putback(7);
  std::cout << "putback(7) " << std::endl;
  std::cout << "good?: " << in.good() << std::endl;
  std::cout << "pos: " << in.tellg() << std::endl;
  std::cout << "peek: " << in.peek() << std::endl;
  std::cout << "pos: " << in.tellg() << std::endl;
  std::cout << "get: " << in.get() << std::endl;
  std::cout << "pos: " << in.tellg() << std::endl;
  std::cout << "get: " << in.get() << std::endl;
  std::cout << "pos: " << in.tellg() << std::endl;
}
C:\Temp>cl /EHsc /W4 /WX main.cpp
Оптимизирующий компилятор Microsoft (R) C/C++ версии 19.28.29115 для x64
(C) Корпорация Майкрософт (Microsoft Corporation).  Все права защищены.

main.cpp
Microsoft (R) Incremental Linker Version 14.28.29115.0
Copyright (C) Microsoft Corporation.  All rights reserved.

/out:main.exe
main.obj

C:\Temp>main.exe
pos: 8
peek: 8
pos: 8
putback(7)
good?: 1
pos: 14
peek: 8
pos: 6
get: 8
pos: 7
get: 9
pos: 8

Expected behavior if the stream is not able to achieve the putback is should report a failure. Valid outputs are therefore either:

pos: 8
peek: 8
pos: 8
putback(7) 
good?: 0
pos: -1
peek: -1
pos: -1
get: -1
pos: -1
get: -1
pos: -1

Or if putback succeeds:

pos: 8
peek: 8
pos: 8
putback(7) 
good?: 1
pos: 7
peek: 7
pos: 7
get: 7
pos: 8
get: 8
pos: 9

STL version Microsoft Visual Studio Community 2019 Preview Version 16.8.0 Preview 1.0

Additional context DevCom-858136 | DevCom-246367 Microsoft-internal: VSO-275515 / AB#275515 | VSO-425342 / AB#425342

fsb4000 avatar Aug 09 '20 09:08 fsb4000

@amyw-msft's analysis in Microsoft-internal VSO-275515, edited for clarity:

So there's three things here.

One is that the standard behavior for tellg() is actually to discard ungetc() changes. tellg() must call seekoff(0). seekoff must call fseek. fseek always discards any putback characters. The customer expects to be able to put back a character, then view the position change, which you can't do.

Two is peek actually already calls ungetc, so the second ungetc fails and uses the filebuf putback buffer instead. Our seekoff call is a bit nonstandard as we call fseek(-1) in order to simulate removing a character that we put in the putback buffer, but this is inconsistent with how the C putback buffer works - the putback operation should instead just be discarded (this is fixed in WCFB02).

Three is that the putback machinery is very wrong. Instead of no longer pointing inside the file and pointing to the putback buffer, instead _Set_back changes the internal pointers in the FILE * by pointing those away from its internal data and to the putback buffer. When we call tellg(), this calls fgetpos before we restore the internal FILE * structure to its old state. Since the FILE *'s "remaining count" should be 8 but is instead 1, it thinks we are 7 characters further in the FILE than we are and reports 14. I believe we can fix this by calling _Reset_back prior to the fgetpos call. Looks like this wasn't fixed in WCFB02, but I agree we should just scrap it - it's really complicated and doesn't provide a benefit.

StephanTLavavej avatar Aug 10 '20 22:08 StephanTLavavej