libStatGen
libStatGen copied to clipboard
Incorrect behavior with ifeof() when file length matches default buffer size
I believe this library is deprecated, but wanted to leave this info for future reference in case someone else runs into this problem.
It seems as though ifeof() does not return true when the last line in the file fills the buffer exactly (to length DEFAULT_BUFFER_SIZE).
If you read through a file using String::readLine():
https://github.com/statgen/libStatGen/blob/fae4fca874b3b78bf9b61c0eae080c15edd976a4/general/StringBasics.cpp#L744-L776
It calls ifgetc() repeatedly:
https://github.com/statgen/libStatGen/blob/fae4fca874b3b78bf9b61c0eae080c15edd976a4/general/InputFile.h#L324-L341
When the last line fills the buffer exactly, myBufferIndex == myCurrentBufferSize == DEFAULT_BUFFER_SIZE
.
But when ifeof()
checks to see if we have arrived at EOF:
https://github.com/statgen/libStatGen/blob/fae4fca874b3b78bf9b61c0eae080c15edd976a4/general/InputFile.h#L386-L404
It checks if myBufferIndex < myCurrentBufferSize
, which will be false, and it thinks the file should continue being read from.
This can result in a seg fault depending on what you try to do with the buffer at that point, after you've tried to read another line (that doesn't exist) from the file.
Code sample:
#include "InputFile.h"
#include "StringBasics.h"
#include "StringArray.h"
#include <iostream>
#include <string>
int main(int argc, char *argv[]) {
std::string bad_filepath = argv[1];
IFILE test = ifopen(bad_filepath.c_str(), "r");
// This loop is roughly what RAREMETAL does when reading in a covariance file
while (!ifeof(test)) {
String buffer;
buffer.ReadLine(test);
StringArray tokens;
tokens.AddTokens(buffer, "\t ");
int p = tokens[0].Find("chr");
}
std::cout << "Finished";
return 0;
}
Attached file, run with above code, will cause a seg fault.
The simple solution to this (if you're a user looking to get around this bug) is just add a zero somewhere it doesn't matter at the end of the file, or an extra space, and the file will be read successfully.