Allow iterators for PGN parsing
I'm using this library as a helper for feeding data to a machine learning model. To that end, I need to be able to iterate over games in a PGN (as opposed to parsing in bulk).
In my code, I do something like this:
class StepParser : pgn::StreamParser<> {
public:
    explicit StepParser(std::istream& stream) : StreamParser{ stream } {}

    bool initRead(pgn::Visitor& vis) {
        visitor = &vis;
        if (!stream_buffer.fill()) {
            return false;
        }
        return true;
    }

    void finishRead() {
        if (!pgn_end) {
            onEnd();
        }
    }

    bool readNextGame() {
        while (auto c = stream_buffer.some()) {
            if (in_header) {
                visitor->skipPgn(false);
                if (*c == '[') {
                    visitor->startPgn();
                    pgn_end = false;
                    processHeader();
                }
            }
            else if (in_body) {
                processBody();
            }
            if (!dont_advance_after_body) {
                stream_buffer.advance();
            }
            dont_advance_after_body = false;
            if (pgn_end) {
                pgn_end = false;
                return true;
            }
        }
        return false;
    }
};
On the library side, I had to switch several members of StreamParser from private to protected.
This is all a bit of a hack, so I'm wondering whether iterating over games one at a time could be supported officially.
Sorry, I don't quite understand what the problem is. A *.pgn file can contain multiple games, and this parser can already parse all of them. startPgn is called at the start of each game in the file, and onEnd is called when it ends. A sample implementation is here: https://github.com/official-stockfish/WDL_model/blob/master/scoreWDLstat.cpp, where we run the parser over multiple PGN files, which themselves contain multiple games.
Ah, sorry if I wasn't clear. I want a function I can call (as opposed to a callback) that will parse a single game from a PGN. In pseudocode, it would look something like:
parser = SetUpParser( pgnpath )
game1 = parser.parseNextGame()
game2 = parser.parseNextGame()
I see. I might add this, but I'm not sure about returning a game object.
On that note, if we can have something like "pgn.parseNextGame", then why not a std-style iterator, e.g. "++pgn"?