pgn-parser

Parse fails on pgn from chess.com that includes computer analysis due to ±, ∓, and = characters

Open tneisinger opened this issue 4 years ago • 4 comments

The chess.com analysis tool puts ±, ∓, and = characters into its pgn text when computer analysis is included. These characters are not contained within move comments. This causes parsing to fail.

Below is example pgn text from chess.com. Note the ± character in move 8:

1. e4 d5 2. exd5 Qxd5 3. Nc3 Qa5 4. Nf3 Nf6 5. Bc4 e6 6. d3 Bb4 7. Bd2 O-O 8. a3
Bd6? ± {MISTAKE (+3.05) Critical mistake.} ({(+1.06) The best move was} 8...
Bxc3 9. Bxc3 Qf5 *

I'm not sure if this usage of the ±, ∓, and = characters is normal in pgn, but I think it would be good for this package to be able to handle all chess.com pgn strings.

I'm happy to work on a pull request to fix this, but I'm not sure how those characters should be handled. Any opinions on that? Maybe the character should be prepended to the comment that follows it?

tneisinger, Jul 02 '21 18:07

Hi, those are typically represented in PGN as "numeric annotation glyphs" (NAGs), e.g. $1 means "strong move" and a PGN rendering program will typically output "!". There are equivalent NAGs for ±, ∓, = and so on. I'm OK with supporting the raw symbols, but my understanding is that they are not valid PGN.
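To make the symbol-to-NAG correspondence concrete, here is a rough caller-side sketch that rewrites the bare symbols as `$N` tokens before the text is handed to a parser. It is not part of pgn-parser: the names `SYMBOL_TO_NAG` and `symbolsToNags` are made up for illustration, the codes are taken from the NAG table in the PGN standard, and the function assumes it is given movetext only (no header tags, no `;` line comments). It only deals with the symbols, nothing else in the chess.com output.

```javascript
// Hypothetical lookup table: evaluation symbol -> NAG number (PGN standard).
const SYMBOL_TO_NAG = {
  '=': 10, // drawish position
  '∞': 13, // unclear position
  '±': 16, // White has a moderate advantage
  '∓': 17, // Black has a moderate advantage
};

// Replace standalone evaluation symbols with "$N" tokens, leaving
// {brace comments} and promotions such as "e8=Q" untouched.
function symbolsToNags(movetext) {
  return movetext
    // The capture group keeps the comments in the resulting array.
    .split(/(\{[^}]*\})/)
    .map((chunk) => {
      if (chunk.startsWith('{')) return chunk; // comment: copy verbatim
      // Only match a symbol that stands alone between whitespace.
      return chunk.replace(
        /(^|\s)([=∞±∓])(?=\s|$)/g,
        (match, lead, symbol) => lead + '$' + SYMBOL_TO_NAG[symbol]
      );
    })
    .join('');
}

console.log(symbolsToNags('8. a3 Bd6? ± {MISTAKE (+3.05) Critical mistake.}'));
// → 8. a3 Bd6? $16 {MISTAKE (+3.05) Critical mistake.}
```

Splitting on comments first is what keeps an "=" inside a comment from being rewritten; the whitespace check does the same for promotion moves.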

kevinludwig, Jul 02 '21 22:07

Thanks for the details!

It turns out that there is another reason that chess.com PGNs won't parse. When computer analysis is included in the PGN text, Chess.com will put comments at the beginning of variations. Note the second line here:

1. f3 e6 2. g4?? ∓ {BLUNDER (♚ Mate in 1)}
({(-0.44) The best move was} 2. c3 d5) 2... Qh4# 0-1

Maybe each variation should also have a list of comments? Then any comments discovered at the beginning of a variation could be put in that variation's comments list.

To address both problems, my thought is that the above pgn text could parse to something like this:

{
    comments_above_header: null,
    headers: null,
    comments: null,
    result: "0-1",
    moves: [
        {
            move_number: 1,
            move: "f3",
            comments:[]
        },
        {
            move: "e6",
            comments:[]
        },
        {
            move_number: 2,
            move: "g4??",

            // PROPOSED SUPPORT FOR NUMERIC ANNOTATION GLYPHS
            nag: {symbol: '∓', number: 17},

            comments: [{text: "BLUNDER (♚ Mate in 1)"}],
            ravs: [
                {
                    // PROPOSED SUPPORT FOR COMMENTS FOUND AT THE BEGINNING OF VARIATIONS
                    comments: [{text: "(-0.44) The best move was"}],

                    result: null,
                    moves: [
                        {
                            move_number: 2,
                            move: "c3",
                            comments: []
                        },
                        {
                            move: "d5",
                            comments: []
                        }
                    ]
                }
            ],
        },
        {
            move_number: 2,
            move: "Qh4#",
            comments:[]
        }
    ]
}
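As a quick check that this shape is convenient to consume, here is a small sketch of a helper that walks it and gathers every comment, including the proposed variation-level ones. `collectComments` is a hypothetical name, and the code assumes exactly the proposed structure above (`comments` may be `null` or a list of `{text}` objects, and each entry in `ravs` has its own `comments` and `moves`).

```javascript
// Recursively gather comment text from a game or a variation (rav).
// Both are assumed to have the same shape: { comments, moves }.
function collectComments(line) {
  const texts = [];
  // Comments attached to the line itself (null is treated as empty).
  for (const comment of line.comments || []) texts.push(comment.text);
  for (const move of line.moves || []) {
    for (const comment of move.comments || []) texts.push(comment.text);
    // Descend into variations hanging off this move.
    for (const rav of move.ravs || []) texts.push(...collectComments(rav));
  }
  return texts;
}

// Trimmed-down version of the proposed parse result.
const game = {
  comments: null,
  moves: [
    { move_number: 1, move: 'f3', comments: [] },
    {
      move_number: 2,
      move: 'g4??',
      comments: [{ text: 'BLUNDER (♚ Mate in 1)' }],
      ravs: [
        {
          comments: [{ text: '(-0.44) The best move was' }],
          moves: [{ move_number: 2, move: 'c3', comments: [] }],
        },
      ],
    },
  ],
};

console.log(collectComments(game));
// → [ 'BLUNDER (♚ Mate in 1)', '(-0.44) The best move was' ]
```

Because a variation and the top-level game share the `comments`/`moves` layout, one recursive function covers both.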

I am using this table of NAG values as reference.

Does this approach look right?

tneisinger, Jul 03 '21 22:07

My mistake. I should have looked at the code for this project more closely before making the suggestion above. I see that $[0-9] NAGs are already supported.

In light of that, my revised idea for parsing the move 2. g4?? ∓ ∞ $4 would be:

{
    move_number: 2,
    move: "g4??",
    nags: [
        "∓",
        "∞",
        "$4",
    ]
}
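If consumers want a single numeric form, a mixed `nags` list like this could be normalized afterwards. The sketch below is only an illustration: `NAG_BY_SYMBOL` and `nagToNumber` are hypothetical names, the table covers just a few symbols, and the codes are taken from the NAG table in the PGN standard.

```javascript
// Hypothetical table for the symbol forms; "$N" strings are handled separately.
const NAG_BY_SYMBOL = { '=': 10, '∞': 13, '±': 16, '∓': 17 };

// Convert one entry of the proposed `nags` array to its NAG number.
// Returns null for anything that is neither "$N" nor a known symbol.
function nagToNumber(nag) {
  if (/^\$\d+$/.test(nag)) return Number(nag.slice(1));
  return Object.prototype.hasOwnProperty.call(NAG_BY_SYMBOL, nag)
    ? NAG_BY_SYMBOL[nag]
    : null;
}

console.log(['∓', '∞', '$4'].map(nagToNumber));
// → [ 17, 13, 4 ]
```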

tneisinger, Jul 04 '21 03:07

I will take a look and see what I can do. Please note I'll be out on holiday, so I won't get to this for about a week.

kevinludwig, Jul 04 '21 05:07