ctpg icon indicating copy to clipboard operation
ctpg copied to clipboard

Fix: iterator is extended over end of string if no match occurs with `dfa_match`

Open moritz-geier opened this issue 1 year ago • 3 comments

Hello,

I got an error on Windows 11 and MSVC 19.39 where the parser fails. This happens if the dfa_match returns uninitialized16 and we try to add it to the iterator. Thus here my proposed fix.

To test the behavior create the following parser

    ///////////////////////////////////////////////////////////////////////////
    static constexpr ctpg::nterm<bool> conditionExpression{"condition"};
    static constexpr ctpg::char_term trueLiteral{'t'};
    static constexpr ctpg::char_term falseLiteral{'f'};

    ///////////////////////////////////////////////////////////////////////////
    static constexpr auto root = conditionExpression;

    ///////////////////////////////////////////////////////////////////////////
    static constexpr auto terminals = ctpg::terms(
        trueLiteral,
        falseLiteral
    );

    ///////////////////////////////////////////////////////////////////////////
    static constexpr auto nonTerminals = ctpg::nterms(
        conditionExpression
    );

    ///////////////////////////////////////////////////////////////////////////
    static constexpr ctpg::parser parser{
        root,
        terminals,
        nonTerminals,
        rules(
            conditionExpression(trueLiteral) >=
                [](const auto&) { return true; },
            conditionExpression(falseLiteral) >=
                [](const auto&) { return false; }
        )
    };

and try to parse

parser.parse(ctpg::buffers::cstring_buffer{"tr"});

moritz-geier avatar Apr 22 '24 13:04 moritz-geier

This is a dirty fix, in my opinion we should use std::distance within dfa_match to initiate rt.len , thus eliminating the need for the ifs afer calling the function. This would mean we need to enforce the iterators of each buffer to provide a operator- implementation.

moritz-geier avatar Apr 22 '24 13:04 moritz-geier

First of all I am reluctant to commit any new work until I merge my term groups feature into master. The only thing left to do (as always) is msvc compilation, which I just don't have time to do at the moment.

I would gladly accept help regarding this topic.

The feature is very nice itself, it allows much smaller parsers and compile times when grouping similar terminal symbols. C language for instance has 11 assignment operators which from parser point of view are the same.

Second thing, after I'm done with the grouped terms feature, we can think of a better solution (you already mentioned a nice one) not this dirty one.

Would you be willing to work on making the term-groups branch work on msvc?

peter-winter avatar Apr 26 '24 11:04 peter-winter

Yes I can try and get it working.

moritz-geier avatar Apr 27 '24 16:04 moritz-geier