Jump to the invalid address infinite loop

Open dfranusic opened this issue 12 years ago • 1 comments

A simple line tokenizer falls into an infinite error loop at random, usually when the parser is invoked in a loop with minimal pauses between iterations. I am reusing the parser, lexer and input stream across invocations.

Valgrind output

==13814== Memcheck, a memory error detector
==13814== Copyright (C) 2002-2011, and GNU GPL'd, by Julian Seward et al.
==13814== Using Valgrind-3.7.0 and LibVEX; rerun with -h for copyright info
==13814== Command: ../../bin/test/cli_service -f cli.pmcfg
==13814== Parent PID: 4252
==13814==
==13814== Jump to the invalid address stated on the next line
==13814==    at 0x7323E80: ???
==13814==    by 0x7719B1F: antlr::parse_line(std::string*, std::string*, int, int*, antlr::PMinkParser*) (antlr_utils.cpp:570)
==13814==    by 0x7706959: block_handler (test_module.cpp:197)
==13814==    by 0x421A15: pmink_utils::run_external_method(char const*, char const*, void*, bool) (pmink_utils.cpp:81)
==13814==    by 0x415D9A: cli::CLIService::start() (cli.cpp:539)
==13814==    by 0x4136F4: main (cli_service.cpp:103)
==13814==  Address 0x7323e80 is not stack'd, malloc'd or (recently) free'd
....
==13814==
==13814== More than 1000 different errors detected.  I'm not reporting any more.
==13814== Final error counts will be inaccurate.  Go fix your program!
==13814== Rerun with --error-limit=no to disable this cutoff.  Note
==13814== that errors may occur in your program without prior warning from
==13814== Valgrind, because errors are no longer being displayed.
==13814==

Grammar excerpt (only relevant parts)

lineParser : (id+=IDENTIFIER | id+=CSTRING | id+=NUMBER)* -> ^(LINE_ROOT $id*)
;

NUMBER  :       DIGIT+;

fragment
DIGIT   :       '0'..'9'
        ;

IDENTIFIER
    :   LETTER (LETTER|JavaIDDigit)*
    |   '/' (LETTER|JavaIDDigit|'/'|'.')*
    ;


CSTRING
    :  '"' ( EscapeSequence | ~('\\'|'"') )* '"'
    ;

fragment
EscapeSequence
    :   '\\' ('b'|'t'|'n'|'f'|'r'|'\"'|APOSTROPHE|'\\')
    ;

Code that causes errors

// tokenize line, reuse
void antlr::parse_line(string* data, string* result, int result_max_size, int* result_size, PMinkParser* parser_info){
    if(parser_info == NULL) return;

    pANTLR3_BASE_TREE tmp_tree = NULL;
    string tmp_str;

    // reset error state
    parser_info->lexer->pLexer->rec->state->errorCount = 0;
    parser_info->parser->pParser->rec->state->errorCount = 0;

    // input stream
    parser_info->input->reuse(parser_info->input, (unsigned char*)data->c_str(), data->size(), (unsigned char*)"line_stream");

    // token stream
    parser_info->tstream->reset(parser_info->tstream);

    // parse and build ast
    pminkParser_lineParser_return ast = parser_info->parser->lineParser(parser_info->parser);
    // err check
    int err_c = parser_info->lexer->pLexer->rec->getNumberOfSyntaxErrors(parser_info->lexer->pLexer->rec);
    err_c += parser_info->parser->pParser->rec->getNumberOfSyntaxErrors(parser_info->parser->pParser->rec);

    *result_size = 0;
    //print_tree(ast.tree, 0);
    if(err_c == 0 && ast.tree != NULL){
        if(ast.tree->children != NULL){
            // child count
            int n = ast.tree->children->size(ast.tree->children);
            for(int i = 0; i<n; i++){
                // check buffer
                if(*result_size >= result_max_size) return;
                // inc result size
                (*result_size)++;
                // get node value
                tmp_tree = (pANTLR3_BASE_TREE)ast.tree->children->get(ast.tree->children, i);
                tmp_str = (char*)tmp_tree->toString(tmp_tree)->chars;
                // remove double quotes
                //tmp_str.erase(remove(tmp_str.begin(), tmp_str.end(), '"' ), tmp_str.end());
                // result
                result[i] = tmp_str;
            }
        }

    }


}

The line that causes this error, as reported by Valgrind, is the following (line #570):

pminkParser_lineParser_return ast = parser_info->parser->lineParser(parser_info->parser);

dfranusic, Nov 17 '12 16:11

This also happens with other, more complex rules, not just this simple lineParser rule. I think the problem is somehow caused by the reusable objects (the reuse methods), since I was unable to reproduce the error when the parser, lexer and input stream instances are allocated and deallocated in every iteration.
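
For reference, the allocate-and-deallocate-per-iteration variant that did not reproduce the error can be sketched roughly as below. This is only an illustration of the object lifecycle: `Pipeline` and `parse_line_fresh` are hypothetical stand-ins, not ANTLR3 runtime types, and the whitespace split is a placeholder for the real lexer and parser calls, which are not shown.

```cpp
#include <cassert>
#include <memory>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the input stream / lexer / token stream / parser
// chain. In real code the constructor and destructor would wrap the runtime's
// own create and free calls; here they only count live instances.
struct Pipeline {
    static int live;   // number of pipelines currently allocated
    std::string text;  // private copy of the line being parsed

    explicit Pipeline(const std::string& line) : text(line) { ++live; }
    ~Pipeline() { --live; }
    Pipeline(const Pipeline&) = delete;
    Pipeline& operator=(const Pipeline&) = delete;

    // Placeholder for the real parse: split the line on whitespace.
    std::vector<std::string> tokens() const {
        std::istringstream in(text);
        std::vector<std::string> out;
        std::string tok;
        while (in >> tok) out.push_back(tok);
        return out;
    }
};
int Pipeline::live = 0;

// Build, use and destroy a fresh pipeline for every line, instead of calling
// reuse()/reset() on long-lived objects.
std::vector<std::string> parse_line_fresh(const std::string& line) {
    assert(Pipeline::live == 0);  // nothing may survive from a previous call
    std::unique_ptr<Pipeline> p(new Pipeline(line));
    return p->tokens();           // p is destroyed when the function returns
}
```

Calling `parse_line_fresh` in a tight loop leaves no state behind between iterations, which is the property the reuse-based version above does not have. The cost is one allocation and one deallocation of the whole chain per line.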

dfranusic, Nov 19 '12 18:11