antlr3
Jump to the invalid address infinite loop
A simple line tokenizer falls into an infinite error loop at random, usually when the parser is invoked in a loop with minimal pauses between iterations. I am reusing the parser, lexer and input stream.
Valgrind output
==13814== Memcheck, a memory error detector
==13814== Copyright (C) 2002-2011, and GNU GPL'd, by Julian Seward et al.
==13814== Using Valgrind-3.7.0 and LibVEX; rerun with -h for copyright info
==13814== Command: ../../bin/test/cli_service -f cli.pmcfg
==13814== Parent PID: 4252
==13814==
==13814== Jump to the invalid address stated on the next line
==13814== at 0x7323E80: ???
==13814== by 0x7719B1F: antlr::parse_line(std::string*, std::string*, int, int*, antlr::PMinkParser*) (antlr_utils.cpp:570)
==13814== by 0x7706959: block_handler (test_module.cpp:197)
==13814== by 0x421A15: pmink_utils::run_external_method(char const*, char const*, void*, bool) (pmink_utils.cpp:81)
==13814== by 0x415D9A: cli::CLIService::start() (cli.cpp:539)
==13814== by 0x4136F4: main (cli_service.cpp:103)
==13814== Address 0x7323e80 is not stack'd, malloc'd or (recently) free'd
....
==13814==
==13814== More than 1000 different errors detected. I'm not reporting any more.
==13814== Final error counts will be inaccurate. Go fix your program!
==13814== Rerun with --error-limit=no to disable this cutoff. Note
==13814== that errors may occur in your program without prior warning from
==13814== Valgrind, because errors are no longer being displayed.
==13814==
Grammar excerpt (only relevant parts)
lineParser : (id+=IDENTIFIER | id+=CSTRING | id+=NUMBER)* -> ^(LINE_ROOT $id*)
;
NUMBER : DIGIT+;
fragment
DIGIT : '0'..'9'
;
IDENTIFIER
: LETTER (LETTER|JavaIDDigit)*
| '/' (LETTER|JavaIDDigit|'/'|'.')*
;
CSTRING
: '"' ( EscapeSequence | ~('\\'|'"') )* '"'
;
fragment
EscapeSequence
: '\\' ('b'|'t'|'n'|'f'|'r'|'\"'|APOSTROPHE|'\\')
;
Code that causes errors
// tokenize line, reuse
void antlr::parse_line(string* data, string* result, int result_max_size, int* result_size, PMinkParser* parser_info){
    if(parser_info == NULL) return;
    pANTLR3_BASE_TREE tmp_tree = NULL;
    string tmp_str;
    // reset error state
    parser_info->lexer->pLexer->rec->state->errorCount = 0;
    parser_info->parser->pParser->rec->state->errorCount = 0;
    // input stream
    parser_info->input->reuse(parser_info->input, (unsigned char*)data->c_str(), data->size(), (unsigned char*)"line_stream");
    // token stream
    parser_info->tstream->reset(parser_info->tstream);
    // parse and build ast
    pminkParser_lineParser_return ast = parser_info->parser->lineParser(parser_info->parser);
    // err check
    int err_c = parser_info->lexer->pLexer->rec->getNumberOfSyntaxErrors(parser_info->lexer->pLexer->rec);
    err_c += parser_info->parser->pParser->rec->getNumberOfSyntaxErrors(parser_info->parser->pParser->rec);
    *result_size = 0;
    //print_tree(ast.tree, 0);
    if(err_c == 0 && ast.tree != NULL){
        if(ast.tree->children != NULL){
            // child count
            int n = ast.tree->children->size(ast.tree->children);
            for(int i = 0; i<n; i++){
                // check buffer
                if(*result_size >= result_max_size) return;
                // inc result size
                (*result_size)++;
                // get node value
                tmp_tree = (pANTLR3_BASE_TREE)ast.tree->children->get(ast.tree->children, i);
                tmp_str = (char*)tmp_tree->toString(tmp_tree)->chars;
                // remove double quotes
                //tmp_str.erase(remove(tmp_str.begin(), tmp_str.end(), '"' ), tmp_str.end());
                // result
                result[i] = tmp_str;
            }
        }
    }
}
The line that Valgrind reports as causing this error is the following one (line #570):
pminkParser_lineParser_return ast = parser_info->parser->lineParser(parser_info->parser);
This also happens when I use other, more complex rules besides this simple lineParser rule. I think the problem is somehow caused by the reusable objects (the reuse methods), since I was unable to reproduce the error when the parser, lexer and input stream instances are allocated and deallocated in every iteration.
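For illustration, the allocate-and-deallocate-per-iteration variant that did not reproduce the error can be sketched roughly as below. This is only a sketch of the ownership pattern, not the actual project code: FakeHandle, fake_new, fake_free, SelfFree, Owned, parse_once and run_loop are all hypothetical names, and FakeHandle is a stand-in for a runtime object that carries its own release function pointer. With the real ANTLR3 C runtime the constructors would be the generated and runtime factory functions, and the release member may be named differently per object type.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for a C runtime object that carries its own
// release function pointer. This is NOT a real ANTLR type.
struct FakeHandle {
    int id;
    void (*free)(FakeHandle*);
};

static int g_live = 0;  // number of live stand-in objects (demo only)

static void fake_free(FakeHandle* h) {
    --g_live;
    delete h;
}

static FakeHandle* fake_new(int id) {
    ++g_live;
    return new FakeHandle{id, fake_free};
}

// Deleter that releases an object through its own `free` member.
template <typename T>
struct SelfFree {
    void operator()(T* p) const {
        if (p) p->free(p);
    }
};

template <typename T>
using Owned = std::unique_ptr<T, SelfFree<T>>;

// One iteration: build fresh objects, use them, let scope exit release
// them in reverse order of construction (parser, tokens, lexer, input).
int parse_once(int iteration) {
    Owned<FakeHandle> input(fake_new(iteration));
    Owned<FakeHandle> lexer(fake_new(iteration));
    Owned<FakeHandle> tokens(fake_new(iteration));
    Owned<FakeHandle> parser(fake_new(iteration));
    // ... the rule invocation and tree walk would go here ...
    return g_live;  // live objects while still in scope
}

// Tight loop with no pauses: nothing is carried over between iterations.
void run_loop(int iterations) {
    for (int i = 0; i < iterations; ++i) {
        parse_once(i);
        assert(g_live == 0);  // everything from this iteration is released
    }
}
```

The cost is one full set of allocations per line, which is what the reuse methods are meant to avoid, so this is a workaround rather than a fix for the reuse path.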