antlr4-c3
antlr4-c3 copied to clipboard
collectCandidates fails when seeking the tokenStream
Whenever I call collectCandidates
for a relatively simple grammar, I get an assertion error:
data:image/s3,"s3://crabby-images/73b12/73b122430dae54206e5f6affc2546e2e0ef98efd" alt="image"
I tracked the call state:
data:image/s3,"s3://crabby-images/f8979/f897958f63500cb5bb66bc9e23ca0ce2942e69f0" alt="image"
Here we can see that collectCandidates
calls seek
on a BufferedTokenStream
from the antlr4ts
library. The seek
method in turn calls sync
:
sync(i) {
assert(i >= 0);
let n = i - this.tokens.length + 1; // how many more elements we need?
//System.out.println("sync("+i+") needs "+n); [sic]
if (n > 0) {
let fetched = this.fetch(n);
return fetched >= n;
}
return true;
}
The assertion on the first row fails some time down the line, causing the processing to halt.
By uncommenting the seeks in collectCandidates
:
collectCandidates(caretTokenIndex, context) {
this.shortcutMap.clear();
this.candidates.rules.clear();
this.candidates.tokens.clear();
this.statesProcessed = 0;
this.precedenceStack = [];
this.tokenStartIndex = context ? context.start.tokenIndex : 0;
let tokenStream = this.parser.inputStream;
let currentIndex = tokenStream.index;
// tokenStream.seek(this.tokenStartIndex); <---
tokenStream.seek(this.tokenStartIndex);
this.tokens = [];
let offset = 1;
while (true) {
let token = tokenStream.LT(offset++);
this.tokens.push(token.type);
if (token.tokenIndex >= caretTokenIndex || token.type == antlr4ts_1.Token.EOF)
break;
}
// tokenStream.seek(currentIndex); <---
tokenStream.seek(currentIndex);
It all works for me. The tests fail, which clearly means the lines are important.
What could be the reason for the seek to fail?
I'm using the library as shown in the README and tests. I cannot share my grammar file, but I am happy to assist in any bug tracing.
Just from description I would guess that your input stream is too small for the given seek index. Can't imagine any other reason why the seek call could fail. So, check the caretTokenIndex
you specify in the call to collectCandidates
.
That does not seem to be the case as the length of the input does not seem to matter, also, the caretTokenIndex
does not seem to affect any of the seeks
.
I've tried to debug a bit further and found this.
export function collectCandidates(caretTokenIndex, context) {
this.shortcutMap.clear();
this.candidates.rules.clear();
this.candidates.tokens.clear();
this.statesProcessed = 0;
this.precedenceStack = [];
this.tokenStartIndex = context ? context.start.tokenIndex : 0;
let tokenStream = this.parser.inputStream;
let currentIndex = tokenStream.index; // <-- This is -1 the first iteration
tokenStream.seek(this.tokenStartIndex); // <-- This call works (tokenStartIndex is 0)
this.tokens = [];
let offset = 1;
while (true) {
let token = tokenStream.LT(offset++);
this.tokens.push(token.type);
if (token.tokenIndex >= caretTokenIndex || token.type == Token.EOF)
break;
}
tokenStream.seek(currentIndex); // <-- This call fails as currentIndex is -1
// ...
Basically, the inputStream
never seems to have a valid index.
I'm not sure as to why this happens, a race condition seems unlikely due to the synchronous nature of JavaScript, but who nows. If I try to get the index of the tokenStream
just before the seek
, it actually returns 0
instead of -1
.
I tried to change the last seek
to ignore negative indices, tokenStream.seek(Math.max(currentIndex, 0))
, which allows all tests to pass whilst still seemingly work correctly for me. I think it is a bit naive to think that the change causes no issues further down the line, but I don't know enough about this plugin to be sure if it's really an issue.
let currentIndex = tokenStream.index;
tokenStream.seek(this.tokenStartIndex);
this.tokens = [];
let offset = 1;
while (true) {
let token = tokenStream.LT(offset++);
this.tokens.push(token.type);
if (token.tokenIndex >= caretTokenIndex || token.type == Token.EOF)
break;
}
tokenStream.seek(Math.max(currentIndex, 0)); // <-- Math.max()
Moving the currentIndex
definition to right before the tokenStream.seek
call also works:
tokenStream.seek(this.tokenStartIndex);
this.tokens = [];
let offset = 1;
while (true) {
let token = tokenStream.LT(offset++);
this.tokens.push(token.type);
if (token.tokenIndex >= caretTokenIndex || token.type == Token.EOF)
break;
}
let currentIndex = tokenStream.index; // <-- Moved index here
tokenStream.seek(currentIndex);
I believe I never tested with a cold token stream, which never had read anything. That's why I never saw the -1 token stream index. IMO this is just a marker to denote that nothing has ever been consumed by this stream. This in turns means there's no index that must be restored after the candidate collection ran.
This restoration of the current index is just a convenient feature so that the caller doesn't have to do that (if the caller code relies somehow on the current index). I think it's save not to restore the current stream index if that was -1.
I also ran into this issue, due to a cold token stream.
I think defaulting currentIndex to 0 if tokenStream.index
is -1 should be the easy fix. I don't see any negative repercussions to that.