ResetPosition doesn't work by token position
When a lexer produces tokens that span multiple characters, resetPosition() doesn't work as expected. The position it takes is an internal pointer into the tokens array, not the character position exposed in the token itself ($token['position']).
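private function parseNamedReference(): string
{
    // Character position of the current token, not its index in the tokens array.
    $startPosition = $this->lexer->token['position'];
    while ($this->lexer->moveNext()) {
        // Consume tokens.
    }
    // Expected: rewind to the token seen on entry.
    // Actual: $startPosition is used as an index into the tokens array.
    $this->lexer->resetPosition($startPosition);
    $this->lexer->moveNext();
    $this->lexer->moveNext();
}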
In the example above, I would expect resetPosition() to take me back to the position on method entry. But since my tokens span multiple characters, this doesn't work. One fix would be to key each token by its position, like this:
$this->tokens[$match[1]] = [
    'value' => $match[0],
    'type' => $type,
    'position' => $match[1],
];
However, this would break the stepping logic, which advances with $this->position++.
Another solution could be to keep a map from each token's position to its index in the tokens array. This would increase memory usage, since it requires an extra array of integers.
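A minimal sketch of that map, using plain arrays rather than the library's classes. The input string, the pattern and the variable names are made up for illustration, and preg_match_all only stands in for the lexer's real scan step:

```php
<?php
// Stand-in for the lexer's scan step: every word becomes one token.
// Input, pattern and variable names are hypothetical.
$input = 'foo bar baz';
preg_match_all('/[a-z]+/', $input, $matches, PREG_OFFSET_CAPTURE);

$tokens = [];        // stays a sequential list, so stepping with ++ keeps working
$indexByOffset = []; // the extra map: character offset => index in $tokens

foreach ($matches[0] as [$value, $offset]) {
    $indexByOffset[$offset] = count($tokens);
    $tokens[] = ['value' => $value, 'position' => $offset];
}

// The token 'bar' reports position 4, but it is stored at index 1.
$offset = $tokens[1]['position'];
$index = $indexByOffset[$offset];

echo $tokens[$index]['value'], "\n"; // prints "bar"
```

The tokens array keeps its sequential keys, so the cost is exactly the one extra integer per token mentioned above.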
I would be happy to provide a patch to fix this issue, but I would like to have some guidance on what is expected in this library. Any change in resetPosition would be a breaking change as it would change the behavior of this lib.
My workaround for now:
public function resetPosition($position = 0)
{
    // Translate the token's character position into its index in the tokens array.
    parent::resetPosition($this->tokenPositions[$position]);
}

protected function scan($input)
{
    parent::scan($input);

    // The tokens array is not accessible from the subclass, so read it via reflection.
    $class = new \ReflectionClass(AbstractLexer::class);
    $property = $class->getProperty('tokens');
    $property->setAccessible(true);
    $tokens = $property->getValue($this);

    // Map each token's character position to its index in the tokens array.
    $this->tokenPositions = array_flip(array_column($tokens, 'position'));
}
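One caveat with this workaround: the lookup only succeeds for offsets where a token actually starts, so the default of 0 fails when the input begins with skipped characters. The sketch below shows what the array_flip/array_column map looks like and adds a guard. The helper name indexForOffset and the sample tokens are hypothetical and not part of the library:

```php
<?php
/**
 * Hypothetical helper: translate a token's character offset into its index
 * in the tokens array, failing loudly for offsets where no token starts.
 *
 * @param array<int, int> $tokenPositions offset => index, as built in scan()
 */
function indexForOffset(array $tokenPositions, int $offset): int
{
    if (!array_key_exists($offset, $tokenPositions)) {
        throw new OutOfBoundsException("No token starts at offset $offset");
    }

    return $tokenPositions[$offset];
}

// Sample tokens for the made-up input ' foo bar' (leading space skipped).
$tokens = [
    ['value' => 'foo', 'type' => 1, 'position' => 1],
    ['value' => 'bar', 'type' => 1, 'position' => 5],
];

// Same expression as in the workaround: [1 => 0, 5 => 1]
$tokenPositions = array_flip(array_column($tokens, 'position'));

echo indexForOffset($tokenPositions, 5), "\n"; // prints 1
```

With these sample tokens, asking for offset 0 throws instead of silently passing an undefined value to parent::resetPosition().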
Hi 👋 would this PR https://github.com/doctrine/lexer/pull/12 solve your problem?