sly icon indicating copy to clipboard operation
sly copied to clipboard

UnboundLocalError local variable 'pattern' referenced before assignment

Open js-truework opened this issue 1 year ago • 3 comments

Note: I am aware that this project is no longer being maintained. I'm posting this issue here so that if others experience it, they might be able to work around it as well.

Sly's Lexer class behaves strangely when being dynamically imported, such as through ddtrace-run's _exec_module. This behavior seems to stem from the class Lexer(metaclass=LexerMeta). LexerMeta.__new__ seems to be called on dynamic import, presumably to construct the Lexer class. When LexerMeta.__new__ calls cls._build(), Lexer._build() runs and errors out on this line:

https://github.com/dabeaz/sly/blob/40009882310b6f1022ab050ff7bf6025500ee45c/src/sly/lex.py#L307

The data collected at the time of error suggests that the contents of value, being evaluated just above the failing line, is currently the re module, which is both not a str and not a callable, so pattern never gets defined, which causes the titular UnboundLocalError.

https://github.com/dabeaz/sly/blob/40009882310b6f1022ab050ff7bf6025500ee45c/src/sly/lex.py#L299-L307

Screenshot from sentry showing relevant information:

Screen Shot 2022-12-07 at 11 55 34

Workaround

The workaround that we've come up with is to manually from sly import Lexer prior to any of the dynamic importing happening and mark it with a noqa tag to avoid the unused import from getting removed by code quality tools. Manually importing the module first seems to fix the problem.

Possible solution?

I think, if there were to be a fix in Sly, the fix would require determining what to do with pattern when value is neither a str or a callable and then having that as the default behavior. I admit that I do not understand sly very well to understand what implications having a default value here would cause, but it does seem that the lack of an else clause in evaluating the contents of value is the fundamental issue at play here.

js-truework avatar Dec 07 '22 20:12 js-truework

What is the SLY input that is causing this problem?

Note: I am more than happy to fix bugs in the project, but will not be making further package releases. People who want current code should vendor SLY into their project.

dabeaz avatar Dec 07 '22 21:12 dabeaz

As far as I can tell, this is happening at import time from a dynamic import calling exec_module. I don't think it's happening at any actual SLY input. But I admit that I really don't understand how metaclasses work, so I don't understand how LexerMeta.__new__() is being called by simply running from sly import Parser in a separate package (in this case, scim2-filter-parser is doing that import, and a recursive dynamic import is getting hung up on from .lex import * from sly/__init__.py.

I don't understand how Lexer._collect_rules() is injecting the re module into the rules, but the re module is definitely not a string or a callable, which is why pattern is undefined.

Maybe additional sentry output might be helpful? Sorry, this is a bug that I really don't understand well and it seems to be related to dynamic importing and metaclass evaluation, both of which are topics that I don't understand particularly well!

ddtrace calls exec_module on sly.lex

Screen Shot 2022-12-07 at 14 11 00

class Lexer(metaclass=LexerMeta) is evaluated

Screen Shot 2022-12-07 at 14 11 20

LexerMeta.__new__() is run

Screen Shot 2022-12-07 at 14 11 46

cls._build() runs on the Lexer class

One of the rules being iterated over has a value containing the re module.

Screen Shot 2022-12-07 at 14 12 01

js-truework avatar Dec 07 '22 22:12 js-truework

There's definitely an improvement that can be made with regard to the if-elif statement. It's unclear to me why this would work with a normal import and not with a dynamic import however. That will need to be investigated a bit further.

dabeaz avatar Dec 07 '22 22:12 dabeaz