skope-rules
SyntaxError: Python keyword not valid identifier in numexpr query
When I add feature names to the SkopeRules model, I encounter this error.
Some of the feature names are:
data__blocked_bugs_number
data__ever_affected=False
data__ever_affected=True
data__has_crash_signature=False
data__has_crash_signature=True
data__has_github_url=False
data__has_github_url=True
data__has_str=irrelevant
data__has_str=no
Traceback (most recent call last):
  File "run.py", line 55, in <module>
    model.train()
  File "C:\Users\Saurabh Daalia\Desktop\bugbug\bugbug\model.py", line 101, in train
    self.skope_clf.fit(X_train, y_train)
  File "C:\Users\Saurabh Daalia\Anaconda3\lib\site-packages\skrules\skope_rules.py", line 350, in fit
    for r in set(rules_from_tree)]
  File "C:\Users\Saurabh Daalia\Anaconda3\lib\site-packages\skrules\skope_rules.py", line 350, in <listcomp>
    for r in set(rules_from_tree)]
  File "C:\Users\Saurabh Daalia\Anaconda3\lib\site-packages\skrules\skope_rules.py", line 600, in _eval_rule_perf
    detected_index = list(X.query(rule).index)
  File "C:\Users\Saurabh Daalia\Anaconda3\lib\site-packages\pandas\core\frame.py", line 3088, in query
    res = self.eval(expr, **kwargs)
  File "C:\Users\Saurabh Daalia\Anaconda3\lib\site-packages\pandas\core\frame.py", line 3203, in eval
    return _eval(expr, inplace=inplace, **kwargs)
  File "C:\Users\Saurabh Daalia\Anaconda3\lib\site-packages\pandas\core\computation\eval.py", line 294, in eval
    truediv=truediv)
  File "C:\Users\Saurabh Daalia\Anaconda3\lib\site-packages\pandas\core\computation\expr.py", line 749, in __init__
    self.terms = self.parse()
  File "C:\Users\Saurabh Daalia\Anaconda3\lib\site-packages\pandas\core\computation\expr.py", line 766, in parse
    return self._visitor.visit(self.expr)
  File "C:\Users\Saurabh Daalia\Anaconda3\lib\site-packages\pandas\core\computation\expr.py", line 327, in visit
    raise e
  File "C:\Users\Saurabh Daalia\Anaconda3\lib\site-packages\pandas\core\computation\expr.py", line 321, in visit
    node = ast.fix_missing_locations(ast.parse(clean))
  File "C:\Users\Saurabh Daalia\Anaconda3\lib\ast.py", line 35, in parse
    return compile(source, filename, mode, PyCF_ONLY_AST)
  File "<unknown>", line 1
SyntaxError: Python keyword not valid identifier in numexpr query
Is it because you put = in your feature names?
I see, I think that might be the issue. But what is causing this issue? Is there any workaround for it?
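The feature names are embedded in the rule strings, and those strings are then parsed as expressions (see the X.query(rule) call in your traceback). That is what triggers the error.
I don't see an easy workaround. You really shouldn't put = in your feature names...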
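To illustrate the parsing problem: the traceback shows each rule being passed to X.query(rule), which ends in ast.parse. Here is a minimal sketch using only the standard library. The two rule strings are hypothetical examples of what a rule might look like, not actual skope-rules output.

```python
import ast


def parses_as_expression(rule: str) -> bool:
    """Return True if `rule` is a syntactically valid Python expression."""
    try:
        ast.parse(rule, mode="eval")
        return True
    except SyntaxError:
        return False


# Hypothetical rule strings, assumed to look like "<feature> <op> <threshold>".
good_rule = "data__blocked_bugs_number > 2.5 and data__has_str_no <= 0.5"
bad_rule = "data__blocked_bugs_number > 2.5 and data__has_str=no <= 0.5"

print(parses_as_expression(good_rule))  # True
print(parses_as_expression(bad_rule))   # False: "=" is not valid inside an expression
```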
You really shouldn't put = in your feature names...
Feature names are strings, so it seems like a limitation to restrict what they can contain (everything else in the scikit-learn world doesn't care about it). Maybe it should be allowed, or at least documented somewhere?
You are right, this should be documented. Feel free to open a PR for that or for fixing the syntax error :)
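Until that is fixed, one possible workaround is to rename the features before fitting and keep a mapping back to the original names. This is only a sketch: sanitize_feature_names is a hypothetical helper, not part of skope-rules, and it assumes that valid Python identifiers are enough to keep the rule parser happy.

```python
import keyword
import re


def sanitize_feature_names(names):
    """Map each name to a unique string that is a valid Python identifier.

    Returns (safe_names, mapping), where mapping[safe_name] == original_name.
    """
    safe_names = []
    mapping = {}
    for name in names:
        # Replace every character that is not a letter, digit, or underscore.
        safe = re.sub(r"\W", "_", name)
        if not safe or safe[0].isdigit() or keyword.iskeyword(safe):
            safe = "_" + safe
        # Add a numeric suffix if two original names collapse to the same string.
        candidate = safe
        suffix = 1
        while candidate in mapping:
            candidate = f"{safe}_{suffix}"
            suffix += 1
        mapping[candidate] = name
        safe_names.append(candidate)
    return safe_names, mapping


names = ["data__blocked_bugs_number", "data__ever_affected=False", "data__has_str=no"]
safe_names, mapping = sanitize_feature_names(names)
print(safe_names)
print(mapping["data__ever_affected_False"])
```

The sanitized names would then be given to the model in place of the originals, and mapping can be used to translate the names in the resulting rules back for display.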
Guys, I get a similar error when I run the command below. If I remove the pipe and keep only one condition, it works.
SyntaxError: Python keyword not valid identifier in numexpr query
The failing line is: train_outliers = train.query('age_z > 3 | age_z < ‐3')
This happened to me as well. The problem was that I kept holding down the Alt key while typing the character after the pipe symbol. I run into this frequently, because typing a pipe requires me to hold Alt.
Happened to me too. Does anyone know how to fix it?! Thanks xD
@osdiego Did you copy and paste from another document? The "-3" is not being read correctly by the query function. Try deleting the minus sign and typing it again. Let me know if this works.
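A quick way to check for this is to list the non-ASCII characters in the query string. The sketch below uses only the standard library. The helper names are made up for illustration, and the list of dash look-alikes is an assumption about which characters commonly get pasted in; it is not exhaustive.

```python
import unicodedata


def find_non_ascii(expr):
    """Return (position, character, Unicode name) for each non-ASCII character."""
    return [
        (i, ch, unicodedata.name(ch, "UNKNOWN"))
        for i, ch in enumerate(expr)
        if ord(ch) > 127
    ]


# Dash-like characters that are sometimes pasted in place of the ASCII minus sign.
DASH_LOOKALIKES = "\u2010\u2011\u2012\u2013\u2014\u2212"


def normalize_dashes(expr):
    """Replace the dash look-alikes above with the ASCII hyphen-minus."""
    return expr.translate({ord(ch): "-" for ch in DASH_LOOKALIKES})


# U+2010 is written as an escape here so that it is visible in the source.
query = "age_z > 3 | age_z < \u20103"
print(find_non_ascii(query))
print(normalize_dashes(query))
```

If find_non_ascii returns anything for a query that is supposed to be plain ASCII, retyping those characters (or passing the string through normalize_dashes) is worth trying first.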
@CCNOAI I'm doing something like: (importance >= 0 | importance = -7). I need to filter like that; is there no way to do it?
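One possible explanation, offered as a guess rather than a confirmed diagnosis: inside a query string, equality is written ==, and a single = is not a comparison. The pandas documentation for DataFrame.query notes that & and | take the precedence of and/or there, so the sketch below writes "or" in place of the pipe and checks both spellings with the standard library ast module. The column name importance comes from the comment above; everything else is illustrative.

```python
import ast


def is_valid_expression(expr):
    """Return True if `expr` parses as a single Python expression."""
    try:
        ast.parse(expr, mode="eval")
        return True
    except SyntaxError:
        return False


with_single_equals = "importance >= 0 or importance = -7"
with_double_equals = "importance >= 0 or importance == -7"

print(is_valid_expression(with_single_equals))  # False
print(is_valid_expression(with_double_equals))  # True
```

Assuming a DataFrame named df with an importance column, the corrected call would look like df.query('importance >= 0 | importance == -7').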