LibCST

Parsing Code with Syntax Errors

Open RyannDaGreat opened this issue 5 years ago • 2 comments

Is it possible to parse code that has syntax errors in part of the code? I'd like to refactor code that might still be incomplete (and thus might have syntax errors).

For example, print(x) followed by def f(x,y,z): is not valid code by itself (because the function needs a body). When I call libcst.parse_expression('print(x)\ndef :'), it throws an error. While this is an understandable response, I was wondering whether it would be possible not to throw the baby out with the bathwater; that is, is it possible to recover the print(x) in the output instead of just throwing an error?

Why I want this: When editing a Python file, I usually have invalid syntax in between edits (as when writing the above example, before adding a function body). But I'd like to be able to run refactorings anyway, like PyCharm does. Is this possible with this library?

RyannDaGreat, Jun 09 '20 21:06

A print(x) call is an expression, but def is a statement, so you might want to try parse_module instead. I don't think this idea works very well in the general case, and I suspect PyCharm is just doing regex if it works for cases like these. But here's some untested code that splits the source into a prefix that parses to a valid CST and a remainder that doesn't, which you can merge back together later.

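import libcst as cst

def lenient_parse(data):
  # Returns (module, unparsed_tail); the tail is "" when everything parses.
  try:
    return cst.parse_module(data), ""
  except cst.ParserSyntaxError as e:
    lines = data.splitlines(True)
    # Walk backwards from the reported error line until some prefix parses.
    for n in range(min(e.raw_line, len(lines)), -1, -1):
      try:
        mod = cst.parse_module("".join(lines[:n]))
      except cst.ParserSyntaxError:
        continue
      return mod, "".join(lines[n:])
    raise

mod, rest = lenient_parse(...)
# CST nodes are immutable, so do_refactor has to return the new module.
mod = do_refactor(mod)
new_source = mod.code + rest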
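To try the prefix-fallback idea on the example from the question without LibCST installed, here is a minimal sketch that swaps in the standard library's ast.parse as a stand-in for cst.parse_module. The function name split_parsable_prefix is made up for illustration, and ast reports error lines differently from LibCST, so treat this as a demonstration of the splitting logic only, not of LibCST's behavior.

```python
import ast


def split_parsable_prefix(source) -> tuple:
    """Return (prefix, tail): prefix parses with ast.parse, tail is the rest.

    Hypothetical helper; ast.parse stands in for cst.parse_module here.
    """
    try:
        ast.parse(source)
        return source, ""
    except SyntaxError as err:
        lines = source.splitlines(keepends=True)
        # The reported line can be missing or past the end of the input,
        # so clamp it to the number of lines we actually have.
        start = min(err.lineno or len(lines), len(lines))
        # Drop trailing lines one at a time until the prefix parses.
        for n in range(start, -1, -1):
            prefix = "".join(lines[:n])
            try:
                ast.parse(prefix)
            except SyntaxError:
                continue
            return prefix, "".join(lines[n:])
        raise


prefix, tail = split_parsable_prefix("print(x)\ndef f(x, y, z):\n")
print(repr(prefix))
print(repr(tail))
```

The split is line-based, so an error in the middle of a file sends everything after the last parsable prefix into the tail, including lines that would be valid on their own. That is the main reason this doesn't generalize well.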

thatch, Jun 10 '20 02:06

I think forgiving parsing would be a reasonable feature for LibCST. It's a common feature for parsers used in IDE contexts, and Parso supports it, which might make it easier to implement in LibCST.

It's really just a question of someone having sufficient motivation to contribute it, IMO.

carljm, Jun 10 '20 20:06