
Unindented comment parsing error

Open Ahuge opened this issue 6 years ago • 2 comments

Hi, I have found an issue, which is reproduced below:

code = """
def foo(bar):
    if True:
# I cause a Failure
        print("Foo %s!" % bar)
"""
import redbaron

red = redbaron.RedBaron(code)

The error was:

ParsingError: Error, got an unexpected token $end here:

   1 
   2 def foo(bar):
   3     if True:
   4 # I cause a Failure
   5         print("Foo %s!" % bar)
   6 <---- here

The token $end should be one of those: ASSERT, AT, BACKQUOTE, BINARY, BINARY_RAW_STRING, BINARY_STRING, BREAK, CLASS, COMMENT, COMPLEX, CONTINUE, DEDENT, DEF, DEL, ENDL, ENDMARKER, EXEC, FLOAT, FLOAT_EXPONANT, FLOAT_EXPONANT_COMPLEX, FOR, FROM, GLOBAL, HEXA, IF, IMPORT, INT, LAMBDA, LEFT_BRACKET, LEFT_PARENTHESIS, LEFT_SQUARE_BRACKET, LONG, MINUS, NAME, NOT, OCTA, PASS, PLUS, PRINT, RAISE, RAW_STRING, RETURN, STRING, TILDE, TRY, UNICODE_RAW_STRING, UNICODE_STRING, WHILE, WITH, YIELD

This is a guess, but it sounds similar to #11 and may come down to the Python parser not caring about comments.

The parsing error is triggered by the dedented single-line comment. This is valid Python, probably only because Python's own parser discards comments before it looks at indentation.
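To back up the claim that the snippet is valid Python, here is a minimal check using only the standard library (`ast` and `tokenize`), not redbaron. It is a sketch of how CPython treats the comment-only line, under the assumption that a recent Python 3 interpreter is used; it says nothing about baron's internals.

```python
import ast
import io
import tokenize

code = '''
def foo(bar):
    if True:
# I cause a Failure
        print("Foo %s!" % bar)
'''

# The standard parser accepts the snippet without complaint.
tree = ast.parse(code)

# The tokenizer still reports the comment, but a comment-only line does not
# take part in indentation tracking, so it produces no extra INDENT/DEDENT.
tokens = list(tokenize.generate_tokens(io.StringIO(code).readline))
names = [tokenize.tok_name[tok.type] for tok in tokens]

print(type(tree.body[0]).__name__)
print(names.count("INDENT"), names.count("DEDENT"))
```

The two INDENT tokens correspond to the bodies of `def` and `if`; the column-0 comment does not close either block.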

Thanks

Ahuge, Aug 23 '17 20:08

Pretty sure I have a fix in indentation_marker.py.

The following is a snippet that I will turn into a PR soon.

@@ -61,20 +61,26 @@ def mark_indentation_generator(sequence):
                 indentations.pop()
 
         # if were are at ":\n" like in "if stuff:\n"
-        if current[0] == "COLON" and iterator.show_next(1)[0] == "ENDL":
+        # Comments can be at whatever indentation they feel like.
+        if current[0] in ("COLON", "COMMENT") and iterator.show_next(1)[0] == "ENDL":
             # if we aren't in "if stuff:\n\n"
             if iterator.show_next(2)[0] not in ("ENDL",):
-                indentations.append(get_space(iterator.show_next()))
+                space = get_space(iterator.show_next())
+                if space is not None:
+                    indentations.append(space)
                 yield current
                 yield next(iterator)
-                yield ('INDENT', '')
+                if space is not None:
+                    yield ('INDENT', '')
                 continue
             else:  # else, skip all "\n"
                 yield current
                 for i in iterator:
                     if i[0] == 'ENDL' and iterator.show_next()[0] not in ('ENDL',):
-                        indentations.append(get_space(i))
-                        yield ('INDENT', '')
+                        space = get_space(i)
+                        if space is not None:
+                            indentations.append(get_space(i))
+                            yield ('INDENT', '')
                         yield i
                         break
                     yield i

I am doing two things: allowing a single-line comment to be followed by an indentation after its ENDL, and making sure the indentation returned by get_space is not None before it is recorded.
I would like to dig into the second change some more to make sure it has no unintended consequences.
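To make the two changes easier to follow, here is a toy sketch of the idea. It does not use baron's real token format or its `get_space`; the tuple layout, `get_indent`, and `mark_indents` are all hypothetical stand-ins invented for illustration.

```python
def get_indent(token):
    """Hypothetical helper: an ENDL token may carry the next line's leading
    whitespace as a third field. Return it, or None when there is none."""
    return token[2] if len(token) > 2 and token[2] else None


def mark_indents(tokens):
    """Yield tokens, inserting ('INDENT', '') after 'COLON ENDL' or
    'COMMENT ENDL', but only when the ENDL actually reports indentation."""
    i = 0
    while i < len(tokens):
        current = tokens[i]
        following = tokens[i + 1] if i + 1 < len(tokens) else None
        if (current[0] in ("COLON", "COMMENT")
                and following is not None and following[0] == "ENDL"):
            space = get_indent(following)
            yield current
            yield following
            if space is not None:  # guard: no indentation, no INDENT token
                yield ("INDENT", "")
            i += 2
            continue
        yield current
        i += 1


# Toy token stream for the snippet from the report (heavily simplified).
tokens = [
    ("DEF", "def"), ("NAME", "foo"), ("COLON", ":"), ("ENDL", "\n", "    "),
    ("IF", "if"), ("NAME", "True"), ("COLON", ":"), ("ENDL", "\n"),
    ("COMMENT", "# I cause a Failure"), ("ENDL", "\n", "        "),
    ("NAME", "print"), ("ENDL", "\n"),
]

kinds = [token[0] for token in mark_indents(tokens)]
print(kinds)
```

In this toy version the `if True:` line produces no INDENT because the next line starts at column 0, and the INDENT is emitted after the comment instead. The sketch never compares against the current indentation level, so it would also emit an INDENT after an ordinary comment inside an already indented block; a real fix has to handle that case.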

PR should come soonish.
Thanks for reading!

Ahuge, Aug 23 '17 22:08

Having an issue with test_indentation_marker.test_comment_in_middle_of_ifelseblock

Will have an updated PR soon.

Ahuge, Aug 24 '17 06:08