javalang
Error parsing line comment in last line with no final line break
Javalang cannot parse a line comment on the last line of the source if that line is terminated by the end of the file instead of a line break character. This is probably a rare issue, but I ran into it while parsing an old version of Apache POI.
The following is a minimal example:
import javalang
javalang.parse.parse('// line comment')
It raises the following exception:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-42-babb856b693c> in <module>
----> 1 javalang.parse.parse('// line comment')
.../site-packages/javalang/parse.py in parse(s)
50 def parse(s):
51 tokens = tokenize(s)
---> 52 parser = Parser(tokens)
53 return parser.parse()
.../site-packages/javalang/parser.py in __init__(self, tokens)
93
94 def __init__(self, tokens):
---> 95 self.tokens = util.LookAheadListIterator(tokens)
96 self.tokens.set_default(EndOfInput(None))
97
.../site-packages/javalang/util.py in __init__(self, iterable)
90 class LookAheadListIterator(object):
91 def __init__(self, iterable):
---> 92 self.list = list(iterable)
93
94 self.marker = 0
.../site-packages/javalang/tokenizer.py in tokenize(self)
506 elif startswith in ("//", "/*"):
507 comment = self.read_comment()
--> 508 if comment.startswith("/**"):
509 self.javadoc = comment
510 continue
AttributeError: 'NoneType' object has no attribute 'startswith'
I think you can work around this by manually appending a newline character to the end of each file. For example:
import javalang
javalang.parse.parse('// line comment\n')
This fixes the issue.
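The workaround could be wrapped in a small helper so that it is applied only when needed. This is a minimal sketch; `ensure_trailing_newline` is a hypothetical name and not part of javalang.

```python
def ensure_trailing_newline(source: str) -> str:
    """Return the source with a final newline appended if it lacks one.

    Hypothetical helper, not part of javalang. It only checks for "\n",
    since the workaround above relies on a newline ending the last line.
    """
    if source.endswith("\n"):
        return source
    return source + "\n"
```

The result can then be passed to the parser, for example `javalang.parse.parse(ensure_trailing_newline(source))`.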
The workaround is simple, but the tokenizer itself needs only minimal changes to handle these comments. Since the fix is so small, I believe it is worth making.
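To illustrate the kind of change meant here, the following standalone sketch reads a line comment and treats the end of input as a valid terminator. It is not javalang's actual code: `read_line_comment` and its signature are assumptions made for this example only.

```python
from typing import Tuple


def read_line_comment(data: str, start: int) -> Tuple[str, int]:
    """Read a '//' comment starting at data[start].

    Returns the comment text and the index of the first character after it.
    Hypothetical sketch, not javalang's tokenizer.
    """
    if not data.startswith("//", start):
        raise ValueError("no line comment at this position")

    end = data.find("\n", start + 2)
    if end == -1:
        # No line break found: the comment runs to the end of the input.
        end = len(data)
    else:
        # Include the terminating newline in the consumed span.
        end += 1

    return data[start:end], end
```

The key point is that a missing newline yields a comment ending at the end of the input instead of `None`, so the caller's `startswith("/**")` check always receives a string.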