wikitextparser
Infinite loop with CR (\x0d) in table parsing
In one of my test-sets I forgot to sanitize the input, and as a result I had a \r (without any \n). This caused a funny effect I thought you might want to know about.
import wikitextparser
wikitextparser.parse("{|\n}\n\r").get_tables()[0].data()
This causes an infinite loop. The same happens if you replace \r with \x0b or \x0c, but that is even more nonsense ofc.
The code at
https://github.com/5j9/wikitextparser/blob/f64e098b0ba040595f6fc427edf6409308761bd0/wikitextparser/_table.py#L94
returns -1, after which _lstrip_increase increases that back to 0, and it repeats.
Personally, I think this is not a bug in your library, as a string ending in a \r is just weird. But I didn't want to keep this finding from you either, just in case I am missing something else here :)
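For anyone hitting the same thing, here is a minimal sketch of the kind of input sanitizing that was skipped above. The helper name normalize_line_endings is hypothetical and not part of wikitextparser; it only assumes that turning CRLF, a lone \r, \x0b and \x0c into \n is acceptable for your input.

```python
import re

# Hypothetical helper, not part of wikitextparser.
# Matches CRLF first, then any lone CR, vertical tab, or form feed.
_STRAY_LINE_BREAKS = re.compile(r"\r\n|[\r\x0b\x0c]")


def normalize_line_endings(text: str) -> str:
    """Replace CRLF, lone CR, VT and FF with a single LF each."""
    return _STRAY_LINE_BREAKS.sub("\n", text)
```

Running the text through this before calling wikitextparser.parse() means the parser never sees a trailing \r. Whether that actually avoids the loop has not been verified here.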
I also found a possibly related issue. For example:
import wikitextparser
wikitextparser.parse("{|\n}\n").get_tables()[0].data()
triggers:
IndexError: bytearray index out of range
on line 93 of _table.py.
Sadly, this is text users in TrueWiki have been entering, but I can capture that error on my side. Just mentioning it, as there might be something else going on here actually :)
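A rough sketch of catching that error on the caller's side, in case it helps someone else. The wrapper table_data_or_none and its extract parameter are hypothetical names made up for illustration; the only assumption is that the failure surfaces as an IndexError, as in the traceback above.

```python
from typing import Callable, List, Optional

Rows = List[List[Optional[str]]]


def table_data_or_none(extract: Callable[[str], Rows], text: str) -> Optional[Rows]:
    """Call extract(text); return None if it raises IndexError."""
    try:
        return extract(text)
    except IndexError:
        # Malformed table markup (or no table at all): signal "no data".
        return None
```

Here extract could be something like lambda t: wikitextparser.parse(t).get_tables()[0].data(). Note that this also swallows the IndexError raised when the text contains no table at all, which may or may not be what you want.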
As always, tnx for the awesome library! :D