AdvancedHTMLParser
AttributeError: 'NoneType' object has no attribute 'strip'
Description: AdvancedHTMLParser.AdvancedHTMLParser().parseStr raises an AttributeError when it is given an HTML-like string that contains a corrupted <style> tag.
String input:
<!DOCTYPE html><html><head><title>W33ZpsIOCysn9GGU45y0LW9EpuPHBlAuxCRRusKRvowefQLMy2</title><style:p { color: red; }</style></head><body><ul><li>rp52OnfCuzqBsp7</li><li>wrAAhIfvfpvMeyoTdmoF1oxezMhscNlgTqo0fPhfUS7XWZvECi2iVMsldLpqJq6W34KuOeoJ74cx5</li><li>8ymeXTKNEDb3jDnYwKt3lFMc4s7pJxDIVgSXljWIlOjv7JGr8cXf8SJOmpiyD05PyTzj9UATCFo1XqBpCqXR7KcjUYinCI4kZYI</li></ul> 6L1gB6g0z</body></html>
Bytearray input:
[60, 33, 68, 79, 67, 84, 89, 80, 69, 32, 104, 116, 109, 108, 62, 60, 104, 116, 109, 108, 62, 60, 104, 101, 97, 100, 62, 60, 116, 105, 116, 108, 101, 62, 87, 51, 51, 90, 112, 115, 73, 79, 67, 121, 115, 110, 57, 71, 71, 85, 52, 53, 121, 48, 76, 87, 57, 69, 112, 117, 80, 72, 66, 108, 65, 117, 120, 67, 82, 82, 117, 115, 75, 82, 118, 111, 119, 101, 102, 81, 76, 77, 121, 50, 60, 47, 116, 105, 116, 108, 101, 62, 60, 115, 116, 121, 108, 101, 58, 112, 32, 123, 32, 99, 111, 108, 111, 114, 58, 32, 114, 101, 100, 59, 32, 125, 60, 47, 115, 116, 121, 108, 101, 62, 60, 47, 104, 101, 97, 100, 62, 60, 98, 111, 100, 121, 62, 60, 117, 108, 62, 60, 108, 105, 62, 114, 112, 53, 50, 79, 110, 102, 67, 117, 122, 113, 66, 115, 112, 55, 60, 47, 108, 105, 62, 60, 108, 105, 62, 119, 114, 65, 65, 104, 73, 102, 118, 102, 112, 118, 77, 101, 121, 111, 84, 100, 109, 111, 70, 49, 111, 120, 101, 122, 77, 104, 115, 99, 78, 108, 103, 84, 113, 111, 48, 102, 80, 104, 102, 85, 83, 55, 88, 87, 90, 118, 69, 67, 105, 50, 105, 86, 77, 115, 108, 100, 76, 112, 113, 74, 113, 54, 87, 51, 52, 75, 117, 79, 101, 111, 74, 55, 52, 99, 120, 53, 60, 47, 108, 105, 62, 60, 108, 105, 62, 56, 121, 109, 101, 88, 84, 75, 78, 69, 68, 98, 51, 106, 68, 110, 89, 119, 75, 116, 51, 108, 70, 77, 99, 52, 115, 55, 112, 74, 120, 68, 73, 86, 103, 83, 88, 108, 106, 87, 73, 108, 79, 106, 118, 55, 74, 71, 114, 56, 99, 88, 102, 56, 83, 74, 79, 109, 112, 105, 121, 68, 48, 53, 80, 121, 84, 122, 106, 57, 85, 65, 84, 67, 70, 111, 49, 88, 113, 66, 112, 67, 113, 88, 82, 55, 75, 99, 106, 85, 89, 105, 110, 67, 73, 52, 107, 90, 89, 73, 60, 47, 108, 105, 62, 60, 47, 117, 108, 62, 32, 54, 76, 49, 103, 66, 54, 103, 48, 122, 60, 47, 98, 111, 100, 121, 62, 60, 47, 104, 116, 109, 108, 62]
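The byte list is the same document as the string input, encoded as ASCII. A minimal sketch of turning such a list back into a string, using only the first nine values from the list above (the names `prefix` and `decoded` are illustrative):

```python
# First nine values of the byte list above; the full list decodes the same way.
prefix = [60, 33, 68, 79, 67, 84, 89, 80, 69]

# bytes() accepts an iterable of integers in range(256).
decoded = bytes(prefix).decode("ascii")
print(decoded)  # <!DOCTYPE
```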
Code that reproduces the error:
import AdvancedHTMLParser
parser = AdvancedHTMLParser.AdvancedHTMLParser()
parser.parseStr(string_input)  # string_input is the "String input" value shown above
Expected Result: The parser ignores the invalid input or raises a specific exception (like MultipleRootNodeException).
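To illustrate the "ignore invalid input" option, here is a minimal sketch of a style parser that tolerates a missing value. The function `style_to_dict` is hypothetical and is not the library's `StyleAttribute.styleToDict`; it only shows the kind of guard that would avoid calling `.strip()` on None.

```python
def style_to_dict(style_str):
    """Parse 'a: b; c: d' into a dict; treat None or blank input as no styles."""
    # Guard: a style attribute without a value may arrive as None.
    if style_str is None:
        return {}
    result = {}
    for declaration in style_str.strip().split(";"):
        name, sep, value = declaration.partition(":")
        name, value = name.strip(), value.strip()
        # Skip empty fragments and declarations without a colon.
        if sep and name:
            result[name] = value
    return result


print(style_to_dict(None))
print(style_to_dict("color: red; "))
```

Raising a dedicated exception instead of returning an empty dict would be the other behaviour requested in this report; which one fits better is up to the maintainers.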
Actual Result:
Traceback (most recent call last):
  File "C:\Users\AmEl\IdeaProjects\Joker2023\src\main\python\main.py", line 55, in main
    python_method(input_data)
  File "C:\Users\AmEl\IdeaProjects\Joker2023\venv\Lib\site-packages\AdvancedHTMLParser\Parser.py", line 980, in parseStr
    self.feed(html)
  File "C:\Users\AmEl\IdeaProjects\Joker2023\venv\Lib\site-packages\AdvancedHTMLParser\Parser.py", line 948, in feed
    HTMLParser.feed(self, contents)
  File "C:\Users\AmEl\AppData\Local\Programs\Python\Python312\Lib\html\parser.py", line 111, in feed
    self.goahead(0)
  File "C:\Users\AmEl\AppData\Local\Programs\Python\Python312\Lib\html\parser.py", line 171, in goahead
    k = self.parse_starttag(i)
        ^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\AmEl\AppData\Local\Programs\Python\Python312\Lib\html\parser.py", line 338, in parse_starttag
    self.handle_starttag(tag, attrs)
  File "C:\Users\AmEl\IdeaProjects\Joker2023\venv\Lib\site-packages\AdvancedHTMLParser\Parser.py", line 138, in handle_starttag
    newTag = AdvancedTag(tagName, attributeList, isSelfClosing, ownerDocument=self)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\AmEl\IdeaProjects\Joker2023\venv\Lib\site-packages\AdvancedHTMLParser\Tags.py", line 196, in __init__
    myAttributes[key] = value
    ~~~~~~~~~~~~^^^^^
  File "C:\Users\AmEl\IdeaProjects\Joker2023\venv\Lib\site-packages\AdvancedHTMLParser\SpecialAttributes.py", line 96, in __setitem__
    tag.style = StyleAttribute(value, tag)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\AmEl\IdeaProjects\Joker2023\venv\Lib\site-packages\AdvancedHTMLParser\SpecialAttributes.py", line 424, in __init__
    self._styleDict = StyleAttribute.styleToDict(styleValue)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\AmEl\IdeaProjects\Joker2023\venv\Lib\site-packages\AdvancedHTMLParser\SpecialAttributes.py", line 650, in styleToDict
    styleStr = styleStr.strip()
               ^^^^^^^^^^^^^^^
AttributeError: 'NoneType' object has no attribute 'strip'
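The traceback shows the failure happening while a `style` attribute is being assigned, which suggests the attribute value is None at that point. One plausible explanation (an assumption, not confirmed by the maintainers) is that the standard library parser reads the trailing `style` in the broken markup as an attribute without a value, and it reports valueless attributes as None. The sketch below uses only `html.parser`; `AttrRecorder` and `start_tags` are illustrative names.

```python
from html.parser import HTMLParser


class AttrRecorder(HTMLParser):
    """Record the (tag, attrs) pairs that the stdlib parser reports."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        self.seen.append((tag, attrs))


def start_tags(markup):
    """Feed markup to a fresh recorder and return the start tags it saw."""
    recorder = AttrRecorder()
    recorder.feed(markup)
    recorder.close()
    return recorder.seen


# A valueless attribute is reported with the value None.
print(start_tags("<div style>"))

# The short input from this report; the exact output may differ
# between Python versions, so none is asserted here.
print(start_tags("<s</style>"))
```

If the second call shows a `style` attribute whose value is None, that would match the `styleStr.strip()` failure above.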
Additional information:
- OS: Windows 10, 22H2 (19045.4984)
- Python version: Python 3.12.6
- The error can also be reproduced with input as short as this:
<s</style>
P.S. The same information is available in reportAttributeError.txt.