
BeautifulSoup prints a GuessedAtParserWarning

Open • NelsonMinar opened this issue 1 year ago • 1 comment

Running webpreview in its default configuration yields this warning:

webpreview/previews.py:51: GuessedAtParserWarning: No parser was explicitly specified, so I'm using the best available HTML parser for this system ("html.parser"). This usually isn't a problem, but if you run this code on another system, or in a different virtual environment, it may use a different parser and behave differently.

The code that caused this warning is on line 51 of the file /home/nelson/src/linkblog/pinboard-to-static/venv/lib/python3.9/site-packages/webpreview/previews.py. To get rid of this warning, pass the additional argument 'features="html.parser"' to the BeautifulSoup constructor.

Presumably this comes from a change in BeautifulSoup since the last webpreview release. Everything still works; the warning is just annoying. Adding the suggested features argument does make the warning go away, but it raises the question of whether other parsers should be configurable.
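For reference, a minimal sketch of the suggested fix, calling BeautifulSoup directly rather than through webpreview. The markup and variable names are illustrative assumptions, not code from previews.py.

```python
import warnings

from bs4 import BeautifulSoup, GuessedAtParserWarning

# Illustrative markup only; not taken from webpreview.
html = "<html><head><title>Example</title></head><body></body></html>"

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Naming the parser explicitly means BeautifulSoup does not have to guess.
    soup = BeautifulSoup(html, features="html.parser")

# Keep only the parser-guessing warnings, if any were recorded.
guessed = [w for w in caught if issubclass(w.category, GuessedAtParserWarning)]
print(soup.title.string, len(guessed))
```

With the parser named explicitly, no GuessedAtParserWarning should be recorded, so `guessed` stays empty.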

NelsonMinar, Jul 14 '22 18:07

Hi @NelsonMinar, thank you for highlighting this issue.

This warning should be gone starting with version 1.7.2, because BeautifulSoup is now initialized with a default parser ("html.parser") unless a different one is specified.

https://github.com/ludbek/webpreview/blob/f9f778191cc613599c940aed78fbb5cf28c9a86c/webpreview/parsers.py#L163-L171
https://github.com/ludbek/webpreview/blob/f9f778191cc613599c940aed78fbb5cf28c9a86c/webpreview/parsers.py#L191

This still allows users to specify a parser of their own choice, such as the slow but accurate "html5lib".
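The general pattern looks roughly like the sketch below. This is not the actual code in parsers.py: `make_soup` and `DEFAULT_PARSER` are hypothetical names used only to illustrate falling back to "html.parser" when the caller does not pick a parser.

```python
from typing import Optional

from bs4 import BeautifulSoup

# Hypothetical default; mirrors the "html.parser" fallback described above.
DEFAULT_PARSER = "html.parser"


def make_soup(markup: str, parser: Optional[str] = None) -> BeautifulSoup:
    """Build a BeautifulSoup object, always passing an explicit parser name."""
    # Use the caller's parser if given, otherwise the built-in one.
    return BeautifulSoup(markup, features=parser or DEFAULT_PARSER)


soup = make_soup("<title>Example</title>")
print(soup.title.string)

# Opting into another parser would look like this; it requires the
# html5lib package to be installed, so it is left commented out here.
# soup = make_soup("<title>Example</title>", parser="html5lib")
```

Because a parser name is always passed, BeautifulSoup never has to guess, whichever parser the caller chooses.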

Please let me know if the issue is gone for you in version 1.7.2 🙂

vduseev, Aug 12 '22 15:08