rdflib
RFE: move away from deprecated `html5lib`
Is your feature request related to a problem? Please describe.
It would be nice to cut the tail of legacy module dependencies.
One of those modules is html5lib.
Describe the solution you'd like
It would be good to remove the use of the deprecated html5lib, as was done in pip about 2 years ago:
https://github.com/pypa/pip/pull/11259
Describe alternatives you've considered
Additional context
html5lib depends on six, which has been on the list of deprecated modules even longer. Implementing this RFE would make it easier to kill two birds with one stone 😋
I support this idea of moving away from html5lib. @kloczek please do make a PR for this!
I've looked into this. It looks like html5lib is used by Literals with datatype rdf:HTML. html5lib parses the literal's lexical form, checks that it is valid, and normalizes it (I'm not sure what that involves). html5lib is also used for serializing rdf:HTML literals back to text when required.
Two candidates for replacing it are BeautifulSoup (I've used it before, but it's quite different from html5lib) and Python's built-in html.parser module (that is what pip used when moving away from html5lib).
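To make the second option more concrete, here is a minimal sketch of what "parse, check, and re-serialize" could look like using only Python's built-in `html.parser` module. This is an illustration, not what rdflib or the PR actually does: the names `FragmentNormalizer` and `normalize_html_fragment` are hypothetical, and `html.parser` is a tokenizer rather than a spec-compliant HTML5 tree builder, so it will not reproduce html5lib's error recovery. Comments and doctypes are simply dropped here.

```python
from html import escape
from html.parser import HTMLParser

# Elements that never take an end tag in HTML.
VOID_TAGS = {
    "area", "base", "br", "col", "embed", "hr", "img",
    "input", "link", "meta", "source", "track", "wbr",
}


class FragmentNormalizer(HTMLParser):
    """Hypothetical helper: re-serializes a fragment and records nesting problems."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []      # pieces of the re-serialized output
        self.open_tags = []  # stack of currently open non-void elements
        self.errors = []     # human-readable nesting problems

    def handle_starttag(self, tag, attrs):
        # HTMLParser already lowercases tag and attribute names.
        rendered = "".join(
            f" {name}" if value is None
            else f' {name}="{escape(value, quote=True)}"'
            for name, value in attrs
        )
        self.parts.append(f"<{tag}{rendered}>")
        if tag not in VOID_TAGS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return  # e.g. the implicit end of <br/>
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()
            self.parts.append(f"</{tag}>")
        else:
            self.errors.append(f"unexpected </{tag}>")

    def handle_data(self, data):
        # Character references were decoded by the parser; escape them again.
        self.parts.append(escape(data, quote=False))


def normalize_html_fragment(text):
    """Return (normalized_text, errors) for an HTML fragment."""
    parser = FragmentNormalizer()
    parser.feed(text)
    parser.close()  # flushes any buffered trailing text
    errors = parser.errors + [f"unclosed <{t}>" for t in parser.open_tags]
    return "".join(parser.parts), errors
```

For example, `normalize_html_fragment('<P CLASS="intro">Hi &amp; bye<BR></P>')` gives `'<p class="intro">Hi &amp; bye<br></p>'` with an empty error list, while `'<div><span>x</div>'` is reported as badly nested. Whether this level of checking is enough for rdf:HTML literals is an open question.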
I hope you guys can solve this! An html5lib issue is still hindering the use of the RDF-based HTML vocabulary (despite your efforts to get it fixed), together with another RDFLib issue. It would be so great if this could be solved!
@kloczek @floresbakker
I have a PR, #2911 (https://github.com/RDFLib/rdflib/pull/2911), that will replace html5lib with html5lib-modern, which does not use six.
Note, however, that this is not the last source of a six sub-dependency.
The isodate module used in RDFLib also depends on six, and it also hasn't been updated in over 3 years (though it is not marked as deprecated like html5lib).
The path to fix, upgrade, or replace isodate is not so clear.
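For anyone who wants to check which packages in their own environment still pull in six, a rough sketch using the standard library's `importlib.metadata` might look like this. The function name `dists_requiring` is made up for illustration, and the result depends entirely on what is installed locally, so it says nothing definitive about rdflib's dependency tree.

```python
import re
from importlib import metadata


def _normalize(name):
    # Compare project names case-insensitively, treating "-", "_" and "." alike.
    return re.sub(r"[-_.]+", "-", name).lower()


def dists_requiring(target):
    """Return sorted names of installed distributions that declare `target` as a requirement."""
    wanted = _normalize(target)
    found = set()
    for dist in metadata.distributions():
        for req in dist.requires or []:
            # A requirement looks like "six>=1.9; python_version < '3'";
            # only the leading project name matters here.
            match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", req)
            if match and _normalize(match.group(1)) == wanted:
                name = dist.metadata["Name"]
                if name:
                    found.add(name)
    return sorted(found)
```

Calling `dists_requiring("six")` lists the direct dependents only; it does not follow transitive requirements or evaluate environment markers.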
Cannot find html5lib-modern on PyPI.
Why not move to html5lib? 🤔
@kloczek It is published on PyPI here: https://pypi.org/project/html5lib-modern/
> Why not move to html5lib?

html5lib is the abandoned and deprecated dependency we are moving away from (you are the one who raised the issue).
My mistake. Sorry.