Test failures with libxml2 2.14
After upgrading libxml2 from 2.13.8 to 2.14.3 we are seeing the following test failures with pyquery 2.0.1.
____________________ TestManipulating.test_val_for_textarea ____________________
self = <tests.test_pyquery.TestManipulating testMethod=test_val_for_textarea>
def test_val_for_textarea(self):
d = pq(self.html3)
self.assertEqual(d('#textarea-single').val(), 'Spam')
self.assertEqual(d('#textarea-single').text(), 'Spam')
d('#textarea-single').val('42')
self.assertEqual(d('#textarea-single').val(), '42')
# Note: jQuery still returns 'Spam' here.
self.assertEqual(d('#textarea-single').text(), '42')
multi_expected = '''Spam\n<b>Eggs</b>\nBacon'''
> self.assertEqual(d('#textarea-multi').val(), multi_expected)
E AssertionError: 'Spam\n<b>Eggs</b>\nBacon' != 'Spam\n<b>Eggs</b>\nBacon'
E Spam
E - <b>Eggs</b>
E + <b>Eggs</b>
E Bacon
tests/test_pyquery.py:534: AssertionError
_______________________ TestHTMLParser.test_replaceWith ________________________
self = <tests.test_pyquery.TestHTMLParser testMethod=test_replaceWith>
def test_replaceWith(self):
expected = '''<div class="portlet">
<a href="/toto">TestimageMy link text</a>
<a href="/toto2">imageMy link text 2</a>
Behind you, a three-headed HTML&dash;Entity!
</div>'''
d = pq(self.html)
d('img').replace_with('image')
val = d.__html__()
> assert val == expected, (repr(val), repr(expected))
E AssertionError: ('\'<div class="portlet">\
E <a href="/toto">TestimageMy link text</a>\
E <a href="/toto2">imageMy link text... <a href="/toto2">imageMy link text 2</a>\
E Behind you, a three-headed HTML&dash;Entity!\
E </div>\'')
E assert '<div class="...!\n </div>' == '<div class="...!\n </div>'
E
E Skipping 144 identical leading characters in diff, use -v to show
E - eaded HTML&dash;Entity!
E ? ^^^^^^^^^^
E + eaded HTML‐Entity!
E ? ^
E </div>
tests/test_pyquery.py:810: AssertionError
________________ TestHTMLParser.test_replaceWith_with_function _________________
self = <tests.test_pyquery.TestHTMLParser testMethod=test_replaceWith_with_function>
def test_replaceWith_with_function(self):
expected = '''<div class="portlet">
TestimageMy link text
imageMy link text 2
Behind you, a three-headed HTML&dash;Entity!
</div>'''
d = pq(self.html)
d('a').replace_with(lambda i, e: pq(e).html())
val = d.__html__()
> assert val == expected, (repr(val), repr(expected))
E AssertionError: ('\'<div class="portlet">\
E TestimageMy link text\
E imageMy link text 2\
E Behind you, a three-headed...imageMy link text\
E imageMy link text 2\
E Behind you, a three-headed HTML&dash;Entity!\
E </div>\'')
E assert '<div class="...!\n </div>' == '<div class="...!\n </div>'
E
E Skipping 103 identical leading characters in diff, use -v to show
E - eaded HTML&dash;Entity!
E ? ^^^^^^^^^^
E + eaded HTML‐Entity!
E ? ^
E </div>
tests/test_pyquery.py:821: AssertionError
__________________________ TestWebScrapping.test_get ___________________________
self = <tests.test_pyquery.TestWebScrapping testMethod=test_get>
def test_get(self):
d = pq(url=self.application_url, data={'q': 'foo'},
method='get')
print(d)
> self.assertIn('REQUEST_METHOD: GET', d('p').text())
E AssertionError: 'REQUEST_METHOD: GET' not found in ''
tests/test_pyquery.py:902: AssertionError
----------------------------- Captured stdout call -----------------------------
<span>HTTP_ACCEPT: */*
HTTP_ACCEPT_ENCODING: gzip, deflate
HTTP_CONNECTION: keep-alive
HTTP_HOST: 127.0.0.1:52473
HTTP_USER_AGENT: python-requests/2.32.3
PATH_INFO: /
QUERY_STRING: q=foo
REMOTE_ADDR: 127.0.0.1
REMOTE_HOST: 127.0.0.1
REMOTE_PORT: 37262
REQUEST_METHOD: GET
REQUEST_URI: /?q=foo
SCRIPT_NAME:
SERVER_NAME: waitress.invalid
SERVER_PORT: 52473
SERVER_PROTOCOL: HTTP/1.1
SERVER_SOFTWARE: waitress
waitress.client_disconnected: <bound method="" httpchannel.check_client_disconnected="" of="" <waitress.channel.httpchannel="" connected="" 127.0.0.1:37262="" at="" 0x7ffff52f3f00=
webob._parsed_query_vars: (GET([('q', 'foo')]), 'q=foo')
wsgi.errors: <encodedfile name="<_io.FileIO name=8 mode='rb+' closefd=True>" mode="r+" encoding="utf-8">
wsgi.file_wrapper: <class 'waitress.buffers.readonlyfilebasedbuffer'="">
wsgi.input: <_io.BytesIO object at 0x7ffff51e04f0>
wsgi.input_terminated: True
wsgi.multiprocess: False
wsgi.multithread: True
wsgi.run_once: False
wsgi.url_scheme: 'http'
wsgi.version: (1, 0)
</class></encodedfile></bound></span>
__________________________ TestWebScrapping.test_post __________________________
self = <tests.test_pyquery.TestWebScrapping testMethod=test_post>
def test_post(self):
d = pq(url=self.application_url, data={'q': 'foo'},
method='post')
> self.assertIn('REQUEST_METHOD: POST', d('p').text())
E AssertionError: 'REQUEST_METHOD: POST' not found in ''
tests/test_pyquery.py:908: AssertionError
________________________ TestWebScrapping.test_session _________________________
self = <tests.test_pyquery.TestWebScrapping testMethod=test_session>
def test_session(self):
if HAS_REQUEST:
import requests
session = requests.Session()
session.headers.update({'X-FOO': 'bar'})
d = pq(url=self.application_url, data={'q': 'foo'},
method='get', session=session)
> self.assertIn('HTTP_X_FOO: bar', d('p').text())
E AssertionError: 'HTTP_X_FOO: bar' not found in ''
tests/test_pyquery.py:918: AssertionError
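The two test_replaceWith failures differ from the expected string only in how &dash; comes out. The expected value keeps the literal text, while the actual output contains U+2010 (HYPHEN). In the HTML5 named character reference table, dash; maps to U+2010, so the new output is what a parser that recognizes this entity would produce. The sketch below uses only Python's standard html module to show that mapping. It does not call libxml2 or lxml, and attributing the change to the HTML parser in libxml2 2.14 is an assumption drawn from the logs above.

```python
import html
import unicodedata

# The expected string in the tests keeps the entity as literal text.
expected_fragment = "HTML&dash;Entity!"

# Decode with the HTML5 named character reference table (stdlib only;
# this is NOT libxml2, just an illustration of the HTML5 mapping).
decoded_fragment = html.unescape(expected_fragment)

# The character that replaced "&dash;" sits right after "HTML".
hyphen = decoded_fragment[4]

print(repr(decoded_fragment))
print(f"U+{ord(hyphen):04X}", unicodedata.name(hyphen))
```

If that assumption holds, the expected strings in the tests rather than the parser output would be the part to update.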
The first link I found is https://dart.reviewpoint.org/blog/py3-lxml-fails-to-build
I guess the real problem is with lxml.
Right, sorry for not mentioning that earlier. We also upgraded lxml from 5.3.1 to 5.4.0.
https://github.com/lxml/lxml/blob/lxml-5.4.0/CHANGES.txt
The changelog says "Binary wheels use libxml2 2.13.8", so I'm not sure lxml 5.4.0 is compatible with 2.14. That's the point.
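To check which libxml2 an lxml build is actually using, lxml exposes both the version it was compiled against and the version loaded at run time, as etree.LIBXML_COMPILED_VERSION and etree.LIBXML_VERSION. Below is a minimal sketch that prints both. The helper name describe_libxml2 is made up for illustration, and comparing only the major and minor numbers is an assumption about what counts as a relevant mismatch.

```python
def describe_libxml2(compiled, runtime):
    """Summarise the libxml2 version lxml was built with vs. the one in use.

    Both arguments are version tuples such as (2, 13, 8).
    """
    def fmt(version):
        return ".".join(str(part) for part in version)

    # Assumption: a differing major.minor series is what matters here.
    if tuple(compiled[:2]) == tuple(runtime[:2]):
        status = "same minor series"
    else:
        status = "different minor series"
    return f"compiled against {fmt(compiled)}, running with {fmt(runtime)} ({status})"


if __name__ == "__main__":
    try:
        from lxml import etree
    except ImportError:
        print("lxml is not installed")
    else:
        print(describe_libxml2(etree.LIBXML_COMPILED_VERSION,
                               etree.LIBXML_VERSION))
```

Posting that one line of output alongside a failure report makes it clear whether a distribution build or a binary wheel is involved.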
I see the same test failures with libxml2 2.14.5, python-lxml 6.0.2, and python-pyquery 2.0.1.
Here are build logs:
python-lxml
https://build.opensuse.org/package/show/home:pgajdos:libxml2/python-lxml
https://build.opensuse.org/package/live_build_log/home:pgajdos:libxml2/python-lxml/openSUSE_Tumbleweed/x86_64
python-pyquery
https://build.opensuse.org/package/show/home:pgajdos:libxml2/python-pyquery
https://build.opensuse.org/package/live_build_log/home:pgajdos:libxml2/python-pyquery:test/openSUSE_Tumbleweed/x86_64
python-lxml 6.0.2 changelog says:
- LP#2125278: Compilation with libxml2 2.15.0 failed. Original patch by Xi Ruoyao.
- Setting decompress=True in the parser had no effect in libxml2 2.15.
- Binary wheels on Linux and macOS use the library version libxml2 2.14.6. See https://gitlab.gnome.org/GNOME/libxml2/-/releases/v2.14.6
- Test failures in libxml2 2.15.0 were fixed.
So I guess it is supposed to work with libxml2 2.14. If there is any more information I could provide, let me know. I don't have much background in this area, though.
(I am skipping these tests for now; otherwise the build fails as the reporter describes. I can remove them from the skip list whenever you want.)
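For anyone who needs the same workaround, here is one possible way to skip exactly the six tests named in the report, using pytest's --deselect option with full node IDs. This is only a sketch: the thread does not say how the skip list is actually implemented in the package, and the helper name deselect_args is made up for illustration. The class names, test names and file path are taken from the failure logs above.

```python
import shlex

# Failing tests from the report, grouped by test class.
FAILING = {
    "TestManipulating": ["test_val_for_textarea"],
    "TestHTMLParser": ["test_replaceWith", "test_replaceWith_with_function"],
    "TestWebScrapping": ["test_get", "test_post", "test_session"],
}
TEST_FILE = "tests/test_pyquery.py"


def deselect_args(failing=FAILING, test_file=TEST_FILE):
    """Build pytest arguments that deselect each failing test by node ID."""
    args = []
    for class_name, test_names in failing.items():
        for test_name in test_names:
            args += ["--deselect", f"{test_file}::{class_name}::{test_name}"]
    return args


if __name__ == "__main__":
    # Print a command line that can be pasted into a shell or build recipe.
    print("pytest " + shlex.join(deselect_args()))
```

Node IDs are used here instead of a -k expression because -k matches substrings of test names and could deselect unrelated tests.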