loofah
loofah copied to clipboard
jruby test failures
So many test failures. What is going on?
Which basically blocks the use of Rails 4.2 with JRuby, at least if you are concerned about working html sanitization.
Most if not all of the problems seem to be Nokogiri-Java problems where it behaves different from C-Nokogiri. I started working on some Nokogiri test cases that reproduce what loofah does, see sparklemotion/nokogiri#1318, sparklemotion/nokogiri#1319 and sparklemotion/nokogiri#1320. At least the last one looks more like a Xalan-J bug...
I have confirmed https://github.com/sparklemotion/nokogiri/pull/1318 reduces jruby errors. https://travis-ci.org/flavorjones/loofah/builds/80234142 https://travis-ci.org/marutosi/loofah/builds/90518033
Much of this is:
- https://github.com/sparklemotion/nokogiri/pull/1319
- https://github.com/sparklemotion/nokogiri/pull/1320
which may be addressed in https://github.com/sparklemotion/nokogiri/pull/1604
Ping! Trying to get JRuby deps cleaned up again now that we support Rails 5.
@headius We (nokogiri core) need help on the JRuby implementation. I've pinged you and other members of the JRuby community a few times since November 2014 on the subject, so hopefully that's not surprising news.
Taking a look at the most recent failing pipeline:
https://ci.nokogiri.org/teams/nokogiri-core/pipelines/loofah/jobs/jruby-9.1/builds/12
most of these failures are what I'd refer to as "soft" failures, meaning that the markup is sanitized properly, but there are minor formatting differences that can be attributed to small differences in the parsing engines.
There are a few, however, that are NPEs, and for those I'd welcome PRs to Nokogiri to fix them.
Additionally, there are about 30 open issues for Nokogiri that are labeled jruby
, and I'd welcome help with those as well.
I see there are still quite some Jruby test failures. @headius @flavorjones: is Loofah currently safe to use on JRuby?
(Context: we are looking into replacing Sanitize with Loofah in gollum due to lack of JRuby support in newer versions of Sanitize.)
Ping @flavorjones
@dometto If you're intent on using Loofah on JRuby, can I ask for your help investigating some of the errors being raised by Nokogiri's JRuby implementation?
Specifically, some errors are due to the differences in behavior between libxml2 and xerces/nekohtml. But others are indicative of issues with the JRuby implementation of Nokogiri. Differences in parser behavior are likely "safe" for some definition of safe. Errors, though, are potential Denial-of-Service vectors, and so I'm hesitant to describe Loofah as "safe" when run on JRuby.
For what it's worth, we have far fewer errors now than in 2017 thanks to the efforts of @kares, @jvshahid, and others on the Nokogiri JRuby implementation.
Here are the nine errors I can clearly identify as part of this category:
test_0001_only_scrub_subtree(Document::Node::#scrub!):
NoMethodError: undefined method `scrub!' for #<Nokogiri::XML::Element:0x49ede9c7>
/home/flavorjones/code/oss/loofah/test/integration/test_scrubbers.rb:179:in `block in test_0001_only_scrub_subtree'
test_0001_only_scrub_subtrees(Document::NodeSet::#scrub!):
NoMethodError: undefined method `scrub!' for #<Nokogiri::XML::NodeSet:0x741f8dbe>
/home/flavorjones/code/oss/loofah/test/integration/test_scrubbers.rb:204:in `block in test_0001_only_scrub_subtrees'
test_0007_scrubs_document_nodes(HTML):
NoMethodError: undefined method `scrub!' for #<Nokogiri::XML::Element:0x2af69643>
/home/flavorjones/code/oss/loofah/test/unit/test_api.rb:43:in `block in test_0007_scrubs_document_nodes'
test_0009_scrubs_document_nodesets(HTML):
NoMethodError: undefined method `scrub!' for #<Nokogiri::XML::NodeSet:0x210308d5>
/home/flavorjones/code/oss/loofah/test/unit/test_api.rb:56:in `block in test_0009_scrubs_document_nodesets'
test_should_allow_xml:lang_attribute(Html5TestSanitizer):
Java::JavaLang::NullPointerException:
nokogiri.XmlNamespace.createDefaultNamespace(XmlNamespace.java:167)
nokogiri.XmlAttr.setNamespaceIfNecessary(XmlAttr.java:106)
nokogiri.XmlNode.setNode(XmlNode.java:583)
nokogiri.internals.NokogiriHelpers.constructNode(NokogiriHelpers.java:147)
nokogiri.internals.NokogiriHelpers.getCachedNodeOrCreate(NokogiriHelpers.java:126)
nokogiri.XmlNode.attribute_nodes(XmlNode.java:690)
nokogiri.XmlNode$INVOKER$i$0$0$attribute_nodes.call(XmlNode$INVOKER$i$0$0$attribute_nodes.gen)
org.jruby.runtime.callsite.CachingCallSite.cacheAndCall(CachingCallSite.java:325)
...
test_should_allow_xml:base_attribute(Html5TestSanitizer):
Java::JavaLang::NullPointerException:
nokogiri.XmlNamespace.createDefaultNamespace(XmlNamespace.java:167)
nokogiri.XmlAttr.setNamespaceIfNecessary(XmlAttr.java:106)
nokogiri.XmlNode.setNode(XmlNode.java:583)
nokogiri.internals.NokogiriHelpers.constructNode(NokogiriHelpers.java:147)
nokogiri.internals.NokogiriHelpers.getCachedNodeOrCreate(NokogiriHelpers.java:126)
nokogiri.XmlNode.attribute_nodes(XmlNode.java:690)
nokogiri.XmlNode$INVOKER$i$0$0$attribute_nodes.call(XmlNode$INVOKER$i$0$0$attribute_nodes.gen)
org.jruby.runtime.callsite.CachingCallSite.cacheAndCall(CachingCallSite.java:325)
...
test_0059_testdata_sanitizer_xml_base(Html5TestSanitizer):
Java::JavaLang::NullPointerException:
nokogiri.XmlNamespace.createDefaultNamespace(XmlNamespace.java:167)
nokogiri.XmlAttr.setNamespaceIfNecessary(XmlAttr.java:106)
nokogiri.XmlNode.setNode(XmlNode.java:583)
nokogiri.internals.NokogiriHelpers.constructNode(NokogiriHelpers.java:147)
nokogiri.internals.NokogiriHelpers.getCachedNodeOrCreate(NokogiriHelpers.java:126)
nokogiri.XmlNode.attribute_nodes(XmlNode.java:690)
nokogiri.XmlNode$INVOKER$i$0$0$attribute_nodes.call(XmlNode$INVOKER$i$0$0$attribute_nodes.gen)
org.jruby.runtime.callsite.CachingCallSite.cacheAndCall(CachingCallSite.java:325)
...
test_should_allow_xml:space_attribute(Html5TestSanitizer):
Java::JavaLang::NullPointerException:
nokogiri.XmlNamespace.createDefaultNamespace(XmlNamespace.java:167)
nokogiri.XmlAttr.setNamespaceIfNecessary(XmlAttr.java:106)
nokogiri.XmlNode.setNode(XmlNode.java:583)
nokogiri.internals.NokogiriHelpers.constructNode(NokogiriHelpers.java:147)
nokogiri.internals.NokogiriHelpers.getCachedNodeOrCreate(NokogiriHelpers.java:126)
nokogiri.XmlNode.attribute_nodes(XmlNode.java:690)
nokogiri.XmlNode$INVOKER$i$0$0$attribute_nodes.call(XmlNode$INVOKER$i$0$0$attribute_nodes.gen)
org.jruby.runtime.callsite.CachingCallSite.cacheAndCall(CachingCallSite.java:325)
...
test_0007_scrubs_document_nodes(XML):
NoMethodError: undefined method `scrub!' for #<Nokogiri::XML::Element:0x1b1e1f02>
/home/flavorjones/code/oss/loofah/test/unit/test_api.rb:106:in `block in test_0007_scrubs_document_nodes'
~~haven't tested but the NPEs should be fixed~~ with https://github.com/flavorjones/loofah/issues/88 (pretty much a 'bug' in Nokogiri ext code)
UPDATE: confirmed - the Nokogiri PR https://github.com/flavorjones/loofah/issues/88 resolves the NPEs (4 less test failures in loofah's suite)
UPDATE: confirmed - the Nokogiri PR #88 resolves the NPEs (4 less test failures in loofah's suite)
@kares: that links to this issue, not to a Nokogiri PR. Has the relevant PR been merged yet?
Sorry for the slow reply, but I would be happy to look into the remaining failures (NoMethodError: undefined method
scrub!'`). Or is someone else tackling that already in the mean time?
yy - merged but not released yet. however you could build the nokogiri gem from master, haven't looked the scrub issue but I would check how it behaves against recent changes.
Can we get a Nokogiri release soon?
The 1.11.0 milestone is accurate and there's a lot of work to be done still. Getting there as quickly as I can. Let me know if you'd like to help, I can point you at relevant issues.
On Mon, Jan 13, 2020, 6:27 PM Charles Oliver Nutter < [email protected]> wrote:
Can we get a Nokogiri release soon?
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/flavorjones/loofah/issues/88?email_source=notifications&email_token=AAACAD24PDUADHHJ55O3CLLQ5T2HRA5CNFSM4BCB45UKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEI2V7DY#issuecomment-573923215, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAACAD4C6BAWRY5QVDDDKMTQ5T2HRANCNFSM4BCB45UA .
@flavorjones Oh I see. I will have a look at the milestone and see if I can help.
@flavorjones I made some progress on using jar-dependencies (https://github.com/sparklemotion/nokogiri/issues/1253, prerequisite for https://github.com/sparklemotion/nokogiri/issues/1395) but need some assistance from @mkristian. I provided two possible patches for https://github.com/sparklemotion/nokogiri/issues/1836.
I can certainly try to help with other things not marked as JRuby issues, but my capabilities will be more limited. 😀
Quite a few more tests will pass once the next Nokogiri release drops with https://github.com/sparklemotion/nokogiri/pull/2174. Failures at that point should mostly be actual differences in parser behavior.
Note that the branch at https://github.com/flavorjones/loofah/pull/239 allows us to extend the test suite to add the JRuby output as valid test data; all that would need to be done is to go through the failing tests and determine if the output is sanitized and if so, add it to the test suite.