jsoup
jsoup copied to clipboard
Add option to sort attributes in HTML output.
This is helpful to compare documents and consider documents equal if the only difference is attributes order.
Hi,
If the goal is to:
consider documents equal if the only difference is attributes order
Why not make this a comparator function of an Element / node? E.g. we have equals()
and hasSameValue()
.
And can you provide some detail on why that's useful / required? What documents do you have that you are testing for this and why is the attribute sort order the only unstable (?) thing about them? Trying to understand the utility of it.
Hi,
If the goal is to:
consider documents equal if the only difference is attributes order
Why not make this a comparator function of an Element / node? E.g. we have
equals()
andhasSameValue()
.And can you provide some detail on why that's useful / required? What documents do you have that you are testing for this and why is the attribute sort order the only unstable (?) thing about them? Trying to understand the utility of it.
I would like to use this for approval tests. The order of attributes can be changed by the developer in a template, or it can change depending on the version of the template engine, but it should have no impact on the outcome of the test. But if the value of an attribute changes, the HTML must validated again, because it could have an impact on the rendered document. Ideally, I would render the HTML, take a snapshot image, and compare the images pixel by pixel, but that would be slower and more complicated. Comparing the HTML while ignoring changes that are known not to have an impact on the rendered document looks like a good compromise.
With the proposed patch, I could do somehting like this :
val actualHtml = Jsoup.parse(actual);
actualHtml.outputSettings().sortAttributes(false);
val expectedHtml = Jsoup.parse(expected);
expectedHtml.outputSettings().sortAttributes(true);
if (actualHtml.outerHtml().equals(expectedHtml.outerHtml()))
return VerifyResult.SUCCESS;
else
return VerifyResult.FAILURE;
But it could indeed be a comparator.
Thanks for the context, the use case makes sense. On review, I am choosing not to include this feature as-is.
Here's an example of how to implement it in user-space:
import org.jsoup.Jsoup;
import org.jsoup.nodes.Attribute;
import org.jsoup.nodes.Document;
import java.util.Comparator;
import java.util.List;
class Scratch {
public static void main(String[] args) {
String html = "<p beta=1 alpha=2>One\n<p foo=3 bar=4>Two";
Document doc = Jsoup.parse(html);
System.out.println("Original: ");
System.out.println(doc.html());
doc.stream().forEach(el -> {
List<Attribute> attributes = el.attributes().asList();
el.clearAttributes();
attributes.stream().sorted(Comparator.comparing(Attribute::getKey)).forEach(attr -> {
el.attributes().put(attr);
});
});
System.out.println("\n\nAfter sorting: ");
System.out.println(doc.html());
}
}
Giving:
Original:
<html>
<head></head>
<body>
<p beta="1" alpha="2">One</p>
<p foo="3" bar="4">Two</p>
</body>
</html>
After sorting:
<html>
<head></head>
<body>
<p alpha="2" beta="1">One</p>
<p bar="4" foo="3">Two</p>
</body>
</html>
And because this modifies the order in the DOM, the other DOM comparators will then consider the trees equal.