htmlq
Case sensitivity: htmlq not preserving case?
Somehow htmlq converts element (tag) names to lowercase:
Expected
$ echo -e '<Need>\n <PreserveCase>True</PreserveCase>\n</Need>' | htmlq Need
<Need>
<PreserveCase>True</PreserveCase>
</Need>
Actual
$ echo -e '<Need>\n <PreserveCase>True</PreserveCase>\n</Need>' | htmlq Need
<need>
<preservecase>True</preservecase>
</need>
Here the tag <Need> becomes <need> and <PreserveCase> becomes <preservecase>, which is not what I expected.
Is it possible to preserve the exact case of the tag names, even behind an option?
Thanks!
HTML tag names are case-insensitive; it's XML that treats tag names as case-sensitive.
A quick search suggests the case conversion may come from html5ever, hence it's impossible to configure it on htmlq's end. See https://github.com/servo/html5ever/search?q=lowercase.
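To illustrate the HTML versus XML difference without htmlq itself, here is a minimal sketch using only Python's standard library. It is not how htmlq or html5ever work internally; it only shows that an HTML parser (`html.parser`) reports tag names in lowercase while an XML parser (`xml.etree.ElementTree`) keeps them as written. The helper names `TagCollector`, `html_tag_names`, and `xml_tag_names` are made up for this example.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Same input as in the report above.
SAMPLE = "<Need>\n <PreserveCase>True</PreserveCase>\n</Need>"


class TagCollector(HTMLParser):
    """Collects start-tag names exactly as the HTML parser reports them."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # html.parser passes the tag name already converted to lowercase.
        self.tags.append(tag)


def html_tag_names(markup):
    """Return the start-tag names seen by Python's HTML parser."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


def xml_tag_names(markup):
    """Return the element names seen by Python's XML parser."""
    root = ET.fromstring(markup)
    return [element.tag for element in root.iter()]


if __name__ == "__main__":
    print(html_tag_names(SAMPLE))  # ['need', 'preservecase']
    print(xml_tag_names(SAMPLE))   # ['Need', 'PreserveCase']
```

The practical upshot matches the comment above: if the case of element names matters, the input probably needs to be handled as XML rather than HTML.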
@muzimuzhi ahh, thank you, that's good to know. Meanwhile I've moved on with yq, which can preserve case properly:
echo -e '<Should><PreserveCase>True</PreserveCase></Should>' | yq -px -ox .Should
Which produces:
<PreserveCase>True</PreserveCase>
Meanwhile I've moved on with yq, which can preserve case properly
Except that yq, which assumes the input is standard XML rather than HTML, doesn't properly retain the order of text inside each tag (and it doesn't necessarily output valid HTML):
htmlq -p a <<<'<a>Order <b>should</b> be <em>preserved</em></a>'
produces
<a>Order <b>should</b> be <em>preserved</em></a>
but
yq -px -ox '.a' <<<'<a>Order <b>should</b> be <em>preserved</em></a>'
produces
<+content>Order</+content>
<+content>be</+content>
<b>should</b>
<em>preserved</em>
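For input that happens to be well-formed XML, a tree-based XML parser can keep both the tag case and the order of mixed text and child elements. The following is a rough sketch with Python's `xml.etree.ElementTree`, offered as one possible workaround rather than a replacement for htmlq: it will raise a parse error on real-world HTML that is not well-formed XML, and it matches by exact element name only, not by CSS selector. The function name `select_xml` is hypothetical.

```python
import xml.etree.ElementTree as ET


def select_xml(markup, tag):
    """Return the serialized form of every element named `tag`.

    Assumes `markup` is well-formed XML; ET.fromstring raises
    xml.etree.ElementTree.ParseError otherwise.
    """
    root = ET.fromstring(markup)
    results = []
    for element in root.iter(tag):
        # ET.tostring also emits the text that follows the element (its
        # "tail"), so drop it temporarily to serialize the element alone.
        saved_tail, element.tail = element.tail, None
        results.append(ET.tostring(element, encoding="unicode"))
        element.tail = saved_tail
    return results


if __name__ == "__main__":
    mixed = "<a>Order <b>should</b> be <em>preserved</em></a>"
    print(select_xml(mixed, "a")[0])
    # <a>Order <b>should</b> be <em>preserved</em></a>

    cased = "<Need>\n <PreserveCase>True</PreserveCase>\n</Need>"
    print(select_xml(cased, "PreserveCase")[0])
    # <PreserveCase>True</PreserveCase>
```

ElementTree stores the text before the first child in `text` and the text after each child in that child's `tail`, which is why the interleaving survives the round trip.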