Case sensitiveness: htmlq not preserving case?

Open ryenus opened this issue 2 years ago • 3 comments

Somehow htmlq converts element (tag) names to lowercase:

Expected

$ echo -e '<Need>\n  <PreserveCase>True</PreserveCase>\n</Need>' | htmlq Need
<Need>
  <PreserveCase>True</PreserveCase>
</Need>

Actual

$ echo -e '<Need>\n  <PreserveCase>True</PreserveCase>\n</Need>' | htmlq Need
<need>
  <preservecase>True</preservecase>
</need>

Here the tag <Need> becomes <need> and <PreserveCase> becomes <preservecase>, which is not what I expected. Is it possible to preserve the exact case of tag names, even if only behind an option?

Thanks!

ryenus, Apr 15 '22 23:04

HTML tag names are case-insensitive; it's XML that treats them as case-sensitive.

A quick search suggests the case conversion may come from html5ever, in which case it can't be configured on htmlq's end. See https://github.com/servo/html5ever/search?q=lowercase.
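
To illustrate the HTML-versus-XML difference without htmlq itself, here is a minimal sketch using only Python's standard library. It is an analogy, not html5ever's code: `html.parser` happens to lowercase tag names the same way an HTML parser is allowed to, while `xml.etree.ElementTree` keeps them as written. The helper names (`TagCollector`, `html_tag_names`, `xml_tag_names`) are made up for this example.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Same input as in the original report.
SAMPLE = "<Need>\n  <PreserveCase>True</PreserveCase>\n</Need>"


class TagCollector(HTMLParser):
    """Records start-tag names exactly as the HTML parser reports them."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # html.parser hands over tag names already lowercased.
        self.tags.append(tag)


def html_tag_names(markup):
    """Tag names as seen by an HTML parser (case is not preserved)."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


def xml_tag_names(markup):
    """Tag names as seen by an XML parser (case is preserved)."""
    return [element.tag for element in ET.fromstring(markup).iter()]


if __name__ == "__main__":
    print(html_tag_names(SAMPLE))  # lowercased names
    print(xml_tag_names(SAMPLE))   # original case
```

If the input really is XML rather than HTML, this suggests reaching for an XML-aware tool instead of expecting an HTML parser to keep the case.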

muzimuzhi, Dec 01 '22 01:12

@muzimuzhi ahh, thank you, that's good to know. Meanwhile I've moved on with yq, which can preserve case properly:

echo -e '<Should><PreserveCase>True</PreserveCase></Should>' | yq -px -ox .Should

Which produces:

<PreserveCase>True</PreserveCase>

ryenus, Dec 01 '22 05:12

Meanwhile I've moved on with yq, which can preserve case properly

Except that yq, which assumes the input is standard XML rather than HTML, doesn't retain the order of text inside each tag (and it doesn't necessarily output valid HTML):

htmlq -p a <<<'<a>Order <b>should</b> be <em>preserved</em></a>'

produces

<a>Order <b>should</b> be <em>preserved</em></a>

but

yq -px -ox '.a' <<<'<a>Order <b>should</b> be <em>preserved</em></a>'

produces

<+content>Order</+content>
<+content>be</+content>
<b>should</b>
<em>preserved</em>
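
For input that is well-formed XML, one possible workaround that keeps both tag case and mixed-content order is a plain XML parser. The sketch below uses Python's standard `xml.etree.ElementTree`; the `roundtrip` helper is a hypothetical name for this example, and the approach assumes the input parses as XML, so it will raise on typical real-world HTML.

```python
import xml.etree.ElementTree as ET


def roundtrip(markup):
    """Parse markup as XML and serialize it back to a string.

    ElementTree stores the text between child elements in each child's
    `tail`, so the interleaving of text and elements is kept on output.
    Raises xml.etree.ElementTree.ParseError if markup is not well-formed XML.
    """
    root = ET.fromstring(markup)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(roundtrip("<a>Order <b>should</b> be <em>preserved</em></a>"))
    print(roundtrip("<Should><PreserveCase>True</PreserveCase></Should>"))
```

Both inputs come back unchanged, so neither the case nor the text order is lost, but this is only a stand-in for XML-shaped input and not a replacement for htmlq on actual HTML.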

baodrate, Mar 21 '23 21:03