Feature request: Option to strip/reduce all whitespace, not just in text.
It would be nice to be able to collapse each match into a single line, for further filtering with tools like grep.
For example, when matching table rows, each row often spans multiple lines because of how the HTML was formatted.
My current workaround is to minify the html before passing it to htmlq (cat myfile.html | sd '\n' ' ' | tr -s ' ' | htmlq ...), but a simple switch in htmlq would make this way easier.
Not sure how this would be handled in tags like pre, though...
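For reference, the minify workaround can be wrapped in a small shell function. This is only a sketch using plain tr; the name collapse_ws is made up for illustration and is not part of htmlq. It flattens whitespace everywhere, including inside pre, so it does not solve that case.

```shell
# Hypothetical helper (not an htmlq feature): collapse all whitespace
# on stdin into single spaces so the whole input ends up on one line.
collapse_ws() {
    # Turn newlines, tabs and carriage returns into spaces,
    # then squeeze runs of spaces down to a single space.
    tr '\n\t\r' '   ' | tr -s ' '
}

# Intended use, with the same elided htmlq arguments as above:
#   collapse_ws < myfile.html | htmlq ...
```

Note that the output has no trailing newline; the final newline of the input becomes a space like any other.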
@pbsds I guess you mean sed not sd?
Sorry, I'm so used to sd I didn't notice.
cat myfile.html | sed -ze 's/\n/ /g' | tr -s ' ' | htmlq ...
@pbsds Thanks, didn't know about this one, will add it to my toolbelt!
I just ended up using xargs for the whitespace, which seems to be easier for me:
cat myfile.html | htmlq ... | xargs | htmlq ...