htmlq

pretty print as the default

Open carnott-snap opened this issue 4 years ago • 3 comments

Many users coming from jq will expect htmlq to work as a formatter for HTML. Unfortunately, htmlq html does not pretty-print its output. Can we deprecate the -p flag, make pretty-printing the default, and add a -c flag that matches what jq users expect?
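
To make the request concrete, here is a minimal Python sketch of the flag layout being proposed. It is not htmlq's actual implementation: the program name htmlq-sketch, the long option names, and the helper wants_pretty are hypothetical and exist only for illustration. Pretty-printing is on unless -c is given, and -p is still accepted as a no-op so existing scripts keep working.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical CLI layout: pretty by default, -c for compact output."""
    parser = argparse.ArgumentParser(prog="htmlq-sketch")
    parser.add_argument("selector", nargs="*", default=["html"],
                        help="CSS selector(s) to match")
    # Mirrors jq's -c: opt out of the pretty-printed default.
    parser.add_argument("-c", "--compact-output", action="store_true",
                        help="compact instead of pretty-printed output")
    # Deprecated: kept only so that old invocations do not fail.
    parser.add_argument("-p", "--pretty", action="store_true",
                        help="deprecated; pretty-printing is already the default")
    return parser


def wants_pretty(argv: list[str]) -> bool:
    """Return True when the output should be pretty-printed."""
    args = build_parser().parse_args(argv)
    return not args.compact_output
```

With this layout, wants_pretty(["html"]) and wants_pretty(["-p", "html"]) both select pretty output, while wants_pretty(["-c", "html"]) selects compact output.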

In general, it would be nice to make the flags line up exactly with jq, where possible:

jq - commandline JSON processor [version 1.5-1-a5b5cbe]
Usage: jq [options] <jq filter> [file...]

        jq is a tool for processing JSON inputs, applying the
        given filter to its JSON text inputs and producing the
        filter's results as JSON on standard output.
        The simplest filter is ., which is the identity filter,
        copying jq's input to its output unmodified (except for
        formatting).
        For more advanced filters see the jq(1) manpage ("man jq")
        and/or https://stedolan.github.io/jq

        Some of the options include:
         -c             compact instead of pretty-printed output;
         -n             use `null` as the single input value;
         -e             set the exit status code based on the output;
         -s             read (slurp) all inputs into an array; apply filter to it;
         -r             output raw strings, not JSON texts;
         -R             read raw strings, not JSON texts;
         -C             colorize JSON;
         -M             monochrome (don't colorize JSON);
         -S             sort keys of objects on output;
         --tab          use tabs for indentation;
         --arg a v      set variable $a to value <v>;
         --argjson a v  set variable $a to JSON value <v>;
         --slurpfile a f  set variable $a to an array of JSON texts read from <f>;
        See the manpage for more options.
•   --version:
           Output the jq version and exit with zero.

•   --seq:
           Use the application/json-seq MIME type scheme for separating JSON texts in jq's input and output. This means that an ASCII RS (record separator) character is printed before each value on output and an ASCII LF (line feed) is printed after every output. Input JSON texts that fail to parse are ignored (but warned about), discarding all subsequent input until the next RS. This mode also parses the output of jq without the --seq option.

•   --stream:
           Parse the input in streaming fashion, outputting arrays of path and leaf values (scalars and empty arrays or empty objects). For example, "a" becomes [[],"a"], and [[],"a",["b"]] becomes [[0],[]], [[1],"a"], and [[2,0],"b"].
           This is useful for processing very large inputs. Use this in conjunction with filtering and the reduce and foreach syntax to reduce large inputs incrementally.

•   --slurp/-s:
           Instead of running the filter for each JSON object in the input, read the entire input stream into a large array and run the filter just once.

•   --raw-input/-R:
           Don't parse the input as JSON. Instead, each line of text is passed to the filter as a string. If combined with --slurp, then the entire input is passed to the filter as a single long string.

•   --null-input/-n:
           Don't read any input at all! Instead, the filter is run once using null as the input. This is useful when using jq as a simple calculator or to construct JSON data from scratch.

•   --compact-output / -c:
           By default, jq pretty-prints JSON output. Using this option will result in more compact output by instead putting each JSON object on a single line.

•   --tab:
           Use a tab for each indentation level instead of two spaces.

•   --indent n:
           Use the given number of spaces (no more than 8) for indentation.

•   --color-output / -C and --monochrome-output / -M:
           By default, jq outputs colored JSON if writing to a terminal. You can force it to produce color even if writing to a pipe or a file using -C, and disable color with -M.

•   --ascii-output / -a:
           jq usually outputs non-ASCII Unicode codepoints as UTF-8, even if the input specified them as escape sequences (like "\u03bc"). Using this option, you can force jq to produce pure ASCII output with every non-ASCII character replaced with the equivalent escape sequence.

•   --unbuffered
           Flush the output after each JSON object is printed (useful if you're piping a slow data source into jq and piping jq's output elsewhere).

•   --sort-keys / -S:
           Output the fields of each object with the keys in sorted order.

•   --raw-output / -r:
           With this option, if the filter's result is a string then it will be written directly to standard output rather than being formatted as a JSON string with quotes. This can be useful for making jq filters talk to non-JSON-based systems.

•   --join-output / -j:
           Like -r but jq won't print a newline after each output.

•   -f filename / --from-file filename:
           Read filter from the file rather than from a command line, like awk's -f option. You can also use '#' to make comments.

•   -Ldirectory / -L directory:
           Prepend directory to the search list for modules. If this option is used then no builtin search list is used. See the section on modules below.

•   -e / --exit-status:
           Sets the exit status of jq to 0 if the last output value was neither false nor null, 1 if the last output value was either false or null, or 4 if no valid result was ever produced. Normally jq exits with 2 if there was any usage problem or system error, 3 if there was a jq program compile error, or 0 if the jq program ran.

•   --arg name value:
           This option passes a value to the jq program as a predefined variable. If you run jq with --arg foo bar, then $foo is available in the program and has the value "bar". Note that value will be treated as a string, so --arg foo 123 will bind $foo to "123".

•   --argjson name JSON-text:
           This option passes a JSON-encoded value to the jq program as a predefined variable. If you run jq with --argjson foo 123, then $foo is available in the program and has the value 123.

•   --slurpfile variable-name filename:
           This option reads all the JSON texts in the named file and binds an array of the parsed JSON values to the given global variable. If you run jq with --slurpfile foo bar, then $foo is available in the program and has an array whose elements correspond to the texts in the file named bar.

•   --argfile variable-name filename:
           Do not use. Use --slurpfile instead.
           (This option is like --slurpfile, but when the file has just one text, then that is used, else an array of texts is used as in --slurpfile.)

•   --run-tests [filename]:
           Runs the tests in the given file or standard input. This must be the last option given and does not honor all preceding options. The input consists of comment lines, empty lines, and program lines followed by one input line, as many lines of output as are expected (one per output), and a terminating empty line. Compilation failure tests start with a line containing only "%%FAIL", then a line containing the program to compile, then a line containing an error message to compare to the actual.
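
For readers less familiar with jq, a few of the options quoted above can be modelled in a handful of lines of Python. This is only a sketch of the documented behaviour, not jq's implementation: the function names render, bind_arg, bind_argjson, exit_status, and stream_leaves are made up for illustration, and stream_leaves emits only the path/leaf pairs described for --stream, not any other events jq may print.

```python
import json


def render(value, compact=False, indent=2):
    """Pretty-print by default; compact=True models the idea behind -c."""
    if compact:
        # One value per line, no extra whitespace.
        return json.dumps(value, separators=(",", ":"), ensure_ascii=False)
    # --indent n / the two-space default; non-ASCII is kept as UTF-8.
    return json.dumps(value, indent=indent, ensure_ascii=False)


def bind_arg(value: str) -> str:
    """--arg name value: the value is always bound as a string."""
    return value


def bind_argjson(text: str):
    """--argjson name JSON-text: the text is parsed as JSON first."""
    return json.loads(text)


def exit_status(outputs: list) -> int:
    """-e: 0 unless the last output is false/null (1) or nothing was produced (4)."""
    if not outputs:
        return 4
    last = outputs[-1]
    # Identity checks, so that 0 and "" are not mistaken for false/null.
    return 1 if last is None or last is False else 0


def stream_leaves(value, path=()):
    """Yield [path, leaf] pairs for scalars and empty arrays/objects."""
    if isinstance(value, dict) and value:
        for key, child in value.items():
            yield from stream_leaves(child, path + (key,))
    elif isinstance(value, list) and value:
        for index, child in enumerate(value):
            yield from stream_leaves(child, path + (index,))
    else:
        yield [list(path), value]
```

In this model, bind_arg("123") stays the string "123" while bind_argjson("123") becomes the number 123, which is the distinction the --arg and --argjson entries draw.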

carnott-snap • Apr 10 '20 00:04

I think this would be nice but the pretty print function I've been using is a bit of a work in progress and I'm not sure it improves things in most cases. I want to make the test suite better and I'll revisit this afterwards.

mgdm • Sep 07 '21 18:09

I'm a frequent jq user, but not a skilled one. Most of the time I only use it for formatting. I know jq has powerful querying capabilities, but the formatting capability means that I can usually just use grep to meet any search/filter needs I have. That's usually enough. When necessary, very little is beyond a pipeline with multiple greps.

So 5 minutes ago I wanted to apply the same technique to HTML, Googled "like jq but for html", and here we are.

curtcox • Nov 07 '21 03:11

The tidy command from htacg/tidy-html5 is another option:

    curl -s https://www.w3.org/Provider/Style/URI \
      | htmlq blockquote:first-of-type \
      | tidy -q -f /dev/null -w 78 -i

Not saying it's as terse as what you were hoping for by making pretty-printing default for htmlq, and it does take a bit of doing to suppress its output footer and warnings about non-conformant HTML, but piping through tidy does fall in line with the "one thing well" philosophy.

ernstki • Nov 11 '21 02:11
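
As a rough illustration of what "pretty print" means for HTML in this thread, the following Python sketch re-indents markup using only the standard library's html.parser. It is neither htmlq's pretty printer nor a replacement for tidy: the names Indenter and pretty_html are hypothetical, and the sketch deliberately ignores whitespace-sensitive content such as pre, script, and style, where adding line breaks can change meaning.

```python
from html import escape
from html.parser import HTMLParser

# Elements that never take a closing tag, so they must not increase depth.
VOID = {"area", "base", "br", "col", "embed", "hr", "img",
        "input", "link", "meta", "source", "track", "wbr"}


class Indenter(HTMLParser):
    """Collects one output line per tag or text run, indented by nesting depth."""

    def __init__(self, indent="  "):
        super().__init__()
        self.indent = indent
        self.depth = 0
        self.lines = []

    def _emit(self, text):
        self.lines.append(self.indent * self.depth + text)

    def handle_starttag(self, tag, attrs):
        self._emit(self.get_starttag_text())  # keep attributes as written
        if tag not in VOID:
            self.depth += 1

    def handle_startendtag(self, tag, attrs):
        self._emit(self.get_starttag_text())  # self-closing form, e.g. <br/>

    def handle_endtag(self, tag):
        if tag in VOID:
            return
        self.depth = max(self.depth - 1, 0)  # tolerate stray closing tags
        self._emit(f"</{tag}>")

    def handle_data(self, data):
        text = " ".join(data.split())  # collapse runs of whitespace
        if text:
            self._emit(escape(text, quote=False))

    def handle_comment(self, data):
        self._emit(f"<!--{data}-->")

    def handle_decl(self, decl):
        self._emit(f"<!{decl}>")


def pretty_html(markup, indent="  "):
    parser = Indenter(indent)
    parser.feed(markup)
    parser.close()  # flush any buffered trailing text
    return "\n".join(parser.lines)
```

Calling pretty_html("<ul><li>a</li></ul>") puts each tag and text run on its own line, which is the property that makes grep-style filtering of the output practical.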