autoformatting of xml (or at least ptx = pretext) files is horrible

Open: williamstein opened this issue 2 years ago • 1 comment

Similar to https://github.com/sagemathinc/cocalc/issues/5472, I just tried out the autoformatter on a pretext (xml) source document, and it was terrifying.

  1. Started with this file: https://github.com/PreTeXtBook/pretext/blob/dev/examples/minimal/source/main.ptx
  2. Ran the formatter
  3. Yikes.

All vertical whitespace is gone, which is horrible. Also, even worse, there's a block like this in the input

             <sage>
                <input>
                A = matrix(4,5, srange(20))
                A.rref()
                </input>
                <output>
                [ 1  0 -1 -2 -3]
                [ 0  1  2  3  4]
                [ 0  0  0  0  0]
                [ 0  0  0  0  0]
                </output>
            </sage>

and it is turned into this:

      <sage>
        <input>A = matrix(4,5, srange(20)) A.rref()</input>
        <output>[ 1 0 -1 -2 -3] [ 0 1 2 3 4] [ 0 0 0 0 0] [ 0 0 0 0 0]</output>
      </sage>

The problem is that the input A = matrix(4,5, srange(20)) A.rref() to Sage (or Python) is a syntax error.
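To make the failure concrete, here is a minimal sketch in plain Python. It assumes the formatter's reflow amounts to collapsing every run of whitespace into a single space, which matches the output shown above but is not confirmed to be what tidy does internally. The helper name `parses` is made up for illustration. Only parsing is checked, so Sage does not need to be installed.

```python
import ast
import textwrap

# Text content of the <input> element, as it appears in the source file.
original = """
                A = matrix(4,5, srange(20))
                A.rref()
"""

# Assumption: the reflow collapses all whitespace runs into single spaces.
reflowed = " ".join(original.split())


def parses(source: str) -> bool:
    """Return True if `source` is syntactically valid Python."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False


print(reflowed)                           # A = matrix(4,5, srange(20)) A.rref()
print(parses(textwrap.dedent(original)))  # True
print(parses(reflowed))                   # False
```

The two statements only stay valid as long as the newline between them survives, which is exactly what the reflow destroys.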

I think tidy is being used (just like for #5472) which is causing this. I don't know if the parameters of tidy can be changed to not reflow content of leaf nodes.

There is a prettier plugin for xml now at https://github.com/prettier/plugin-xml and with the default options it does NOT mangle the above. It does have a non-default option --xml-whitespace-sensitivity=ignore that does mangle the above, and maybe the tidy config does the same.
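Whichever formatter ends up being used, a small regression check could catch this kind of mangling. The following is only a sketch using Python's standard library: the function `leaf_lines` and the tag list are hypothetical, and it assumes that changing the common indentation of a leaf node is acceptable while joining or dropping its lines is not.

```python
import textwrap
import xml.etree.ElementTree as ET

# Hypothetical list of tags whose text must keep its line structure.
WHITESPACE_SENSITIVE_TAGS = ("input", "output")


def leaf_lines(xml_source, tags=WHITESPACE_SENSITIVE_TAGS):
    """Return [(tag, [lines])] for each whitespace-sensitive element.

    Common leading indentation is removed, so pure re-indentation by a
    formatter does not count as a change; joined lines do.
    """
    root = ET.fromstring(xml_source)
    result = []
    for element in root.iter():
        if element.tag in tags:
            text = textwrap.dedent(element.text or "").strip("\n")
            lines = [line.rstrip() for line in text.splitlines()]
            result.append((element.tag, lines))
    return result


before = """<sage>
    <input>
    A = matrix(4,5, srange(20))
    A.rref()
    </input>
</sage>"""

# Acceptable formatter output: only the indentation changed.
reindented = """<sage>
  <input>
  A = matrix(4,5, srange(20))
  A.rref()
  </input>
</sage>"""

# Unacceptable formatter output: the two statements were joined.
mangled = """<sage>
  <input>A = matrix(4,5, srange(20)) A.rref()</input>
</sage>"""

print(leaf_lines(before) == leaf_lines(reindented))  # True
print(leaf_lines(before) != leaf_lines(mangled))     # True
```

Comparing dedented lines rather than raw text is a design choice: it lets the formatter re-indent freely while still flagging any reflow inside `<input>` or `<output>`.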

In any case, I think we should just switch to using prettier with this xml plugin. Then we can have no dependence on tidy.

williamstein, May 27 '22 17:05

@davidfarmer goes one better, with a PreTeXt specific tool. Not general-purpose, so you won't want it.

https://github.com/davidfarmer/LaTeXtoLaTeX

Our output HTML is really bad, the XSL serializer has a mind of its own.

rbeezer, May 27 '22 19:05