math icon indicating copy to clipboard operation
math copied to clipboard

Initial commit of pretty-formatter.

Open jzmaddock opened this issue 5 years ago • 36 comments

This still needs lots of work, but is starting to look useful, initial commit posted here for discussion.

jzmaddock avatar Nov 06 '19 19:11 jzmaddock

Very cool; just went through the docs and got rid of a couple trivial typos.

I'm not sure that this is high-quality feedback, because I'm so out of touch with text streaming, but looking through the API synopsis didn't give me much info on how to use the class. I'm wondering if a super basic a MWE front and center in the docs would be useful? (This is somewhat taken care of by the examples at the end.)

(Also, forgot to add [CI SKIP]; that build can be cancelled.)

Also, any plans for unicode subscripts and superscripts? Might be required for pretty-printing polynomials. Also it would make the text_output very useful for debugging: imagine the Chebyshev transform ostream operator spitting out a reasonably formatted Chebyshev series with unicode subscripts and superscripts. Other overloads for the ostream operator that could be really useful are the barycentric_rational interpolator, the cardinal_cubic_b_spline interpolator, the trigonometric interpolator, and the polynomials.

Got perhaps a good warning from g++-9:

$ make example/formatter_text_output.x
../../boost/math/tools/formatting.hpp:134:91: warning: 'strlen' argument missing terminating nul [-Wstringop-overflow=]
  134 |                typename std::basic_string<charT, Traits>::size_type pos = s.find_first_of(exponent_string);
      |                                                                                           ^~~~~~~~~~~~~~~
../../boost/math/tools/formatting.hpp:128:35: note: referenced argument declared here
  128 |                static const charT exponent_string[2] = { 'e', 'E' };
      |                                   ^~~~~~~~~~~~~~~

Next, ran it with -fsanitize=address -fsanitize=undefined:

../../boost/math/tools/formatting.hpp:160:23: runtime error: load of value 1651076143, which is not a valid value for type 'imaginary_i_t'

NAThompson avatar Nov 10 '19 16:11 NAThompson

Very cool; just went through the docs and got rid of a couple trivial typos.

Thanks!

I'm not sure that this is high-quality feedback, because I'm so out of touch with text streaming, but looking through the API synopsis didn't give me much info on how to use the class. I'm wondering if a super basic a MWE front and center in the docs would be useful? (This is somewhat taken care of by the examples at the end.)

Yes for sure, I just haven't got there yet. But basically, almost whatever you can do to an ostream, you can do to a pretty printer instance.

Also, any plans for unicode subscripts and superscripts? Might be required for pretty-printing polynomials. Also it would make the text_output very useful for debugging: imagine the Chebyshev transform ostream operator spitting out a reasonably formatted Chebyshev series with unicode subscripts and superscripts. Other overloads for the ostream operator that could be really useful are the barycentric_rational interpolator, the cardinal_cubic_b_spline interpolator, the trigonometric interpolator, and the polynomials.

I confess I hadn't realised Unicode could do that - I need to look into that, and yes spit out UTF8.

Got perhaps a good warning from g++-9

Yup, a bug, will fix.

Working on polynomials now BTW: I suspect as I go along many internals will get re-written somewhat. At the moment this is more of a sketch of an idea than fully fledged, but I think it's getting there...

jzmaddock avatar Nov 10 '19 17:11 jzmaddock

Latest version now supports polynomial types plus Unicode plain text output (though still some things to double check).

Chebyshev transform I can see how to support, but the barycentric_rational, cardinal_cubic_b_spline and trigonometric interpolators I don't see what should be printed?

jzmaddock avatar Nov 13 '19 18:11 jzmaddock

Latest version now supports polynomial types plus Unicode plain text output (though still some things to double check).

Just ran it on my machine; looks great! BTW, the docs don't show how to initialize the polynomial (example/formatting_snips.cpp excises the initialization); was this intentional?

Barycentric rational might be harder than I had thought, the sum over a_k/(x-x_k) really looks quite challenging without LaTeX support.

The trigonometric interpolator might be able to work just like the polynomial; just take z = e^{i\theta} and don't bother to expand it into sines and cosines. That's the easy way, since the internal rep is a complex polynomial. Maybe it'd be useful to spit out sines and cosines as well.

Cubic B-spline: 3.4B₃(0.01(x-1)) + 8.7B₃(0.01(x-2)) +

I'll try to take a look at these.

NAThompson avatar Nov 14 '19 16:11 NAThompson

BTW, the unicode superscripts for n>=10 look a little bizarre on my console, but perfectly fine on Firefox:

 ./main.x
(2.25 - 3.5i) - (12.5 + 4.5i)x + 23.34x² + 34.5ix³ + (2.25 - 3.5i)x⁴ - (12.5 + 4.5i)x⁵ + 23.34x⁶ + 34.5ix⁶ + (2.25 - 3.5i)x⁸ - (12.5 + 4.5i)x⁹ + 23.34x¹⁰ + 34.5ix¹¹ + (2.25 - 3.5i)x¹² - (12.5 + 4.5i)x¹³ + 23.34x¹⁴ + 34.5ix¹⁵ + (2.25 - 3.5i)x¹⁶ - (12.5 + 4.5i)x¹⁶ + 23.34x¹⁸ + 34.5ix¹⁹ + (2.25 - 3.5i)x²⁰ - (12.5 + 4.5i)x²¹ + 23.34x²² + 34.5ix²³

How do they look on windows terminal?

NAThompson avatar Nov 14 '19 17:11 NAThompson

Every app I have tried has worked except Notepad++ where I suspect that a font change will make the superscripts about 3 work rather than a box. All browsers, edge Chrome Firefox, Opera, Word, Excel, Outllok Onenote, and Textpad all work fine.

So - very pretty!

pabristow avatar Nov 15 '19 09:11 pabristow

On Linux I see the superscripts are rather spaced out, likewise on Cygwin:

cygwin

I believe this is an artefact of the console using a fixed width font, so although the number are shrunk down in size, their spacing remains the same. Firefox is likely using a regular variable width font and putting the superscripted numbers closers together.

MSVC console is a whole other beast, and doesn't really support Unicode out (at least not be default).

jzmaddock avatar Nov 16 '19 18:11 jzmaddock

Yup you're right: It's a monospaced font problem.

Quick question: The docs show a zero before the 6 on the polynomial formatting, namely 10^-06 rather than 10^-6, is this intended?

polynomial_format

I'm also getting a bizarre tilde spat out after each infinity on zsh:

bizarre_tilde

(I think this might be a bug in the console; when I highlighted it to copy-past, it disappeared. I think we might discover a lot of unicode-related bugs with this.)

NAThompson avatar Nov 16 '19 18:11 NAThompson

Quick question: The docs show a zero before the 6 on the polynomial formatting, namely 10^-06 rather than 10^-6, is this intended?

Sort of: the intention is to not mess with the underlying iostreams formatting, I could strip leading zeros from the exponent if it's going to look better (I suspect it will).

I'm also getting a bizarre tilde spat out after each infinity on zsh:

It's supposed to be a complex-infinity: the infinity symbol with a tilde above, so it uses the combining mark to produce that. Cygwin mucks it up s well:

cygwin2

As do other text editors, so maybe this wasn't such a hot idea :(

HTML renders the same Unicode sequence correctly though:

html

jzmaddock avatar Nov 16 '19 19:11 jzmaddock

How about \mathbb{C}(\infty) for complex infinity? There is also a unicode blackboard bold. (Is there any end to the scope of unicode?)

NAThompson avatar Nov 16 '19 20:11 NAThompson

Similarly mingw and git bash (MS DOSbox). I didn't try this assuming (correctly!) it would not work.

This would work OK if the Unicode page provided a full set of superscript numbers 0 to 9. 0 to 3 are done differently, so on all these systems, the numers 4 to 9 don't display - you get a box or something instead.

" The most common superscript digits (1, 2, and 3) were in ISO-8859-1 and were therefore carried over into those positions in the Latin-1 range of Unicode. " Tough!

PS monospacing means the display is as expected - less pretty.

pabristow avatar Nov 17 '19 09:11 pabristow

How about \mathbb{C}(\infty) for complex infinity?

That looks dangerously close to the Riemann sphere as in https://en.wikipedia.org/wiki/Riemann_sphere. The same text just uses \infty for complex-infinity, so maybe this should do the same for plain text output? Putting a tilde on top is a Mathematica-ism BTW, I haven't been able to find any "official" guidance on how complex infinity should be typeset.

Similarly mingw and git bash (MS DOSbox).

There is supposed to be a way to make this work on a Windows console - cygwin manages it after all, it's "just" a question of getting the right code-page loaded.

jzmaddock avatar Nov 17 '19 16:11 jzmaddock

That looks dangerously close to the Riemann sphere

Yup, that's the goal! I remember my complex analysis class demonstrating the automorphism between the sphere and the plane; the takeaway being "there is only one complex infinity".

The same text just uses \infty for complex-infinity, so maybe this should do the same for plain text output?

I think that is a perfectly sensible option. I actually like Mathematica's notation here; sadly non-LaTeX mathematical text rendering is still in the stone-age.

In any case, the more I think about it, the more I feel like pretty-printing is a hugely important task for the library. For instance, reading through the docs to find out what scaling convention is used for some object is actually pretty challenging; way easier just to print it.

NAThompson avatar Nov 17 '19 17:11 NAThompson

@pabristow : Printing to the windows console works (mostly) fine if you add:

SetConsoleOutputCP(CP_UTF8);

At the start of main. There are a couple of (seldom used) symbols that seem to be unsupported - maybe this is fixable by changing the console font (or not), but all the defaults work fine including all the super/sub scripts.

jzmaddock avatar Nov 17 '19 18:11 jzmaddock

I've managed to get a really bush-league implementation of the pretty-printer working for the cardinal_cubic_b_spline, but I can't quite see how you're avoiding adding code directly into the classes and instead only using the tools/formatting.hpp file; I needed a friend declaration to access the private members.

In any case, here's the code:

boost/math/interpolators/detail/cardinal_cubic_b_spline_detail.hpp:

    friend std::ostream& operator<<(std::ostream& os, const cardinal_cubic_b_spline_imp<Real>& spline) {
        os << spline.m_avg << "+";
        for (int64_t k = 0; k < int64_t(spline.m_beta.size()); ++k) {
            os << spline.m_beta[k] << "×B₃(" << spline.m_h_inv << "(x-" << spline.m_a << ") + " << 1 - k << ") + ";
        }
        return os;
    }

(Obviously haven't implemented the lookahead to see if I should put a + or - in front of each summand.)

boost/math/interpolators/cardinal_cubic_b_spline.hpp:

    friend std::ostream& operator<<(std::ostream& os, const cardinal_cubic_b_spline<Real>& spline) {
        os << *spline.m_imp;
        return os;
    }

Currently looks like this:

32+-33×B₃(10(x-5) + 1) + -27×B₃(10(x-5) + 0) + -21×B₃(10(x-5) + -1) + -15×B₃(10(x-5) + -2) + -9×B₃(10(x-5) + -3) + -3×B₃(10(x-5) + -4) + 3×B₃(10(x-5) + -5) + 9×B₃(10(x-5) + -6) + 15×B₃(10(x-5) + -7) + 21×B₃(10(x-5) + -8) + 27×B₃(10(x-5) + -9) + 33×B₃(10(x-5) + -10) + %  

but I hope to get it to:

32 - 33×B₃(10(x-5) + 1) - 27×B₃(10(x-5)) - 21×B₃(10(x-5) - 1) - 15×B₃(10(x-5) - 2) - 9×B₃(10(x-5) - 3) - 3×B₃(10(x-5) - 4) + 3×B₃(10(x-5) - 5) + 9×B₃(10(x-5) - 6) + 15×B₃(10(x-5) - 7) + 21×B₃(10(x-5) - 8) + 27×B₃(10(x-5) - 9) + 33×B₃(10(x-5) - 10)

Test program:

main.cpp:

#include <iostream>
#include <vector>
#include <boost/math/interpolators/cardinal_cubic_b_spline.hpp>

int main() {
    std::vector<double> v(10);
    for (size_t i = 0; i < v.size(); ++i)
    {
        v[i] = 5 + 6*i;
    }

    double step = 0.1;
    double a = 5;
    boost::math::interpolators::cardinal_cubic_b_spline<double> spline(v.data(), v.size(), a, step);
    std::cout << spline;
}

NAThompson avatar Nov 17 '19 18:11 NAThompson

I've managed to get a really bush-league implementation of the pretty-printer working for the cardinal_cubic_b_spline, but I can't quite see how you're avoiding adding code directly into the classes and instead only using the tools/formatting.hpp file.

The issue here is that formatter.hpp relies on public API's of the types being formatted, but the interpolators don't expose their internals. Conversely, the formatters have some very ad-hoc code in them - which is fine if you just want to format the basic number types - but types which are really general purpose expressions then get a lot more difficult.

So I guess the question is whether the formatters should be re-worked into something more like an expression-printer complete with symbolic representations of arbitrary expressions?

It's certainly do-able, probably very useful and logical, but I confess way more work than I thought I was taking on ;)

jzmaddock avatar Nov 17 '19 19:11 jzmaddock

So I guess the question is whether the formatters should be re-worked into something more like an expression-printer complete with symbolic representations of arbitrary expressions?

Would it be less work to just do a collection of one-offs for each interpolator/object we want to print? Obviously it's not ideal but nonetheless it could still work if we all decided to build this expertise together.

NAThompson avatar Nov 17 '19 19:11 NAThompson

Would it be less work to just do a collection of one-offs for each interpolator/object we want to print?

I think so.... what if I do a bit of refactoring and expose a public API for printing various symbols plus super/subscripts etc? Then outputting a "special" type would just be a sequence of function calls to format the various components?

jzmaddock avatar Nov 18 '19 08:11 jzmaddock

I also note that the code page can be set in the registry (65001 for UTF-8) but I haven't dared try this yet. Computer\HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Nls\CodePage

May have other side effects?

pabristow avatar Nov 18 '19 10:11 pabristow

what if I do a bit of refactoring and expose a public API for printing various symbols plus super/subscripts etc?

I like this idea; and will test it out with the B-spline once it's ready.

NAThompson avatar Nov 18 '19 13:11 NAThompson

OK docs for the formatting API were done in a tad of a hurry, but the basics are all there I hope.

Implemented cubic B spline printing using the API (but feel free to change).

Plain text output:

32 - 33×B₃(10(x - 5) + 1) - 27×B₃(10(x - 5)) - 21×B₃(10(x - 5) - 1) - 15×B₃(10(x - 5) - 2) - 9×B₃(10(x - 5) - 3) - 3×B₃(10(x - 5) - 4) + 3×B₃(10(x - 5) - 5) + 9×B₃(10(x - 5) - 6) + 15×B₃(10(x - 5) - 7) + 21×B₃(10(x - 5) - 8) + 27×B₃(10(x - 5) - 9) + 33×B₃(10(x - 5) - 10)

HTML output:

32 - 33×B3(10(<I>x</I> - 5) + 1) - 27×B3(10(<I>x</I> - 5)) - 21×B3(10(<I>x</I> - 5) - 1) - 15×B3(10(<I>x</I> - 5) - 2) - 9×B3(10(<I>x</I> - 5) - 3) - 3×B3(10(<I>x</I> - 5) - 4) + 3×B3(10(<I>x</I> - 5) - 5) + 9×B3(10(<I>x</I> - 5) - 6) + 15×B3(10(<I>x</I> - 5) - 7) + 21×B3(10(<I>x</I> - 5) - 8) + 27×B3(10(<I>x</I> - 5) - 9) + 33×B3(10(<I>x</I> - 5) - 10)

jzmaddock avatar Nov 23 '19 19:11 jzmaddock

Looks awesome! One quick comment with the B-spline: Do you think it would like better with your 'cdot', namely:

32 - 33·B₃(10(x - 5) + 1) - 27·B₃(10(x - 5)) - 21·B₃(10(x - 5) - 1) - 15·B₃(10(x - 5) - 2) - 9·B₃(10(x - 5) - 3) ...

Also, just ran it on Mac, read through the docs, built the examples with sanitizers; no problems on my end.

Also, not sure if this is something to worry about, but when compiling with sanitizers it seems to take quite a while:

$ time make DEBUG=1 example/formatter_text_output.x
g++-9 -g --std=c++17 -ffast-math -march=native -Wfatal-errors -MMD -fvisibility=hidden -O2 -fsanitize=address -fsanitize=undefined -I../../ -I/usr/local/include -o example/formatter_text_output.x example/formatter_text_output.cpp -pthread -L/usr/local/lib
52.41s user 2.24s system 77% cpu 1:10.83 total

Without the sanitizers it take ~11 seconds, which seems fine.

NAThompson avatar Nov 23 '19 23:11 NAThompson

One quick comment with the B-spline: Do you think it would like better with your 'cdot', namely

Perhaps... but we need to pick one as the default, after that the user can choose which they prefer by using the manipulators?

Also, not sure if this is something to worry about, but when compiling with sanitizers it seems to take quite a while

Yikes, that is a long time! The file does instantiate a heck of a lot of stuff though, including some multiprecision types. However if I restrict it to integer and floating point types, and remove the multiprecision stuff, the compile time drops to ~ 1s for me even with the sanitizers on.

jzmaddock avatar Nov 24 '19 08:11 jzmaddock

The long compile time is a gcc artifact, I see:

time g++-9 -g --std=c++17  -Wfatal-errors -fvisibility=hidden -O2 -fsanitize=address -fsanitize=undefined -I../../..  formatter_text_output.cpp -pthread
formatter_text_output.cpp: In function ‘void print(std::basic_ostream<charT>&) [with charT = char]’:
formatter_text_output.cpp:14:6: note: variable tracking size limit exceeded with ‘-fvar-tracking-assignments’, retrying without
   14 | void print(std::basic_ostream<charT>& os)
      |      ^~~~~

real	0m40.283s
user	0m39.380s
sys	0m0.794s

But if I remove the -O2 -ffast-math I get:

time g++-9 -g --std=c++17  -Wfatal-errors -fvisibility=hidden  -fsanitize=address -fsanitize=undefined -I../../..  formatter_text_output.cpp -pthread

real	0m6.639s
user	0m6.307s
sys	0m0.310s

Things are similarly improved if I remove the -g but keep the optimizations, or indeed if I just use clang.

jzmaddock avatar Nov 24 '19 09:11 jzmaddock

Perhaps... but we need to pick one as the default, after that the user can choose which they prefer by using the manipulators?

Yeah, just looking at it again makes me realize that this is less of an unambiguous win than I thought it was. Sorry for the bikeshedding; keep the eye on the prize: This allows users to easily verify that their mental representation of their object is the same as ours.

Things are similarly improved if I remove the -g but keep the optimizations, or indeed if I just use clang.

Yeah, gcc's sanitizer suite is not quite as good as clangs. Dunno if it's worth a bugzilla or not . . .

NAThompson avatar Nov 24 '19 15:11 NAThompson

Ugh looks like you’ve probably uncovered a bug in the quadratic and quintic B spline interpolators, it’s spitting out nans

NAThompson avatar Nov 30 '19 22:11 NAThompson

Ugh looks like you’ve probably uncovered a bug in the quadratic and quintic B spline interpolators, it’s spitting out nans

The (very basic) example is OK for me, or is this with something else?

BTW can you check that I've documented the splines correctly - particularly how many points they include when interpolating? Thanks!

jzmaddock avatar Dec 01 '19 11:12 jzmaddock

Here's what I get on Mac OS, zsh:

➜  math git:(formatting) ✗ make example/formatter_b_spline_output.x
g++-9 -g --std=c++17 -ffast-math -march=native -Wfatal-errors -MMD -fvisibility=hidden -O3 -I../../ -I/usr/local/include -o example/formatter_b_spline_output.x example/formatter_b_spline_output.cpp -pthread  -L/usr/local/lib
➜  math git:(formatting) ✗ ./example/formatter_b_spline_output.x
Quadratic B Spline:

nan×B₂(10(x - 5) + 1) + nan×B₂(10(x - 5)) + nan×B₂(10(x - 5) - 1) + nan×B₂(10(x - 5) - 2) + nan×B₂(10(x - 5) - 3) + nan×B₂(10(x - 5) - 4) + nan×B₂(10(x - 5) - 5) + nan×B₂(10(x - 5) - 6) + nan×B₂(10(x - 5) - 7) + nan×B₂(10(x - 5) - 8) + nan×B₂(10(x - 5) - 9) + nan×B₂(10(x - 5) - 10)

$nan\times\mathrm{B}_{2}(10(x - 5) + 1) + nan\times\mathrm{B}_{2}(10(x - 5)) + nan\times\mathrm{B}_{2}(10(x - 5) - 1) + nan\times\mathrm{B}_{2}(10(x - 5) - 2) + nan\times\mathrm{B}_{2}(10(x - 5) - 3) + nan\times\mathrm{B}_{2}(10(x - 5) - 4) + nan\times\mathrm{B}_{2}(10(x - 5) - 5) + nan\times\mathrm{B}_{2}(10(x - 5) - 6) + nan\times\mathrm{B}_{2}(10(x - 5) - 7) + nan\times\mathrm{B}_{2}(10(x - 5) - 8) + nan\times\mathrm{B}_{2}(10(x - 5) - 9) + nan\times\mathrm{B}_{2}(10(x - 5) - 10)$

<span class="number"><span class="float">nan</span></span>&#xd7;B<sub>2</sub>(<span class="number"><span class="float">10</span></span>(<I>x</I> - <span class="number"><span class="float">5</span></span>) + <span class="number"><span class="integer">1</span></span>) + <span class="number"><span class="float">nan</span></span>&#xd7;B<sub>2</sub>(<span class="number"><span class="float">10</span></span>(<I>x</I> - <span class="number"><span class="float">5</span></span>)) + <span class="number"><span class="float">nan</span></span>&#xd7;B<sub>2</sub>(<span class="number"><span class="float">10</span></span>(<I>x</I> - <span class="number"><span class="float">5</span></span>) - <span c ...

Same thing with the quintic B-spline.

Update: I figured this one out, my fault: It's the -ffast-math; the B-splines use nan to indicate that the user wants to estimate the derivatives. So I added -ffast-math -fno-finite-math-only and it was fine.

NAThompson avatar Dec 01 '19 14:12 NAThompson

Your comment is:

Formatted output of this type is primarily for debugging purposes since only 4 consecutive values of the sum are used for any given abscissa value, where as the printed form shows all the value of the sum.

I would state it differently: "Despite appearances, only a few terms of the sum are nonzero, since the basis function B_n has compact support."

The actual number of non-zero terms for a given abscissa is fairly complicated to derive.

NAThompson avatar Dec 01 '19 14:12 NAThompson

Hmm, on cygwin I see:

-B₂(10(x - 5) + 1) + 5×B₂(10(x - 5)) + 11×B₂(10(x - 5) - 1) + 17×B₂(10(x - 5) - 2) + 23×B₂(10(x - 5) - 3) + 29×B₂(10(x - 5) - 4) + 35×B₂(10(x - 5) - 5) + 41×B₂(10(x - 5) - 6) + 47×B₂(10(x - 5) - 7) + 53×B₂(10(x - 5) - 8) + 59×B₂(10(x - 5) - 9) + 65×B₂(10(x - 5) - 10)

For the quartic, and for the quintic:

-7×B₅(10(x - 5) + 2) - B₅(10(x - 5) + 3) + 5×B₅(10(x - 5) + 4) + 11×B₅(10(x - 5) + 5) + 17×B₅(10(x - 5) + 6) + 23×B₅(10(x - 5) + 7) + 29×B₅(10(x - 5) + 8) + 35×B₅(10(x - 5) + 9) + 41×B₅(10(x - 5) + 10) + 47×B₅(10(x - 5) + 11) + 53×B₅(10(x - 5) + 12) + 59×B₅(10(x - 5) + 13) + 65×B₅(10(x - 5) + 14) + 71×B₅(10(x - 5) + 15)

jzmaddock avatar Dec 01 '19 17:12 jzmaddock