
How to best expose Arb numbers in C++

certik opened this issue 9 years ago • 19 comments

We want to have good support for Arb in CSymPy: essentially, allow users to evaluate expressions with Arb and get the result to a given accuracy.

What is the best way to expose Arb numbers in C++? Should we create some lightweight class, similar to mpz_class in GMP (and that should be part of Arb itself, just like mpz_class is part of GMP itself)? Or should we create a more heavyweight class like ArbFloat, which would internally hold a pointer to the Arb C data structure and allow some operations on it (this class would only be part of CSymPy)?

certik avatar Jul 31 '14 19:07 certik

@fredrik-johansson wrote:

How do you represent rational numbers? You can convert an fmpq to a GMP mpq with fmpq_get_mpq.

We use mpq_class from GMP, so this should work.

certik avatar Jul 31 '14 19:07 certik
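For reference, a minimal sketch of what such a conversion could look like, assuming flint's fmpq_get_mpq and GMP's mpq_class; the helper name below is made up and not part of either library:

#include <gmpxx.h>
#include <flint/fmpq.h>

// Hypothetical helper converting a flint fmpq into a GMP mpq_class.
mpq_class fmpq_to_mpq_class(const fmpq_t q)
{
    // fmpq_get_mpq writes the value of q into an mpq_t; mpq_class exposes
    // its underlying mpq_t via get_mpq_t().
    mpq_class out;
    ::fmpq_get_mpq(out.get_mpq_t(), q);
    return out;
}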

I guess this depends on how you represent numbers generally in CSymPy. Also, how do you want to manage precision (assuming you want to use operator overloading, and not specify precision for each operation with a function call)? With global state, for each variable, or in some other way?

Note that flint has a rather fancy C++ interface (see fmpzxx.h for example) that does clever things such as eliminating redundant temporaries. It could probably be extended to arb types by someone with adequate knowledge of C++.

fredrik-johansson avatar Jul 31 '14 19:07 fredrik-johansson

The precision should be local to each floating point variable. I think that's the best design.

So it looks like we'll just create the class for that in CSymPy for now, and then later, if somebody has time, they can create a fancy fast C++ interface like the one Flint has.

certik avatar Jul 31 '14 22:07 certik

@certik I am writing a C++ wrapper for arb - the intended use would be in piranha, but so far it is pure C++11. It is in a very embryonic stage right now; it uses the arb 2.x interface and it manages precision on a per-variable basis (so it adds some extra size wrt a straight arb_t - 56 bytes vs 48 bytes on typical 64-bit archs, I think). Arithmetic operations are carried out with the largest precision among the operands.
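A rough illustration of that rule (purely a sketch; the arb_wrap layout below is a stand-in, not the actual piranha/arbpp code):

#include <algorithm>
#include <arb.h>

// Illustrative wrapper: an arb_t plus a per-variable precision member.
struct arb_wrap
{
    arb_wrap() : m_prec(53)
    {
        ::arb_init(m_arb);
    }
    arb_wrap(arb_wrap &&other) noexcept : m_prec(other.m_prec)
    {
        ::arb_init(m_arb);
        ::arb_swap(m_arb, other.m_arb);
    }
    ~arb_wrap()
    {
        ::arb_clear(m_arb);
    }
    ::arb_t m_arb;
    long m_prec; // per-variable working precision in bits
};

arb_wrap operator+(const arb_wrap &a, const arb_wrap &b)
{
    arb_wrap retval;
    // Carry out the addition with the largest precision among the operands,
    // and record that precision in the result.
    retval.m_prec = std::max(a.m_prec, b.m_prec);
    // arb_add rounds a + b to retval.m_prec bits.
    ::arb_add(retval.m_arb, a.m_arb, b.m_arb, retval.m_prec);
    return retval;
}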

I am not planning to do anything about expression templates (the fancy C++ interface Fredrik mentions) though, as in my experience (and for my uses anyway) they do not add much and they tend to mess up the type system.

I was also planning to add Python bindings eventually (using Boost.Python).

It would be nice to join forces if our goals and ideas converge :)

bluescarni avatar Aug 14 '14 15:08 bluescarni

@bluescarni yes, that would be awesome! I am not a fan of the fancy template stuff either. I usually use Cython for Python bindings, but that's not a big deal; mainly I am interested in the C++ wrappers. Did you start a new project for this? Just put it on GitHub, so that I can send pull requests and reference it from this issue, in case other people would like to contribute as well.

certik avatar Aug 14 '14 16:08 certik

@certik Here you go: https://github.com/bluescarni/arbpp

So a couple of notes:

  • it's just a sketch really so far :)
  • the precision member is so far used in the arithmetic operations (of which only binary plus is implemented) and for printing. Not sure about the latter, but the idea of this new member, in my mind, is that the user indicates that she's gonna be working with that precision on that variable, so it makes sense to print a number of digits connected to that;
  • there is an implicit cast operator to const arb_struct *, which lets you pass an arb object directly to the C API in read-only mode - kinda neat :) (a small sketch follows this list);
  • there are two ways of implementing special functions (e.g., cos() in the current code): either as a member function, or as an inline friend function that will be found via ADL when cos() is called unqualified. Not sure which one is better.
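Regarding the second point, here is a minimal sketch of what such a read-only cast operator can look like (an illustrative class, not the actual arbpp code):

#include <arb.h>

class arb_wrap
{
public:
    arb_wrap()
    {
        ::arb_init(m_arb);
    }
    ~arb_wrap()
    {
        ::arb_clear(m_arb);
    }
    // Read-only view of the wrapped arb_t, so the object can be passed
    // directly to const functions of the Arb C API.
    operator const ::arb_struct *() const
    {
        return m_arb;
    }
private:
    ::arb_t m_arb;
};

// Usage: arb_wrap x; int z = ::arb_is_zero(x); // x converts implicitly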

Regarding the bindings, I agree on keeping it plain C++ (then we can each build our own Python bindings separately).

Let me know what you think!

bluescarni avatar Aug 15 '14 00:08 bluescarni

@certik Another note: I know you care about compatibility with older GCC versions; let me know if you have problems compiling with GCC < 4.8 - better to catch these things early.

bluescarni avatar Aug 15 '14 00:08 bluescarni

@certik It should compile on GCC 4.6.x now.

bluescarni avatar Aug 15 '14 15:08 bluescarni

@certik any comments on the current design? It has the addition operators implemented now, the other arithmetic ops should follow the same pattern.

bluescarni avatar Aug 18 '14 10:08 bluescarni

So far I have just tested that it compiles. I have to play with it more.

certik avatar Aug 18 '14 13:08 certik

@fredrik-johansson @certik

The basic arithmetic operations are there now, plus a couple of extra things such as a constructor from string and a user-defined literal.

I am not totally sure if the way of handling the precision is satisfactory though, and I would appreciate some feedback before moving forward. Here follow some comments on the current design.

As it stands, each arb object has an associated precision m_prec member which is intended to be used to specify the desired precision of the operations involving that object. For instance, if one adds two arb objects with different precision values, the result will have (and will be computed with) the precision of the operand with the highest precision.

Additionally, the precision member is also used in the printing routine, on the grounds that the number of decimal digits you see on the screen is a hint of the internal precision associated with the object.

I think this works ok in general, but I am a bit unsure about how to handle the initial construction of objects. For instance, if one constructs an arb from a double or an integral value, currently the internal arf midpoint is set exactly to that double/integral value, and the precision member gets set to a global default value (currently 53 bits). This can in principle create a bit of confusion: if, for instance, one sets the global default precision to 4 bits, then inits an arb with 2**32 and prints it to screen, a decimal value with far fewer digits than the exact value stored internally is printed.

I am starting to think that it would be better, for the constructor from double/integral, to first construct the object "exactly" and then perform a rounding operation according to the current precision. The only downside is that one loses the ability to construct an object "exactly", but the rounding would in any case preserve the mathematical correctness of the operation.

Anyhow, there are some usage examples of the current interface here if you want to take a look:

https://github.com/bluescarni/arbpp/blob/master/tests/arb.cpp

bluescarni avatar Oct 16 '14 13:10 bluescarni

@bluescarni I think there should not be any global precision default. You should simply specify the precision at the point you construct the object. I would make it so that when you construct from double, the precision is 53 bits (or whatever the double precision is), when you construct from float, it's the float precision and so on. If you construct from integer, then the user needs to specify precision.

I don't understand the "rounding operation" proposal.

I opened a few issues (https://github.com/bluescarni/arbpp/issues/2, https://github.com/bluescarni/arbpp/pull/3, https://github.com/bluescarni/arbpp/issues/4).

certik avatar Oct 16 '14 15:10 certik

@certik This is the kind of algorithm that I had in mind for the generic ctor:

  • the constructor takes two parameters, an input value (floating-point or integral) and a precision in number of bits (possibly defaulting to a fixed value, e.g., 53),
  • initialise the internal arb_t member exactly using the input value and such functions as arb_set_si(), arb_set_arf(), etc.,
  • round the arb_t to the number of precision bits specified as a construction parameter, using the function arb_set_round(),
  • record the number of precision bits as internal member for future use.

Would this make sense to you? This way, the internal precision member is always consistent with the actual value stored in the arb_t.
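A rough translation of those steps into code (a sketch only, reusing the illustrative arb_wrap layout from earlier rather than the actual arbpp class):

#include <arb.h>

struct arb_wrap
{
    // Construct from an integral value, following the steps listed above.
    explicit arb_wrap(long value, long prec = 53) : m_prec(prec)
    {
        ::arb_init(m_arb);
        // 1) initialise the midpoint exactly to the input value,
        ::arb_set_si(m_arb, value);
        // 2) round to the requested precision; if prec bits cannot represent
        //    the value exactly, the radius grows so the ball still contains it,
        ::arb_set_round(m_arb, m_arb, prec);
        // 3) the precision is recorded for later arithmetic and printing.
    }
    ~arb_wrap()
    {
        ::arb_clear(m_arb);
    }
    ::arb_t m_arb;
    long m_prec;
};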

To give a concrete example, this is how it would work with the proposed approach:

std::cout << arb{123456789} << '\n';

This will print:

(1.2345678900000000e8 +/- 0.0000000000000000)

The internal arb_t first gets initialised exactly to 123456789 and then rounded to the default precision of 53 bits. In this case, 53 bits is enough to represent the integer exactly, so the radius of the ball is zero. This, instead:

std::cout << arb{123456789,10} << '\n';

will print:

(1.2334e8 +/- 1.3107e5)

As before, the arb_t first gets initialised exactly to 123456789, then it gets rounded to 10 bits. In this case 10 bits are not enough to represent the value exactly, so the rounding function assigns (I think) the smallest possible radius that includes the original value.

(Thanks for the tickets BTW)

bluescarni avatar Oct 16 '14 23:10 bluescarni

I see, this proposal makes sense. However, what's wrong with the current behavior? Can you show it on the examples you just posted?

certik avatar Oct 16 '14 23:10 certik

@certik In the existing code, the first constructor (the one with no precision specified) will not do any rounding, but the precision will be taken into account when printing the value or while doing mathematical operations on it. So, for instance,

arb{(1ul << 60u) + 1u}

will init the internal value exactly to 2**60 + 1, but its precision (when printing or when doing arithmetic with it) will be considered to be 53 bits. Indeed, printing that with the current code gives you:

(1.1529215046068470e18 +/- 0.0000000000000000)

This is misleading: the ball has zero radius, but with only a 53-bit mantissa we clearly cannot represent 2**60 + 1 exactly. With the new code this becomes:

(1.1529215046068470e18 +/- 2.5600000000000000e2)

which is the value after rounding.

Basically, with the first constructor there is a loophole that, in some sense, allows a disconnect between the stored value and the precision member.

Originally, my idea for the precision member was for it to be some kind of "declaration of intent" about the precision with which I would like operations on that object to be carried out. But then things like this started to pop up, and now I lean towards strong consistency between the stored value and the precision member.

(note that neither of the examples above is "wrong" mathematically: in one case we are representing an exact value, in the other a ball that includes the original value)

Does this make any sense?

bluescarni avatar Oct 16 '14 23:10 bluescarni

Hmm, I don't know. Issues like this are a major reason why I decided against putting the precision in numbers. The story is different with operator overloading, of course, where you really have to hide the precision away somewhere.

If you want to put an MPFR-like precision on each variable, then I think you should be consistent about rounding every operation, including assignments.

Assigning doubles and integers without rounding will usually be more efficient. You might want to provide alternative assignment methods, say set_exact(x, y), that changes the precision of x to the maximum of x's previous precision and the minimum precision required to represent y exactly. Presumably, this would mostly be used to create constants (reusing such a variable later when its precision has changed could obviously get confusing).
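A hedged sketch of what such a set_exact could look like for an integer input (set_exact is not an existing Arb or arbpp function; arb_wrap is the same illustrative layout as in the earlier sketches):

#include <algorithm>
#include <arb.h>
#include <flint/fmpz.h>

// Minimal illustrative wrapper layout.
struct arb_wrap
{
    ::arb_t m_arb;
    long m_prec;
};

// Assign y to x exactly, and raise x's precision to the maximum of its
// previous precision and the bits needed to represent y exactly.
void set_exact(arb_wrap &x, long y)
{
    fmpz_t t;
    ::fmpz_init_set_si(t, y);
    // fmpz_bits counts the bits of |y|; use at least 1 bit for y == 0.
    const long needed = std::max<long>(1, (long) ::fmpz_bits(t));
    ::fmpz_clear(t);
    ::arb_set_si(x.m_arb, y); // exact: midpoint = y, radius = 0
    x.m_prec = std::max(x.m_prec, needed);
}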

Another option worth considering would be to have each number point to a mutable context object that defines the current working precision (and perhaps other settings). In fact, I considered passing around context objects on the C level (and fetching the precision from the context object), but decided against it since 99% of the time passing around the precision would be enough. I would consider doing this in a Python wrapper, as it has proved to work quite well in mpmath (especially with the with: statement for context management), but I don't know how well context objects would work in C++. Worst case, context objects are just a euphemism for global state :-)

By the way, there's little point printing the radius with more than 3-5 digits, as it never has high precision. There are rare cases where you want to know precisely how large it is, but then it's usually better to print the upper/lower bounds of the whole interval.

fredrik-johansson avatar Oct 20 '14 13:10 fredrik-johansson

Hey Fredrik,

thanks for the comments!

I just pushed a fix for the assignment operator; I had indeed forgotten about that, thanks for reminding me. Now assignment is formally equivalent to construction from a generic value.

I like the idea of providing exact setting methods. Actually, I am wondering if it makes sense to have a special value for the precision parameter when constructing arb objects (say, prec = 0) that sets the internal precision to a value big enough to represent the input object exactly. Just to be clear, the bits necessary for exact representation would be the mantissa bits for floating-point types and the value bits (excluding the sign bit) for integral types?

I had thought about the context idea, but in general I don't like to introduce global state. In C++, you can implement a context object as a class that sets the global precision value on construction and resets it to the previous value on destruction (construction and destruction are deterministic and controlled by the scope of the context object). I think this could be tricky in multithreaded code: should the global precision be a thread-local object, or an atomic variable shared by all threads? Right now the global default precision is a constant; I was planning to turn it into an atomic variable to make its management safe from multiple threads.
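For what it's worth, a minimal sketch of that scoped-context idea, here using a thread-local default (all names are made up; nothing below is part of Arb or arbpp):

// A guard that sets a thread-local default precision on construction and
// restores the previous value on destruction.
namespace arbpp_sketch
{

thread_local long default_prec = 53;

class precision_guard
{
public:
    explicit precision_guard(long prec) : m_old(default_prec)
    {
        default_prec = prec;
    }
    ~precision_guard()
    {
        default_prec = m_old;
    }
    precision_guard(const precision_guard &) = delete;
    precision_guard &operator=(const precision_guard &) = delete;
private:
    long m_old;
};

} // namespace arbpp_sketch

// Usage:
// {
//     arbpp_sketch::precision_guard g(128);
//     // ... operations in this scope would pick up default_prec = 128 ...
// } // previous precision restored here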

Thanks for pointing out the limited precision of mag_t. I think it would be ok to force the printing routine to use only 30 bits?

bluescarni avatar Oct 20 '14 13:10 bluescarni

We built a little C++ wrapper for arb as part of flatsurf/exact-real. The basic layer, arb.hpp and arf.hpp, just gives you C++-style memory management for arb_t and arf_t. A second layer, yap/arb.hpp and yap/arf.hpp, provides basic arithmetic. I did not want to fix precisions, so a precision has to be specified when evaluating an arithmetic expression:

Arb x, y, z;
z = (x + y)(64);

When no precision is specified the result is just an expression tree that only exists at compile time:

z = x + y; // compile error
z = ((x + y) + z)(64); // ok.

You can see a more extensive demo in our tests.

There is also a Python wrapper building on this, using cppyy:

a = Arb(1)
b = (a + a)(64)

It's all very far from mature but if somebody likes the idea it might evolve into proper C++/Python bindings for arb.

saraedum avatar May 16 '19 02:05 saraedum

Thanks for sharing! Interesting choice of syntax.

fredrik-johansson avatar May 16 '19 07:05 fredrik-johansson