flint icon indicating copy to clipboard operation
flint copied to clipboard

arb_get_str() & arb_set_str() does not preserve value

Open saraedum opened this issue 3 years ago • 13 comments

Consider the following C program.

$ cat roundtrip.c
#include <arb.h>

int main() {
  arb_t a;
  arb_init(a);

  arb_load_str(a, "2bb573849d73 -2c 800031b 365d");

  arb_printn(a, 1024, ARB_STR_MORE);

  arb_set_str(a, arb_get_str(a, 1024, ARB_STR_MORE), 1024);

  arb_printn(a, 1024, ARB_STR_MORE);
}

When run (with Arb 2.22.1 from conda-forge) it prints:

[+/- 3.65e+4197][+/- 3.66e+4197]

I understand that arb_get_str() can introduce imprecision and I also understand that I should probably use arb_dump_str to serialize an arb_t; but shouldn't arb_set_str() preserve the radius?

saraedum avatar Mar 31 '22 16:03 saraedum

Neither arb_get_str nor arb_set_str says in their documentation that they preserve the radius. It only states that the old interval is contained in the new interval.

albinahlback avatar Mar 31 '22 16:03 albinahlback

@albinahlback Thanks, I am aware of that. I am not saying that the implementation breaks the documentation.

saraedum avatar Mar 31 '22 16:03 saraedum

Then what is your issue? You asked if arb_set_str preserves the radius, for which I gave an answer.

albinahlback avatar Mar 31 '22 16:03 albinahlback

My question was: Shouldn't arb_set_str() try to preserve the radius here?

saraedum avatar Mar 31 '22 16:03 saraedum

Note that this is somewhat related to flintlib/arb#391. In flintlib/arb#391 arb_get_str introduces an error, here arb_set_str introduces an error.

I wonder if both methods could be more careful when it's possible not to introduce any error.

saraedum avatar Mar 31 '22 16:03 saraedum

I have been trying to get this right for a while, testing various approaches and they all end like this:

using Serialization, ArbNumerics using Serialization: serialize_type using ArbNumerics: @libarb, @libflint

function Serialization.serialize(s::AbstractSerializer, x::ArbFloat) Serialization.serialize_type(s, typeof(x)) unsafestr = ccall(@libarb(arf_dump_str), Cstring, (Ref{ArbFloat},), x) len = Int64(ccall(:strlen, Csize_t, (Cstring,), unsafestr)) + 1 write(s.io, len) unsafe_write(s.io, pointer(unsafestr), len) ccall(@libflint(flint_free), Cvoid, (Cstring,), unsafestr) return nothing end

function Serialization.deserialize(s::AbstractSerializer, ::Type{ArbFloat{P}}) where {P} len = read(s.io, Int64)::Int64 str = Serialization.deserialize_string(s, Int(len)) # TODO: call arb_load_str to constuct arb_t from str x = ArbFloat{P}(0) ccall(@libarb(arf_load_str), Int, (Ref{ArbFloat{P}}, Ref{Cstring}), x, str) return x end

low_precision = 53 high_precision = 320

setworkingprecision(low_precision)

arf_val = log(ArbFloat(17))

2.83321334405621608

serialize_buffer = Serializer(IOBuffer())

Serializer{IOBuffer}(IOBuffer(data=UInt8[...], readable=true, writable=true, seekable=true, append=false,

size=0, maxsize=Inf, ptr=1, mark=-1), 0, IdDict{Any, Any}(),

Int64[], Dict{UInt64, Any}(), 26)

serialize(serialize_buffer, arf_val)

Serializer{IOBuffer}(IOBuffer(data=UInt8[...], readable=true, writable=true, seekable=true, append=false,

size=82, maxsize=Inf, ptr=83, mark=-1), 3, IdDict{Any, Any}(:ArbNumerics => 2, ArbFloat{64} => 0, :ArbFloat => 1),

Int64[], Dict{UInt64, Any}(), 26)

arf_val_recovered = deserialize(serialize_buffer, typeof(arf_val)) ERROR: EOFError: read end of file

JeffreySarnoff avatar Jan 28 '25 22:01 JeffreySarnoff

@fredrik-johansson can you transfer this issue to FLINT?

albinahlback avatar Jan 28 '25 23:01 albinahlback

@JeffreySarnoff can you provide a MWE in C for your issue? Also, please use code blocks to format the code.

albinahlback avatar Jan 29 '25 00:01 albinahlback

@saraedum So the issue is that when you use string methods, you convert between a binary and a decimal representation on an inexact ring. This forces a loss of information.

albinahlback avatar Jan 29 '25 00:01 albinahlback

@saraedum So the issue is that when you use string methods, you convert between a binary and a decimal representation on an inexact ring. This forces a loss of information.

It doesn't force a loss of information, though. It is true that you can't exactly represent one with the other, but with the additional context of the precision you can still keep all the same information. In particular, with ARB_STR_MORE the information in the midpoint is already adequately preserved; preserving the radius is what's at issue here.

d0sboots avatar Jul 08 '25 06:07 d0sboots

Here is an example showing one of the obstacles here. arb_set_str doesn't set the correctly rounded result, so (by necessity) it has to inflate the error bounds when setting. It is usually impossible to set a number with an exact bound with this interface.

#include <stdio.h>
#include <stdlib.h>
#include "flint/arb.h"

int main() {
  arb_t x, x2;
  double y;
  arb_init(x);
  arb_init(x2);
  arb_set_str(x, "1.1", 53);
  arb_set_str(x2, "1.1", 100);
  y = strtod("1.1", NULL);
  printf("Value: ");
  arb_printn(x, 20, ARB_STR_MORE);
  printf("\nTrue precision: %g\nDouble value: %.20g\nArb conversion error: ", mag_get_d(arb_radref(x)), y);
  mag_set_ui(arb_radref(x), 0);
  arb_sub(x, x2, x, 100);
  arb_printn(x, 7, ARB_STR_NO_RADIUS);
  printf("\nDouble conversion error: ");
  arb_set_d(x, y);
  arb_sub(x, x2, x, 100);
  arb_printn(x, 7, ARB_STR_NO_RADIUS);
  printf("\n");
  arb_clear(x);
  arb_clear(x2);
  return 0;
}
Value: [1.0999999999999998668 +/- 2.23e-16]
True precision: 2.22045e-16
Double value: 1.1000000000000000888
Arb conversion error: 1.332268e-16
Double conversion error: -8.881784e-17

d0sboots avatar Jul 08 '25 07:07 d0sboots

It's impossible to support all of these simultaneously:

  • The internal representation is binary
  • The printed representation is decimal
  • Decimal to binary conversion preserves intervals
  • Binary to decimal conversion preserves intervals
  • Binary to decimal conversion has bounded output precision
  • Roundtrip is exact

If you want exact roundtrips, you have to give up one of the others. For example, you can make decimal to binary conversion not preserve intervals so that arb("2.22045e-16") gives something like $4503607640442835 / 2^{104}$ (or 2.2204499999999999214662505469145610272812811527946841305691805246169678866863250732421875e-16). But this is going to bite users who want to do interval arithmetic.

The best way to get exact roundtrips currently is to use arb_load_str / arb_dump_str.

We ought to add hex support to arb_get_str and arb_set_str as a more accessible option.

fredrik-johansson avatar Jul 08 '25 09:07 fredrik-johansson

Yes, it is impossible to completely square the circle. There are some improvements that can be made generically, but IMO the best thing is to provide more choice, particularly to avoid breaking existing users. A proposal I was thinking of working on was the flag ARB_STR_EXACT (possibly a combination of ARB_STR_MIDPOINT_EXACT and ARB_STR_RADIUS_EXACT, also better naming suggestions welcome) that outputs the exact radius (to the number of digits required to reconstruct it) and also raises the limit on the number of midpoint digits so that it also has enough precision to reconstruct. This would be similar to printd, except it would be usable in more contexts (sometimes you want a string, not fileio). I think that would still preserve intervals when round-tripping through set_str, at least when done at the same precision, but the inherent decimal value would not be interval preserving.

d0sboots avatar Jul 10 '25 06:07 d0sboots