metallic
parsefloat_ / strtod can't pass special tests
I added extra tests in test/native/stdlib/strtod.c which unfortunately fail:
assert(parsefloat_(".2470328229206232720882843964341106861825299013071623822127928412503377536351044e-323", (void*)0) == 0x1p-1074);
// FAIL, actual: 0
assert(parsefloat_(".7410984687618698162648531893023320585475897039214871466383785237510132609053131e-323", (void*)0) == 0x1p-1074);
// FAIL, actual: 0
assert(parsefloat_(
"17976931348623158079372897140530341507993413271003782693617377898044 \
49682927647509466490179775872070963302864166928879109465555478519404 \
02630657488671505820681908902000708383676273854845817711531764475730 \
27006985557136695962284291481986083493647529271907416844436551070434 \
2711559699508093042880177904174497792", (void*)0) == INFINITY);
// FAIL, actual: 1.7976931348623157e+308
Correctly rounded decimal to binary conversion was not planned, but I am taking this as a feature request. I will work on this after finishing its inverse in vfprintf because

- Of this issue
- I am already working on correctly rounded binary to decimal conversion for vfprintf
- musl implements correctly rounded binary <-> decimal conversion
Current implementation is still conforming though, as

binary -(native libc)-> decimal (DBL_DECIMAL_DIG digits) -(metallic)-> binary

is an identity conversion.
Hmm, but it seems it rounds correctly in most cases. We just need to fix the boundary (lower and upper) cases.
For IEEE-754 double,

DBL_DECIMAL_DIG = 17
DBL_DIG = 15

This means the following are identity conversions:

double -> 17 decimal digits -> double
15 decimal digits -> double -> 15 decimal digits
The current implementation only ensures that the first 15 digits are done right because I wanted to reduce stack usage, but I changed my mind. :P
Is it necessary to modify the existing algorithm, or to reimplement it completely?)
It is almost a rewrite to make it correctly rounded. To avoid double rounding errors and to reduce code size, a common strategy is:

decimal string -(round to odd)-> intermediate binary -(round to nearest)-> target binary
The intermediate binary format has more significand bits (e.g. 128) than long double (112) in order to reuse the first conversion for all targets (float, double, long double).
Rounding to odd is to round all non-representable numbers to the nearest odd representation, so even representations must be exact. Why this algorithm works is explained in the following study:
Sylvie Boldo, Guillaume Melquiond. “When double rounding is odd”. 17th IMACS World Congress, Jul 2005, Paris, France. pp.11. ⟨inria-00070603v2⟩ (backup)
Sounds reasonable. Currently, instead of "decimal string -(round to odd)->", you just set a nonzero bit for anything outside the consumed range. But it seems this also needs 128-bit fixed-point arithmetic. I have decided to reuse some parts of your library for parsing floats on our side, and it is really interesting to have a similarly simple (and compact) approach =)
Setting the lowest bit when seeing a nonzero digit out of range is effectively rounding to odd, so it produces correctly rounded results as long as no scaling by a power of 10 is involved.
The plan is to extend parsedec_ like
struct BigFloat
{
unsigned __int128 significand;
int exp;
};
struct BigFloat __parsedec(
const char s[restrict static 1],
char* end[restrict static 1]);
and convert BigFloat to the target type in parsedec_. Rounding unsigned __int128 to the nearest floating-point value is already done by __floatunti[sdt]f.
I'd give __parsedec external linkage to reuse across target types, so it has a reserved name.