perl5 icon indicating copy to clipboard operation
perl5 copied to clipboard

Make S_lossless_NV_to_IV consistent with SvIV_please_nomg

Open t-a-k opened this issue 1 year ago • 8 comments

I found that some arithmetic operations involving NV whose absolute value is larger than (1 << NV_PRESERVES_UV_BITS) have some inconsistencies:

% perl -le 'print 0x1p60 + 0'
1.15292150460685e+18
% perl -le 'print 0x1p60 + 0.0'
1152921504606846976

In the former case NV + IV is NV (I think this is correct because 0x1p60 is not accurate integer in 64-bit NV), but in the latter case while NV + NV becomes IV.

Another symptom is that such operation would return different result on the second and later time:

% perl -le 'sub foo { $_[0] + 1 } print foo(1e+18); print foo(1e+18); print foo(1e+18)'
1e+18
1000000000000000001
1000000000000000001
%

This PR will fix these by making S_lossless_NV_to_IV() more consistent with SvIV_please_nomg() which will not implicitly convert such NVs to IVs.

t-a-k avatar Jan 12 '24 15:01 t-a-k

@sisyphus please look

khwilliamson avatar Jan 12 '24 21:01 khwilliamson

I've built this -t-a-k:dont-convert-inaccurate-NV-to-IV branch under a few different config options and found it to fix the 2 test cases presented in this PR (and tested in t/op/numconvert.t.) Furthermore, I've not been able to find any instances where it breaks existing expectations. So I'd have to say that it LGTM. (The demonstrated current behaviour is ridiculous.)

If I'm sounding a little unenthusiastic about this, it's only because (IMO) a perl that doesn't define NV_PRESERVES_UV suffers a design flaw - for which the only satisfactory solution (IMO) is to instead use a perl that does define NV_PRESERVES_UV.

In any case, it seems to me that merging this PR is the right thing to do ... unless someone else can find something I've missed.

sisyphus avatar Jan 13 '24 13:01 sisyphus

It LGTM too, but i will defer merging to give others a chance to chime in

khwilliamson avatar Jan 13 '24 23:01 khwilliamson

The following example (from the OP) puzzled me:

>perl -le "sub foo { $_[0] + 1 } print foo(1e+18); print foo(1e+18); print foo(1e+18)"
1e+18
1000000000000000001
1000000000000000001

I couldn't work out how that behaviour occurred - so I asked about it on perlmonks (https://www.perlmonks.org/?node_id=11157045), thinking that the answer would have more to do with the density between my ears, rather than anything else. But there has been no definitive explanation given there, and the possibility of there being a perl bug has also been raised. So, I'll ask 2 questions here:

  1. How do we explain that behaviour ?
  2. How does the patch proposed by this PR fix the behaviour ?

UPDATE: In the perlmonks thread I presented the issue a little differently as:

perl -le "for(0x1p+60, 0x1p+60, 0x1p+60) { print $_ + 1 }"
1.15292150460685e+18
1152921504606846977
1152921504606846977

Interestingly, with my perl built from the t-a-k:dont-convert-inaccurate-NV-to-IV branch, this slightly different one-liner behaves a little differently from the one-liner demo that @t-a-k provided. With @t-a-k's one liner:

>perl -le "sub foo { $_[0] + 1 } print foo(1e+18); print foo(1e+18); print foo(1e+18)"
1e+18
1e+18
1e+18

With the one-liner I posted on perlmonks:

>perl -le "for(0x1p+60, 0x1p+60, 0x1p+60) { print $_ + 1 }"
1.15292150460685e+18
1.15292150460685e+18
1.15292150460685e+18

sisyphus avatar Jan 18 '24 03:01 sisyphus

If we show the flags on svl and svr at the start of pp_add for the first iteration we see:

svl->sv_flags = 0x8012202 = SVt_NV | SVf_NOK | SVp_NOK | SVf_PROTECT | SVf_READONLY
svr->sv_flags = 0x8011101 = SVt_IV | SVf_IOK | SVp_IOK | SVf_PROTECT | SVf_READONLY

Assuming PERL_PRESERVE_IVUV we then enter the "special-case some simple common cases" block and find that they are neither both IV nor both NV, so drop through to the generic case. It then tries to check if it can treat svl as an IV with SvIV_please_nomg(svl) which returns false, so decides it must use NV arithmetic.

Treating the constant 1 as an NV then caches the result, so for the second operation, the flags on svr have changed:

svr->sv_flags = 0x8013306
 = SVt_PVNV | SVf_IOK | SVf_NOK | SVp_IOK | SVp_NOK | SVf_PROTECT | SVf_READONLY

.. so the second time through it decides they are both NV, sees that they both qualify as lossless_NV_to_IV, so goes to the do_iv handling.

It all seems quite messy. I think I'd hope it would look something more like:

  if (SvIV_please_nomg(svl) && SVIV_please_nomg(svr))
    /* use IV arithmetic */
  else
    /* use NV arithmetic */

.. but I agree that lossless_NV_to_IV should at the very least be consistent with SvIV_please_nomg, so if that's what this commit achieves it's a step forward.

I guess another question is why lossless_NV_to_IV is used here at all: if the conversion is lossless we can presumably cache it and set at least SVp_IOK, so wouldn't it always be preferable to do that?

hvds avatar Jan 18 '24 16:01 hvds

@sisyphus @t-a-k any response to the idea of using SVp_IOK

khwilliamson avatar Feb 08 '24 16:02 khwilliamson

any response to the idea of using SVp_IOK

Sounds sane to me, and I'm happy enough to test any proposed implementation. I have quite a bit of code that examines (and utilizes) the numeric flags of scalars. I think it would provide a chance of catching unforeseen consequences.

But I'm not interested in writing such an implementation. As I hinted above, IMO, the way to deal with issues arising from having a perl whose NV does not preserve all IVs is to use a perl whose NV always does preserve IV. (That is, "solution" by "avoidance".)

sisyphus avatar Feb 09 '24 01:02 sisyphus

any response to the idea of using SVp_IOK

I think that there is a trade-off here because caching IV will require upgrading to PVNV if that SV was a pure NV (as in @hvds's example above). It will also make the "special-case some simple common cases" more complex and eventually make this specialized code not so beneficial IMO. I will think about it a little more anyway...

t-a-k avatar Feb 09 '24 02:02 t-a-k

@t-a-k @hvds Any further thoughts?

khwilliamson avatar Apr 14 '24 14:04 khwilliamson

@t-a-k @hvds Any further thoughts?

Nothing new from me; I'll stick with my earlier comment "I agree that lossless_NV_to_IV should at the very least be consistent with SvIV_please_nomg, so if that's what this commit achieves it's a step forward".

hvds avatar Apr 14 '24 21:04 hvds

Thank you for merging.

I think that extending these "simple common case" to cache converted IV will make it complicated and closer to the generic case, so it might be better yet to remove else if (flags & SVf_NOK) { ... } block and leave it to the generic case. But I think such removal should be another issue (or PR) anyway.

t-a-k avatar Apr 15 '24 17:04 t-a-k