perl5
perl5 copied to clipboard
Make S_lossless_NV_to_IV consistent with SvIV_please_nomg
I found that some arithmetic operations involving NV whose absolute value is larger than (1 << NV_PRESERVES_UV_BITS) have some inconsistencies:
% perl -le 'print 0x1p60 + 0'
1.15292150460685e+18
% perl -le 'print 0x1p60 + 0.0'
1152921504606846976
In the former case NV + IV is NV (I think this is correct because 0x1p60 is not accurate integer in 64-bit NV), but in the latter case while NV + NV becomes IV.
Another symptom is that such operation would return different result on the second and later time:
% perl -le 'sub foo { $_[0] + 1 } print foo(1e+18); print foo(1e+18); print foo(1e+18)'
1e+18
1000000000000000001
1000000000000000001
%
This PR will fix these by making S_lossless_NV_to_IV() more consistent with SvIV_please_nomg() which will not implicitly convert such NVs to IVs.
@sisyphus please look
I've built this -t-a-k:dont-convert-inaccurate-NV-to-IV branch under a few different config options and found it to fix the 2 test cases presented in this PR (and tested in t/op/numconvert.t.)
Furthermore, I've not been able to find any instances where it breaks existing expectations.
So I'd have to say that it LGTM. (The demonstrated current behaviour is ridiculous.)
If I'm sounding a little unenthusiastic about this, it's only because (IMO) a perl that doesn't define NV_PRESERVES_UV suffers a design flaw - for which the only satisfactory solution (IMO) is to instead use a perl that does define NV_PRESERVES_UV.
In any case, it seems to me that merging this PR is the right thing to do ... unless someone else can find something I've missed.
It LGTM too, but i will defer merging to give others a chance to chime in
The following example (from the OP) puzzled me:
>perl -le "sub foo { $_[0] + 1 } print foo(1e+18); print foo(1e+18); print foo(1e+18)"
1e+18
1000000000000000001
1000000000000000001
I couldn't work out how that behaviour occurred - so I asked about it on perlmonks (https://www.perlmonks.org/?node_id=11157045), thinking that the answer would have more to do with the density between my ears, rather than anything else. But there has been no definitive explanation given there, and the possibility of there being a perl bug has also been raised. So, I'll ask 2 questions here:
- How do we explain that behaviour ?
- How does the patch proposed by this PR fix the behaviour ?
UPDATE: In the perlmonks thread I presented the issue a little differently as:
perl -le "for(0x1p+60, 0x1p+60, 0x1p+60) { print $_ + 1 }"
1.15292150460685e+18
1152921504606846977
1152921504606846977
Interestingly, with my perl built from the t-a-k:dont-convert-inaccurate-NV-to-IV branch, this slightly different one-liner behaves a little differently from the one-liner demo that @t-a-k provided.
With @t-a-k's one liner:
>perl -le "sub foo { $_[0] + 1 } print foo(1e+18); print foo(1e+18); print foo(1e+18)"
1e+18
1e+18
1e+18
With the one-liner I posted on perlmonks:
>perl -le "for(0x1p+60, 0x1p+60, 0x1p+60) { print $_ + 1 }"
1.15292150460685e+18
1.15292150460685e+18
1.15292150460685e+18
If we show the flags on svl and svr at the start of pp_add for the first iteration we see:
svl->sv_flags = 0x8012202 = SVt_NV | SVf_NOK | SVp_NOK | SVf_PROTECT | SVf_READONLY
svr->sv_flags = 0x8011101 = SVt_IV | SVf_IOK | SVp_IOK | SVf_PROTECT | SVf_READONLY
Assuming PERL_PRESERVE_IVUV we then enter the "special-case some simple common cases" block and find that they are neither both IV nor both NV, so drop through to the generic case. It then tries to check if it can treat svl as an IV with SvIV_please_nomg(svl) which returns false, so decides it must use NV arithmetic.
Treating the constant 1 as an NV then caches the result, so for the second operation, the flags on svr have changed:
svr->sv_flags = 0x8013306
= SVt_PVNV | SVf_IOK | SVf_NOK | SVp_IOK | SVp_NOK | SVf_PROTECT | SVf_READONLY
.. so the second time through it decides they are both NV, sees that they both qualify as lossless_NV_to_IV, so goes to the do_iv handling.
It all seems quite messy. I think I'd hope it would look something more like:
if (SvIV_please_nomg(svl) && SVIV_please_nomg(svr))
/* use IV arithmetic */
else
/* use NV arithmetic */
.. but I agree that lossless_NV_to_IV should at the very least be consistent with SvIV_please_nomg, so if that's what this commit achieves it's a step forward.
I guess another question is why lossless_NV_to_IV is used here at all: if the conversion is lossless we can presumably cache it and set at least SVp_IOK, so wouldn't it always be preferable to do that?
@sisyphus @t-a-k any response to the idea of using SVp_IOK
any response to the idea of using
SVp_IOK
Sounds sane to me, and I'm happy enough to test any proposed implementation. I have quite a bit of code that examines (and utilizes) the numeric flags of scalars. I think it would provide a chance of catching unforeseen consequences.
But I'm not interested in writing such an implementation. As I hinted above, IMO, the way to deal with issues arising from having a perl whose NV does not preserve all IVs is to use a perl whose NV always does preserve IV. (That is, "solution" by "avoidance".)
any response to the idea of using
SVp_IOK
I think that there is a trade-off here because caching IV will require upgrading to PVNV if that SV was a pure NV (as in @hvds's example above). It will also make the "special-case some simple common cases" more complex and eventually make this specialized code not so beneficial IMO. I will think about it a little more anyway...
@t-a-k @hvds Any further thoughts?
@t-a-k @hvds Any further thoughts?
Nothing new from me; I'll stick with my earlier comment "I agree that lossless_NV_to_IV should at the very least be consistent with SvIV_please_nomg, so if that's what this commit achieves it's a step forward".
Thank you for merging.
I think that extending these "simple common case" to cache converted IV will make it complicated and closer to the generic case, so it might be better yet to remove else if (flags & SVf_NOK) { ... } block and leave it to the generic case. But I think such removal should be another issue (or PR) anyway.