perl5
proof of concept/performance test for use float
This is an attempt at #17813
I tested performance with a simple mandelbrot set generator (on my old CPU):
tony@mars:.../git/perl2$ time ./perl -Ilib ../mandel.pl
real 0m25.752s
user 0m25.612s
sys 0m0.132s
tony@mars:.../git/perl2$ time ./perl -Ilib -Mfeature=float ../mandel.pl
real 0m19.751s
user 0m19.742s
sys 0m0.004s
I see two upvotes - did anyone else try benchmarking this on more useful code?
I ask to see if it's worth developing this further.
I implemented this as a feature, but it doesn't really belong there: it's not a language feature as such, and it shouldn't be enabled by a feature version bundle.
I'm hesitant to use a hints bit since we're fairly short on them.
Simply using an entry in %^H has the same problem that it did for the indirect feature before features were cached in cop_features: we'd be adding a hash lookup for every binop or unop generated.
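To make the cost difference concrete, here is a minimal C sketch. Everything in it is hypothetical: `HINT_FLOAT_DEMO`, `struct demo_cop` and both lookup functions are stand-ins for illustration, not perl's real `PL_hints`, `COP` or `%^H` code, and a linear string search stands in for real hashing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical bit value; not a real perl hint bit. */
#define HINT_FLOAT_DEMO 0x00000001u

/* Stand-in for one key/value entry in a %^H-like lexical hints hash. */
struct demo_hint_entry {
    const char *key;
    int value;
};

/* Stand-in for the compile-time state attached to a statement. */
struct demo_cop {
    uint32_t hints;                    /* bit flags */
    const struct demo_hint_entry *hh;  /* key/value hints */
    size_t hh_count;
};

/* Cheap check: a single mask test. */
static int float_enabled_by_bit(const struct demo_cop *cop)
{
    assert(cop != NULL);
    return (cop->hints & HINT_FLOAT_DEMO) != 0;
}

/* Costlier check: string comparisons for every op that asks. */
static int float_enabled_by_hash(const struct demo_cop *cop)
{
    assert(cop != NULL);
    for (size_t i = 0; i < cop->hh_count; i++) {
        if (strcmp(cop->hh[i].key, "float") == 0)
            return cop->hh[i].value != 0;
    }
    return 0;
}
```

Both functions answer the same question; the point is only that the second one does far more work each time an op is compiled.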
Maybe it could be implemented as a feature, but not included in the all feature set, and not documented in feature.pm.
This still seems worthwhile to me, but none of my useful code really uses float math, so I have nothing handy to benchmark.
Note that we recovered some hint bits with 5d1739474d967de1ab8a8f88aa5eff250dbc0eab, so maybe it's fine to steal one bit for float?
I've not tested/benchmarked this on other code.
That recovered only a single bit which is now assigned to the feature mask, where it belongs.
Maybe we just need another hints word.
@tonycoz , @richardleach , @atoomic, Can we get an update on the status of this p.r.?
Thank you very much. Jim Keenan
It's waiting on (likely) adding another hints word.
But I think that needs to wait on reducing the cost of COPs which those are embedded into.
Right now a COP is generated for every statement, but the information in each COP typically doesn't change much except for the line number. I've looked at adding an alternative COP which only has a line number, but this will break some backward compatibility at the XS level.
I noticed that the regular versions of these functions do:
+ TARGn(left * right, 0);
+ SETs( TARG );
rather than:
- SETn( left * right );
to try harder to avoid calling sv_setnv_mg.
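The difference can be modelled outside perl. The sketch below is a toy model, not perl source: `demo_sv`, `demo_set_nv_slow` and `demo_targn` are hypothetical stand-ins for an SV, `sv_setnv_mg` and the `TARGn` fast path. It assumes the fast path stores the number directly when the target is already a plain NV and falls back to the general setter otherwise.

```c
#include <assert.h>
#include <stddef.h>

enum demo_type { DEMO_IV, DEMO_NV, DEMO_PV };

/* Toy scalar: just enough state to show the two paths. */
struct demo_sv {
    enum demo_type type;
    int is_magical;   /* stands in for magic/"think first" flags */
    double nv;
    int slow_calls;   /* counts trips through the general setter */
};

/* General setter: copes with any target (stand-in for sv_setnv_mg). */
static void demo_set_nv_slow(struct demo_sv *sv, double n)
{
    assert(sv != NULL);
    sv->type = DEMO_NV;   /* "upgrade" the target to hold a number */
    sv->nv = n;
    sv->slow_calls++;
}

/* Fast path first: a plain NV target gets a direct store. */
static void demo_targn(struct demo_sv *targ, double n)
{
    assert(targ != NULL);
    if (targ->type == DEMO_NV && !targ->is_magical)
        targ->nv = n;
    else
        demo_set_nv_slow(targ, n);
}
```

In this model only the first store into a fresh target takes the slow path; later stores into the same target skip the function call, which is the saving being pointed out.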
On Tue, 26 Jan 2021 at 23:32, Tony Cook @.***> wrote:
It's waiting on (likely) adding another hints word.
But I think that needs to wait on reducing the cost of COPs which those are embedded into.
Right now a COP is generated for every statement, but the information in each COP typically doesn't change much except for the line number. I've looked at adding an alternative COP which only has a line number, but this will break some backward compatibility at the XS level.
I'd like to hear more about this, as it aligns with my interest in improving the quality of our error messages. If I can do any legwork here, I'd be happy to hear an appraisal of the problem to get started with. Just mail me personally. You know where. :-)
Yves
-- perl -Mre=debug -e "/just|another|perl|hacker/"
I've stalled on this a bit (error: stack overflow), but I did get a "small COP" largely implemented, and I don't remember getting any crashes. I still needed to update caller() to understand the new COPs.
There may have been other problems though. I wasn't comfortable with the way I was detecting whether a small COP was possible, e.g. with code like:
line1;
line2;
if (...) { #line3
    line4;
    line5;
    no strict '...';
    line7;
}
line9;
line10;
lines 1, 4, 7, 9 needed full COPs, and I hadn't gotten to the point of checking that was happening when it should.
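One way to picture the check is the C sketch below. The rule it uses is an assumption that happens to reproduce the example above, not what the branch actually does: a statement gets a full COP when it is the first statement seen, when its block depth differs from the previous statement's, or when a pragma has changed the compile-time state since the previous statement. `demo_stmt` and `demo_mark_full_cops` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* One statement as the compiler might see it (hypothetical model). */
struct demo_stmt {
    int line;
    int depth;     /* block nesting depth */
    int state_id;  /* changes whenever a pragma alters hints/warnings */
};

/* Sets needs_full[i] to 1 when statement i needs a full COP, else 0. */
static void demo_mark_full_cops(const struct demo_stmt *s, size_t n,
                                int *needs_full)
{
    assert(s != NULL && needs_full != NULL);
    for (size_t i = 0; i < n; i++) {
        needs_full[i] = (i == 0)
            || s[i].depth != s[i - 1].depth
            || s[i].state_id != s[i - 1].state_id;
    }
}
```

Fed the statements on lines 1, 2, 3, 4, 5, 7, 9 and 10 of the example (line 6 is the pragma and line 8 the closing brace), this marks lines 1, 4, 7 and 9 as needing full COPs.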
Even without adding a small COP we could improve memory usage a great deal by reference counting cop_warnings, and I think cop_file on threads. These are profligate users of memory: each cop has its own copy.
In theory it should be pretty easy to use PL_strtab to do that if they are write-once. I will take a look. Do you have a branch for your small cop work?
It's very hacky and incomplete (and probably just plain broken), but https://github.com/Perl/perl5/tree/tonyc/less-cop
On Thu, 27 Oct 2022 at 00:26, Tony Cook @.***> wrote:
Do you have a branch for your small cop work?
It's very hacky and incomplete (and probably just plain broken), but https://github.com/Perl/perl5/tree/tonyc/less-cop
Nice. For what it's worth, I've been looking at replacing cop_file with a HEK, which would allow the same code to be used to share the pv, threads or otherwise.
Yves
-- perl -Mre=debug -e "/just|another|perl|hacker/"
On Sun, Sep 04, 2022 at 06:33:28PM -0700, Tony Cook wrote:
Even without adding a small COP we could improve memory usage a great deal by reference counting cop_warnings, and I think cop_file on threads, these are profligate users of memory - each cop has it's own copy.
An alternative approach perhaps would be to move most of the COP fields out to a separate ref-counted struct shared by each of the COPs in a sequence, where those fields haven't changed, with each COP reduced to little more than cop_line plus a pointer to the new struct.
-- Never do today what you can put off till tomorrow.
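A rough C sketch of that alternative layout follows. The struct and function names are hypothetical and the shared struct carries only two illustrative fields; a real COP has more state and would use perl's own allocators rather than `malloc`.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical: fields that rarely change from statement to statement. */
struct demo_cop_shared {
    unsigned refcount;
    const char *file;
    unsigned hints;
};

/* Hypothetical slimmed-down COP: a line number plus a pointer. */
struct demo_small_cop {
    unsigned line;
    struct demo_cop_shared *shared;
};

/* Allocates a shared block with no owners yet; NULL on failure. */
static struct demo_cop_shared *demo_shared_new(const char *file,
                                               unsigned hints)
{
    struct demo_cop_shared *sh = malloc(sizeof *sh);
    if (sh == NULL)
        return NULL;
    sh->refcount = 0;
    sh->file = file;
    sh->hints = hints;
    return sh;
}

/* Points a COP at the shared block and takes a reference. */
static void demo_cop_init(struct demo_small_cop *cop, unsigned line,
                          struct demo_cop_shared *sh)
{
    assert(cop != NULL && sh != NULL);
    cop->line = line;
    cop->shared = sh;
    sh->refcount++;
}

/* Drops the COP's reference; returns 1 if the shared block was freed. */
static int demo_cop_release(struct demo_small_cop *cop)
{
    assert(cop != NULL && cop->shared != NULL);
    struct demo_cop_shared *sh = cop->shared;
    cop->shared = NULL;
    if (--sh->refcount == 0) {
        free(sh);
        return 1;
    }
    return 0;
}
```

Consecutive statements with unchanged state would all point at one `demo_cop_shared`, so the per-statement cost drops to the line number and one pointer.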
I have compiled perl from your branch and tested it on two pieces of code that use some float calculations.
First one is Algorithm::QuadTree::PP, which uses some (not much) float math in its circular shape finding routine. No improvement was seen.
The second one is more math-heavy, as it tries to find all border coordinates for a line segment. The heart of the function is implemented as follows:
my $coeff_x = ($position2->[1] - $position1->[1]) / ($position2->[0] - $position1->[0]);
my $checks_for_x = sub ($pos_x) {
    state $partial = $position1->[1] - $position1->[0] * $coeff_x;
    my $pos_y = $partial + $pos_x * $coeff_x;
    return ([$pos_x, $pos_y], [$pos_x - 1, $pos_y]);
};
my $checks_for_y = sub ($pos_y) {
    state $partial = $position1->[0] - $position1->[1] / $coeff_x;
    my $pos_x = $partial + $pos_y / $coeff_x;
    return ([$pos_x, $pos_y], [$pos_x, $pos_y - 1]);
};
my @coords = (
    (map { $checks_for_x->($_) } $position1->[0] + 1 .. $position2->[0]),
    (map { $checks_for_y->($_) } $position1->[1] + 1 .. $position2->[1])
);
Those two anonymous coderefs are then run for each integer coordinate of x and y. They are called about 20 times each and the entire function runs 40 thousand times per second, but I see no improvement on the benchmark if the function starts with use feature 'float'; (I expect this feature works in lexical scope).
I don't think I have anything else at the moment that has more float math in it.
I suspect sub call overhead is drowning the math costs.
From memory I used the following to benchmark it:
use strict;
my $max_iter = 100;
++$|;
for my $iy (0 .. 1000) {
    my $y = -1 + 0.002 * $iy;
    for my $ix (0 .. 1000) {
        my $x = -1 + 0.002 * $ix;
        my $i = 0;
        my $xo = $x;
        my $yo = $y;
        my $iter = 0;
        while ($xo * $xo + $yo * $yo <= 10 && ++$iter < $max_iter) {
            ($xo, $yo) = ( $xo * $xo - $yo * $yo + $x, 2 * $xo * $yo + $y);
        }
    }
    print ".";
}
print "\n";
which I probably adapted from a C sample in Imager.
With all math commented out (but variable declarations etc. left in), it runs about 20% faster, so I assume math takes about 16% of its runtime. When benchmarking your code I see 20-40% improvement, which would mean my code should run about 5-10% faster (taking into account your code also spends some of its runtime assigning variables etc.). You're right, that might not be enough to show on a benchmark.
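For anyone wanting to redo this estimate with their own numbers, the arithmetic can be sketched in C as below. Framing it with Amdahl's law is an assumption on my part rather than the method used above, and both function names are hypothetical.

```c
#include <assert.h>

/* Fraction of runtime spent on math, given how many times faster the
 * program runs with the math removed (1.2 means "20% faster"). */
static double demo_math_fraction(double speedup_without_math)
{
    assert(speedup_without_math >= 1.0);
    return 1.0 - 1.0 / speedup_without_math;
}

/* Amdahl's law: overall speedup when a fraction f of the runtime
 * becomes s times faster and the rest is unchanged. */
static double demo_overall_speedup(double f, double s)
{
    assert(f >= 0.0 && f <= 1.0 && s > 0.0);
    return 1.0 / ((1.0 - f) + f / s);
}
```

With the 20% figure, `demo_math_fraction(1.2)` gives roughly 0.167, matching the 16% estimate. Under this model the math itself would need to get about 1.4 times faster for the whole function to run about 5% faster, which supports the conclusion that the gain could be lost in benchmark noise.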
@tonycoz - I implemented RCPV filename and warnings bits, so we have reduced the size of cops considerably (all together). Maybe we can reconsider making the hints bits bigger now?
Anyway, this PR is old and in conflict. Maybe we should get it rebased so it can be reconsidered?
I'll look at rebasing it, though probably not today.
I'll look at the extra hints word too, though I'm not sure we'll store it for eval (see where doeval_compile() initializes PL_hints).
It looks like you ran a performance test on a mandelbrot set generator in Perl, comparing the performance of using float versus not using float. The test showed that using float improved the performance by about 6 seconds, with the script running in 19.751 seconds with the float option versus 25.752 seconds without it.