perl5
COWification seems expensive in PADMY variables
From @Leont
This is a bug report for perl from fawaka@gmail.com, generated with the help of perlbug 1.40 running under perl 5.20.0.
It appears that in 5.20 returning a string from a lexical variable is significantly slower than returning a dereferenced temporary value (return ${$foo}), while on previous versions without COW they were equally slow. This slowdown does not happen when the same function is called in void context or if the subroutine is :lvalue.
I would guess this is because the return creates a new temporary value out of the padmy variable, and this copy is not done as a COW and thus expensive. IMO this is rather unfortunate.
See attached script for a benchmark
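The attached script is not reproduced in this thread. As a rough sketch of the kind of comparison described above, something like the following could be used. The labels simple, ref and lvalue are taken from the benchmark output later in the thread; the sub bodies, the payload size and the helper names (ref_deref, %void_ctx, %scalar_ctx) are assumptions, not the original code.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Assumed payload size; the original script's value is not shown in the thread.
my $size = 100_000;

# Return a plain lexical (PADMY) variable.
sub simple { my $buf = 'x' x $size; return $buf }

# Return the same data through a dereferenced reference.
sub ref_deref { my $buf = 'x' x $size; my $foo = \$buf; return ${$foo} }

# Same body as simple(), but declared :lvalue.
sub lvalue :lvalue { my $buf = 'x' x $size; $buf }

my %subs = (simple => \&simple, ref => \&ref_deref, lvalue => \&lvalue);

# Void-context callers: the return value is discarded.
my %void_ctx = map { my $f = $subs{$_}; ($_ => sub { $f->(); 1 }) } keys %subs;

# Scalar-context callers: the return value is assigned to a variable.
my %scalar_ctx = map { my $f = $subs{$_}; ($_ => sub { my $r = $f->(); 1 }) } keys %subs;

# Run the comparison only when an iteration count is given on the command line.
if (@ARGV) {
    cmpthese($ARGV[0], \%void_ctx);
    cmpthese($ARGV[0], \%scalar_ctx);
}
```

Run as, for example, `perl bench.pl 300000`; the first table then covers void context and the second scalar context.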
Flags: category=core severity=low
Site configuration information for perl 5.20.0:
Configured by leon at Tue May 27 11:32:58 CEST 2014.
Summary of my perl5 (revision 5 version 20 subversion 0) configuration:
Platform: osname=linux, osvers=3.11.0-20-generic, archname=x86_64-linux-thread-multi uname='linux leon-laptop 3.11.0-20-generic #35-ubuntu smp fri may 2 21:32:49 utc 2014 x86_64 x86_64 x86_64 gnulinux ' config_args='-de -Dprefix=/home/leon/perl5/perlbrew/perls/perl-5.20.0 -Dusethreads -Duseshrplib' hint=recommended, useposix=true, d_sigaction=define useithreads=define, usemultiplicity=define use64bitint=define, use64bitall=define, uselongdouble=undef usemymalloc=n, bincompat5005=undef Compiler: cc='cc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -fwrapv -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64', optimize='-O2', cppflags='-D_REENTRANT -D_GNU_SOURCE -fwrapv -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include' ccversion='', gccversion='4.8.1', gccosandvers='' intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678 d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16 ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8 alignbytes=8, prototype=define Linker and Libraries: ld='cc', ldflags =' -fstack-protector -L/usr/local/lib' libpth=/usr/local/lib /usr/lib/gcc/x86_64-linux-gnu/4.8/include-fixed /usr/include/x86_64-linux-gnu /usr/lib /lib/x86_64-linux-gnu /lib/../lib /usr/lib/x86_64-linux-gnu /usr/lib/../lib /lib libs=-lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lpthread -lc -lgdbm_compat perllibs=-lnsl -ldl -lm -lcrypt -lutil -lpthread -lc libc=, so=so, useshrplib=true, libperl=libperl.so gnulibc_version='2.17' Dynamic Linking: dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E -Wl,-rpath,/home/leon/perl5/perlbrew/perls/perl-5.20.0/lib/5.20.0/x86_64-linux-thread-multi/CORE' cccdlflags='-fPIC', lddlflags='-shared -O2 -L/usr/local/lib -fstack-protector'
@INC for perl 5.20.0:
/home/leon/perl5/perlbrew/perls/perl-5.20.0/lib/site_perl/5.20.0/x86_64-linux-thread-multi /home/leon/perl5/perlbrew/perls/perl-5.20.0/lib/site_perl/5.20.0
/home/leon/perl5/perlbrew/perls/perl-5.20.0/lib/5.20.0/x86_64-linux-thread-multi /home/leon/perl5/perlbrew/perls/perl-5.20.0/lib/5.20.0 .
Environment for perl 5.20.0: HOME=/home/leon LANG=en_US.UTF-8 LANGUAGE=en_US:en LC_ADDRESS=en_US.UTF-8 LC_IDENTIFICATION=en_US.UTF-8 LC_MEASUREMENT=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 LC_NAME=en_US.UTF-8 LC_NUMERIC=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_TELEPHONE=en_US.UTF-8 LC_TIME=en_US.UTF-8 LD_LIBRARY_PATH (unset) LOGDIR (unset)
PATH=/home/leon/perl5/perlbrew/bin:/home/leon/perl5/perlbrew/perls/perl-5.20.0/bin:/home/leon/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games PERLBREW_HOME=/home/leon/.perlbrew
PERLBREW_PATH=/home/leon/perl5/perlbrew/bin:/home/leon/perl5/perlbrew/perls/perl-5.20.0/bin PERLBREW_PERL=perl-5.20.0 PERLBREW_ROOT=/home/leon/perl5/perlbrew PERLBREW_VERSION=0.25 PERL_BADLANG (unset) SHELL=/bin/bash
From @jkeenan
Leon, I didn't get quite the same results as you did. Please see file attached which reports results on two Linux x86_64 machines and one Linux i686 machine.
Thank you very much. Jim Keenan
From @jkeenan
# current laptop
$ which perl
/home/jkeenan/perl5/perlbrew/perls/perl-5.20.0/bin/perl
$ uname -a
Linux zareason 3.13.0-27-generic #50-Ubuntu SMP Thu May 15 18:06:16 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
$ perl 121977-leont-cow.pl 300000
          Rate lvalue    ref simple
lvalue 12815/s     --    -2%    -2%
ref    13049/s     2%     --    -0%
simple 13106/s     2%     0%     --
          Rate simple lvalue    ref
simple  4175/s     --   -65%   -65%
lvalue 12029/s   188%     --    -0%
ref    12068/s   189%     0%     --
# dromedary blead
This is perl 5, version 21, subversion 1 (v5.21.1 (v5.21.0-69-g0fadf2d)) built for x86_64-linux
$ ./perl -Ilib ../p5p/121977-leont-cow.pl 300000
         Rate lvalue    ref simple
lvalue 8460/s     --    -1%    -1%
ref    8535/s     1%     --    -0%
simple 8574/s     1%     0%     --
         Rate simple lvalue    ref
simple 4228/s     --   -49%   -49%
lvalue 8299/s    96%     --    -1%
ref    8366/s    98%     1%     --
# older Linode
$ perl -v | head -3
This is perl 5, version 20, subversion 0 (v5.20.0) built for i686-linux
$ uname -a
Linux li11-226 2.6.18.8-linode22 #1 SMP Tue Nov 10 16:12:12 UTC 2009 i686 GNU/Linux
$ perl 121977-leont-cow.pl 100000 # took nearly 6 minutes!
         Rate    ref lvalue simple
ref    1253/s     --   -87%   -87%
lvalue 9524/s   660%     --    -2%
simple 9747/s   678%     2%     --
         Rate simple lvalue    ref
simple 1085/s     --   -12%   -13%
lvalue 1227/s    13%     --    -2%
ref    1251/s    15%     2%     --
The RT System itself - Status changed from 'new' to 'open'
From @Leont
On Thu, May 29, 2014 at 12:46 AM, James E Keenan via RT < perlbug-followup@perl.org> wrote:
# current laptop
$ which perl
/home/jkeenan/perl5/perlbrew/perls/perl-5.20.0/bin/perl
$ uname -a
Linux zareason 3.13.0-27-generic #50-Ubuntu SMP Thu May 15 18:06:16 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
$ perl 121977-leont-cow.pl 300000
          Rate lvalue    ref simple
lvalue 12815/s     --    -2%    -2%
ref    13049/s     2%     --    -0%
simple 13106/s     2%     0%     --
          Rate simple lvalue    ref
simple  4175/s     --   -65%   -65%
lvalue 12029/s   188%     --    -0%
ref    12068/s   189%     0%     --
# dromedary blead
This is perl 5, version 21, subversion 1 (v5.21.1 (v5.21.0-69-g0fadf2d)) built for x86_64-linux
$ ./perl -Ilib ../p5p/121977-leont-cow.pl 300000
         Rate lvalue    ref simple
lvalue 8460/s     --    -1%    -1%
ref    8535/s     1%     --    -0%
simple 8574/s     1%     0%     --
         Rate simple lvalue    ref
simple 4228/s     --   -49%   -49%
lvalue 8299/s    96%     --    -1%
ref    8366/s    98%     1%     --
Those are the results I was expecting.
# older Linode
$ perl -v | head -3
This is perl 5, version 20, subversion 0 (v5.20.0) built for i686-linux
$ uname -a
Linux li11-226 2.6.18.8-linode22 #1 SMP Tue Nov 10 16:12:12 UTC 2009 i686 GNU/Linux
$ perl 121977-leont-cow.pl 100000 # took nearly 6 minutes!
         Rate    ref lvalue simple
ref    1253/s     --   -87%   -87%
lvalue 9524/s   660%     --    -2%
simple 9747/s   678%     2%     --
         Rate simple lvalue    ref
simple 1085/s     --   -12%   -13%
lvalue 1227/s    13%     --    -2%
ref    1251/s    15%     2%     --
I have no idea what's going on there.
Leon
From @iabyn
On Thu, May 29, 2014 at 12:22:42PM +0200, Leon Timmermans wrote:
This is perl 5, version 21, subversion 1 (v5.21.1 (v5.21.0-69-g0fadf2d)) built for x86_64-linux
$ ./perl -Ilib ../p5p/121977-leont-cow.pl 300000
         Rate lvalue    ref simple
lvalue 8460/s     --    -1%    -1%
ref    8535/s     1%     --    -0%
simple 8574/s     1%     0%     --
         Rate simple lvalue    ref
simple 4228/s     --   -49%   -49%
lvalue 8299/s    96%     --    -1%
ref    8366/s    98%     1%     --
Those are the results I was expecting.
i.e. that although COW has made some things faster in 5.20.0 compared with 5.18, the 'simple' case hasn't seen the speedup seen by the other cases.
The commit below fixes the proximate cause. However, there were three things interacting with each other that together caused the issue. My commit fixes one of the 3 issues, and is enough to boost performance for this use case. Two other issues that I have not yet addressed are:
('x' x 1_000_000) is constant folded at compile time, and the COW code in sv_setsv_flags() is failing to do COW on something like
$buf = 'x' x 1_000_000;
and is copying instead. I think FC did some work on making COW work with RO values, so perhaps this is something that should work. Perhaps string constants need to be marked as COW (with RC==0) before making them read-only at compile time???
The second issue is that, to work around the problem with readline allocating a large buffer, which then got COWed and 'donated' in something like
push @a, $_ while <>;
we added a heuristic along the lines of 'copy rather than COW' if SvCUR * A < SvLEN for some constant factor A. The trouble is, this is clashing with sv_grow()'s
SvLEN = SvCUR * B;
for some fudge factor B (i.e. over-allocate when growing the buffer).
If B > A, we end up creating strings that can't be COWed. So we probably need to harmonise the constants involved in A and B.
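To make the clash between the two constants concrete, here is a toy model in plain Perl. It is not the code in sv.c; the function names skips_cow and grown_len and the numeric values chosen for A and B are purely hypothetical and serve only to illustrate the inequality described above.

```perl
use strict;
use warnings;

# Toy model only: A and B stand for the unnamed constants in the message above.

# The heuristic: "copy rather than COW" if SvCUR * A < SvLEN.
sub skips_cow {
    my ($cur, $len, $A) = @_;
    return $cur * $A < $len ? 1 : 0;
}

# Over-allocation when growing the buffer: SvLEN = SvCUR * B.
sub grown_len {
    my ($cur, $B) = @_;
    return int($cur * $B);
}

my $cur = 1000;
for my $case ([1.25, 1.5], [2, 1.25]) {    # hypothetical [A, B] pairs
    my ($A, $B) = @$case;
    my $len = grown_len($cur, $B);
    printf "A=%s B=%s CUR=%d LEN=%d -> %s\n", $A, $B, $cur, $len,
        skips_cow($cur, $len, $A) ? 'copy' : 'COW';
}
```

With B larger than A, a freshly grown buffer already fails the check and is copied; with A at least as large as B, it stays eligible for COW.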
Anyway, here's my commit:
commit a7ab896004fe7cc32eeddadf760d0829e9fed13d
Author:     David Mitchell <davem@iabyn.com>
AuthorDate: Thu Jun 5 15:03:32 2014 +0100
Commit:     David Mitchell <davem@iabyn.com>
CommitDate: Thu Jun 5 15:03:32 2014 +0100
when unCOWing a string, set SvCUR to 0
When a COW string is unCOWed, as well as setting SvPVX to NULL and SvLEN
to 0, set SvCUR to 0 too.
This is to avoid a later SvGROW on the same SV using the old SvCUR() value
to calculate a roundup to the buffer size.
Consider the following code:
use Devel::Peek;
for (1..3) {
    my $t;
    my $s = 'x' x 100;
    $t = $s;
    Dump $s;
}
Looking at the LEN line of the Dump output, we got on 5.20.0:
LEN = 102
LEN = 135
LEN = 135
and after this commit,
LEN = 102
LEN = 102
LEN = 102
As well as wasting space, this extra LEN was then triggering the 'skip COW
if LEN >> CUR' mechanism, causing extra copies. See:
[perl #121977] COWification seems expensive in PADMY variables
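For anyone who wants to watch CUR and LEN without reading full Dump output, the same loop can be sketched with the core B module, whose B::PV objects expose CUR and LEN methods. The helper name cur_len and the @lens array are made up for this illustration, and the numbers printed depend on the perl version and build.

```perl
use strict;
use warnings;
use B ();

# Return (CUR, LEN) for the scalar behind the given reference.
sub cur_len {
    my ($ref) = @_;
    my $sv = B::svref_2object($ref);
    return ($sv->CUR, $sv->LEN);
}

my @lens;
for (1..3) {
    my $t;
    my $s = 'x' x 100;
    $t = $s;                        # may share the buffer via COW
    my ($cur, $len) = cur_len(\$s);
    push @lens, $len;
    print "CUR = $cur  LEN = $len\n";
}
```

If LEN grows between the first and later iterations, the old SvCUR value is still influencing the roundup; a constant LEN across iterations matches the behaviour described for the fixed version.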
-- A power surge on the Bridge is rapidly and correctly diagnosed as a faulty capacitor by the highly-trained and competent engineering staff. -- Things That Never Happen in "Star Trek" #9
From @mauke
This ticket is listed in perl5201delta. Is there still work happening here?
From @iabyn
On Fri, Feb 26, 2016 at 10:34:50AM -0800, l.mai@web.de via RT wrote:
This ticket is listed in perl5201delta. Is there still work happening here?
The other two issues I mentioned still appear to be issues, so this ticket needs to remain open.
-- Any [programming] language that doesn't occasionally surprise the novice will pay for it by continually surprising the expert. -- Larry Wall
https://github.com/Perl/perl5/pull/20595 was intended to fix one of the outstanding issues in this ticket. (Constant-folded strings not being COWed.)
Which means the remaining issue keeping this ticket open is:
... to work around the problem with readline allocating a large buffer, which then got COWed and 'donated' in something like
push @a, $_ while <>;
we added a heuristic along the lines of 'copy rather than COW' if SvCUR * A < SvLEN for some constant factor A. The trouble is, this is clashing with sv_grow()'s
SvLEN = SvCUR * B;
for some fudge factor B (i.e. over-allocate when growing the buffer).
If B > A, we end up creating strings that can't be COWed. So we probably need to harmonise the constants involved in A and B.