suboptimal memory usage of repetition operator

Open p5pRT opened this issue 11 years ago • 22 comments

Migrated from rt.perl.org#121780 (status was 'open')

Searchable as RT121780$

p5pRT avatar May 01 '14 23:05 p5pRT

From [email protected]

suboptimal memory usage of repetition operator

=== my $s1; $s1 .= "x" for (1..200*1024*1024); print "ok\n"; sleep 1000; $s1 .= '';

Uses 200Mb (during sleep())

=== my $s1; $s1 = "x" x (200*1024*1024); print "ok\n"; sleep 1000; $s1 .= '';

Uses 400Mb (during sleep())

reproduced in 5.14 and 5.19
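A minimal sketch for comparing the two forms within one process. It assumes a Linux system where /proc/self/status is readable; `rss_kb` is a hypothetical helper, not part of Perl. The size is reduced to 4 MB so it runs quickly. The numbers it prints depend on the perl version and the malloc in use.

```perl
use strict;
use warnings;

# Hypothetical helper: resident set size in kB, read from /proc/self/status.
# Returns undef where that file is unavailable (non-Linux systems).
sub rss_kb {
    open my $fh, '<', '/proc/self/status' or return;
    while (my $line = <$fh>) {
        return $1 if $line =~ /^VmRSS:\s+(\d+)\s+kB/;
    }
    return;
}

my $size = 4 * 1024 * 1024;    # 4 MB for a quick run; the report used 200 MB
my $char = 'x';

my $before = rss_kb();

my $appended = '';
$appended .= $char for 1 .. $size;    # grown one character at a time
my $after_append = rss_kb();

my $repeated = $char x $size;         # built with the repetition operator
my $after_repeat = rss_kb();

if (defined $before) {
    printf "append loop: +%d kB\n", $after_append - $before;
    printf "repetition:  +%d kB\n", $after_repeat - $after_append;
}
else {
    print "RSS not available on this platform\n";
}
```

Measuring inside the process avoids the need for `sleep 1000` and an external tool, at the cost of being Linux-specific.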

perl -V Summary of my perl5 (revision 5 version 14 subversion 2) configuration​:

  Platform​:   osname=linux, osvers=2.6.42-37-generic, archname=x86_64-linux-gnu-thread-multi   uname='linux panlong 2.6.42-37-generic #58-ubuntu smp thu jan 24 15​:28​:10 utc 2013 x86_64 x86_64 x86_64 gnulinux '   config_args='-Dusethreads -Duselargefiles -Dccflags=-DDEBIAN -Dcccdlflags=-fPIC -Darchname=x86_64-linux-gnu -Dprefix=/usr -Dprivlib=/usr/share/perl/5.14 -Darchlib=/usr/lib/perl/5.14 -Dvendorprefix=/usr -Dvendorlib=/usr/share/perl5 -Dvendorarch=/usr/lib/perl5 -Dsiteprefix=/usr/local -Dsitelib=/usr/local/share/perl/5.14.2 -Dsitearch=/usr/local/lib/perl/5.14.2 -Dman1dir=/usr/share/man/man1 -Dman3dir=/usr/share/man/man3 -Dsiteman1dir=/usr/local/man/man1 -Dsiteman3dir=/usr/local/man/man3 -Duse64bitint -Dman1ext=1 -Dman3ext=3perl -Dpager=/usr/bin/sensible-pager -Uafs -Ud_csh -Ud_ualarm -Uusesfio -Uusenm -Ui_libutil -DDEBUGGING=-g -Doptimize=-O2 -Duseshrplib -Dlibperl=libperl.so.5.14.2 -des'   hint=recommended, useposix=true, d_sigaction=define   useithreads=define, usemultiplicity=define   useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef   use64bitint=define, use64bitall=define, uselongdouble=undef   usemymalloc=n, bincompat5005=undef   Compiler​:   cc='cc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DDEBIAN -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',   optimize='-O2 -g',   cppflags='-D_REENTRANT -D_GNU_SOURCE -DDEBIAN -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include'   ccversion='', gccversion='4.6.3', gccosandvers=''   intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678   d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16   ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8   alignbytes=8, prototype=define   Linker and Libraries​:   ld='cc', ldflags =' -fstack-protector -L/usr/local/lib'   libpth=/usr/local/lib /lib/x86_64-linux-gnu /lib/../lib /usr/lib/x86_64-linux-gnu /usr/lib/../lib /lib /usr/lib   
libs=-lgdbm -lgdbm_compat -ldb -ldl -lm -lpthread -lc -lcrypt   perllibs=-ldl -lm -lpthread -lc -lcrypt   libc=, so=so, useshrplib=true, libperl=libperl.so.5.14.2   gnulibc_version='2.15'   Dynamic Linking​:   dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E'   cccdlflags='-fPIC', lddlflags='-shared -O2 -g -L/usr/local/lib -fstack-protector'

Characteristics of this binary (from libperl)​:   Compile-time options​: MULTIPLICITY PERL_DONT_CREATE_GVSV   PERL_IMPLICIT_CONTEXT PERL_MALLOC_WRAP   PERL_PRESERVE_IVUV USE_64_BIT_ALL USE_64_BIT_INT   USE_ITHREADS USE_LARGE_FILES USE_PERLIO USE_PERL_ATOF   USE_REENTRANT_API   Locally applied patches​:   DEBPKG​:debian/arm_thread_stress_timeout - http​://bugs.debian.org/501970 Raise the timeout of ext/threads/shared/t/stress.t to accommodate slower build hosts   DEBPKG​:debian/cpan_definstalldirs - Provide a sensible INSTALLDIRS default for modules installed from CPAN.   DEBPKG​:debian/db_file_ver - http​://bugs.debian.org/340047 Remove overly restrictive DB_File version check.   DEBPKG​:debian/doc_info - Replace generic man(1) instructions with Debian-specific information.   DEBPKG​:debian/enc2xs_inc - http​://bugs.debian.org/290336 Tweak enc2xs to follow symlinks and ignore missing @​INC directories.   DEBPKG​:debian/libnet_config_path - Set location of libnet.cfg to /etc/perl/Net as /usr may not be writable.   DEBPKG​:debian/m68k_thread_stress - http​://bugs.debian.org/517938 http​://bugs.debian.org/495826 Disable some threads tests on m68k for now due to missing TLS.   DEBPKG​:debian/mod_paths - Tweak @​INC ordering for Debian   DEBPKG​:debian/module_build_man_extensions - http​://bugs.debian.org/479460 Adjust Module​::Build manual page extensions for the Debian Perl policy   DEBPKG​:debian/prune_libs - http​://bugs.debian.org/128355 Prune the list of libraries wanted to what we actually need.   DEBPKG​:fixes/net_smtp_docs - [rt.cpan.org #36038] http​://bugs.debian.org/100195 Document the Net​::SMTP 'Port' option   DEBPKG​:debian/perlivp - http​://bugs.debian.org/510895 Make perlivp skip include directories in /usr/local   DEBPKG​:debian/disable-zlib-bundling - Disable zlib bundling in Compress​::Raw​::Zlib   DEBPKG​:debian/cpanplus_definstalldirs - http​://bugs.debian.org/533707 Configure CPANPLUS to use the site directories by default.   
DEBPKG​:debian/cpanplus_config_path - Save local versions of CPANPLUS​::Config​::System into /etc/perl.   DEBPKG​:debian/deprecate-with-apt - http​://bugs.debian.org/580034 Point users to Debian packages of deprecated core modules   DEBPKG​:fixes/hurd-ccflags - [a190e64] http​://bugs.debian.org/587901 [perl #92244] Make hints/gnu.sh append to $ccflags rather than overriding them   DEBPKG​:debian/squelch-locale-warnings - http​://bugs.debian.org/508764 Squelch locale warnings in Debian package maintainer scripts   DEBPKG​:debian/skip-upstream-git-tests - Skip tests specific to the upstream Git repository   DEBPKG​:fixes/extutils-cbuilder-cflags - [011e8fb] http​://bugs.debian.org/624460 [perl #89478] Append CFLAGS and LDFLAGS to their Config.pm counterparts in EU​::CBuilder   DEBPKG​:fixes/module-build-home-directory - http​://bugs.debian.org/624850 [rt.cpan.org #67893] Fix failing tilde test when run under a UID without a passwd entry   DEBPKG​:debian/patchlevel - http​://bugs.debian.org/567489 List packaged patches for 5.14.2-6ubuntu2.4 in patchlevel.h   DEBPKG​:fixes/h2ph-multiarch - [e7ec705] http​://bugs.debian.org/625808 [perl #90122] Make h2ph correctly search gcc include directories   DEBPKG​:fixes/index-tainting - [3b36395] http​://bugs.debian.org/291450 [perl #64804] RT 64804​: tainting with index() of a constant   DEBPKG​:debian/skip-kfreebsd-crash - http​://bugs.debian.org/628493 [perl #96272] Skip a crashing test case in t/op/threads.t on GNU/kFreeBSD   DEBPKG​:fixes/document_makemaker_ccflags - http​://bugs.debian.org/628522 [rt.cpan.org #68613] Document that CCFLAGS should include $Config{ccflags}   DEBPKG​:fixes/sys-syslog-socket-timeout-kfreebsd.patch - http​://bugs.debian.org/627821 [rt.cpan.org #69997] Use a socket timeout on GNU/kFreeBSD to catch ICMP port unreachable messages   DEBPKG​:fixes/hurd-hints - http​://bugs.debian.org/636609 Improve general GNU hints, needed for GNU/Hurd.   
DEBPKG​:fixes/pod_fixes - [7698aed] http​://bugs.debian.org/637816 Fix typos in several pod/perl*.pod files   DEBPKG​:debian/find_html2text - http​://bugs.debian.org/640479 Configure CPAN​::Distribution with correct name of html2text   DEBPKG​:fixes/digest_eval_hole - http​://bugs.debian.org/644108 Close the eval "require $module" security hole in Digest->new($algorithm)   DEBPKG​:fixes/hurd-ndbm - [f0d0a20] [perl #102680] http​://bugs.debian.org/645989 Add GNU/Hurd hints for NDBM_File   DEBPKG​:fixes/sysconf.t-posix - [8040185] [perl #102888] http​://bugs.debian.org/646016 Fix hang in ext/POSIX/t/sysconf.t on GNU/Hurd   DEBPKG​:fixes/hurd-largefile - [1fda587] [perl #103014] http​://bugs.debian.org/645790 enable LFS on GNU/Hurd   DEBPKG​:debian/hurd_test_todo_syslog - http​://bugs.debian.org/650093 Disable failing GNU/Hurd tests in cpan/Sys-Syslog/t/syslog.t   DEBPKG​:fixes/hurd_skip_itimer_virtual - [rt.cpan.org #72754] http​://bugs.debian.org/650094 Skip interval timer tests in Time​::HiRes on GNU/Hurd   DEBPKG​:debian/hurd_test_skip_socketpair - http​://bugs.debian.org/650186 Disable failing GNU/Hurd tests ext/Socket/t/socketpair.t   DEBPKG​:debian/hurd_test_skip_sigdispatch - http​://bugs.debian.org/650188 Disable failing GNU/Hurd tests op/sigdispatch.t   DEBPKG​:debian/hurd_test_skip_stack - http​://bugs.debian.org/650175 Disable failing GNU/Hurd tests dist/threads/t/stack.t   DEBPKG​:debian/hurd_test_skip_recv - http​://bugs.debian.org/650095 Disable failing GNU/Hurd tests cpan/autodie/t/recv.t   DEBPKG​:debian/hurd_test_skip_libc - http​://bugs.debian.org/650097 Disable failing GNU/Hurd tests dist/threads/t/libc.t   DEBPKG​:debian/hurd_test_skip_pipe - http​://bugs.debian.org/650187 Disable failing GNU/Hurd tests io/pipe.t   DEBPKG​:debian/hurd_test_skip_io_pipe - http​://bugs.debian.org/650096 Disable failing GNU/Hurd tests dist/IO/t/io_pipe.t   DEBPKG​:fixes/CVE-2012-5195 - avoid calling memset with a negative count   DEBPKG​:fixes/CVE-2012-5526 - [PATCH 
1/4] CR escaping for P3P header   DEBPKG​:CVE-2013-1667.patch - [PATCH] Prevent premature hsplit() calls, and only trigger REHASH after hsplit()   DEBPKG​:CVE-2012-6329.patch - http​://bugs.debian.org/cgi-bin/bugreport.cgi?bug=695224 [1735f6f] fix arbitrary command execution via _compile function in Maketext.pm   Built under linux   Compiled at Feb 4 2014 23​:11​:19   %ENV​:   PERLBREW_BASHRC_VERSION="0.67"   PERLBREW_HOME="/home/vse/.perlbrew"   PERLBREW_MANPATH=""   PERLBREW_PATH="/home/perlbrew/bin"   PERLBREW_ROOT="/home/perlbrew"   PERLBREW_VERSION="0.67"   @​INC​:   /etc/perl   /usr/local/lib/perl/5.14.2   /usr/local/share/perl/5.14.2   /usr/lib/perl5   /usr/share/perl5   /usr/lib/perl/5.14   /usr/share/perl/5.14   /usr/local/lib/site_perl   .

p5pRT avatar May 01 '14 23:05 p5pRT

From @druud62

On 2014-05-02 01​:23, Виктор Ефимов wrote​:

my $s1; $s1 = "x" x (200*1024*1024); print "ok\n"; sleep 1000; $s1 .= '';

Uses 400Mb (during sleep())

reproduced in 5.14 and 5.19

This "similar thing" uses 200 MB in 5.18, but 100 MB in 5.19​:

perl -wE'
  open my $fh, "<", "100_MB.bin" or die $!;
  my $s = do { local $/; <$fh> };
  sleep 1000;
' &

In 5.18 it is better coded like​:

my $s; { local $/; $s = <$fh> }
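The two slurp idioms side by side, as a self-contained sketch. The temporary file stands in for 100_MB.bin and is only 1 MB; the variable names are illustrative. The sketch shows that both forms read the same data. It does not measure the memory difference described above.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a small stand-in for 100_MB.bin (hypothetical test data).
my $bytes = 1024 * 1024;
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
binmode $tmp_fh;
print {$tmp_fh} "\0" x $bytes;
close $tmp_fh or die "close: $!";

# Idiom 1: slurp inside a do block; the block's value is assigned to $s.
open my $fh1, '<:raw', $path or die "open: $!";
my $s = do { local $/; <$fh1> };
close $fh1;

# Idiom 2: declare first, then assign inside a bare block.
# This is the form reported above to use less memory in 5.18.
open my $fh2, '<:raw', $path or die "open: $!";
my $t;
{ local $/; $t = <$fh2> }
close $fh2;

printf "do-block slurp: %d bytes, in-place slurp: %d bytes\n",
    length($s), length($t);
```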

-- Ruud

p5pRT avatar May 02 '14 08:05 p5pRT

The RT System itself - Status changed from 'new' to 'open'

p5pRT avatar May 02 '14 08:05 p5pRT

From @abigail

On Thu, May 01, 2014 at 04​:23​:33PM -0700, Виктор Ефимов wrote​:

# New Ticket Created by Виктор Ефимов # Please include the string​: [perl #121780] # in the subject line of all future correspondence about this issue. # <URL​: https://rt-archive.perl.org/perl5/Ticket/Display.html?id=121780 >

suboptimal memory usage of repetition operator

=== my $s1; $s1 .= "x" for (1..200*1024*1024); print "ok\n"; sleep 1000; $s1 .= '';

Uses 200Mb (during sleep())

=== my $s1; $s1 = "x" x (200*1024*1024); print "ok\n"; sleep 1000; $s1 .= '';

Uses 400Mb (during sleep())

That's because the latter contains "x" x (200*1024*1024) twice. It exists as a RHS value (calculated at compile time), which is then copied into $s1. Perl doesn't throw away the value, as it doesn't know whether it will be needed again.

The former just has "x" on the RHS, and that gets reused 200*1024*1024 times.

You can reduce the memory usage of the latter by forcing Perl to release the memory of the expression as soon as it's done, by placing it in a string eval. But you'll pay if you execute it more than once in the lifetime of the process:

  my $s1;
  $s1 = eval '"x" x (200*1024*1024)';
  print "ok\n";
  sleep 1000;
  $s1 .= '';
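A reduced version of the same idea that can be run quickly, as a sketch only. The size is cut to 1 MB and the count is kept in a variable (`$size`, an illustrative name). The comments restate the explanation given in this message and are not a description verified against the perl source.

```perl
use strict;
use warnings;

my $size = 1024 * 1024;    # 1 MB for a quick demo; the ticket used 200 MB

# Plain form: per the explanation above, the operator's result is kept
# by the enclosing code and a copy is placed in $direct.
my $direct = "x" x $size;

# String-eval form: the expression is compiled and run inside the eval,
# and its compiled code is freed once the eval returns.
my $via_eval = eval q{ "x" x $size };
die "eval failed: $@" if $@;

print "same contents\n" if $direct eq $via_eval;
```

The string eval is recompiled on every execution, which is the cost mentioned above for code that runs more than once.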

Abigail

p5pRT avatar May 02 '14 08:05 p5pRT

From [email protected]

2014-05-02 12​:38 GMT+04​:00 Abigail <abigail@​abigail.be>​:

That's because the latter contains "x" x (200*1024*1024) twice. It exists as a RHS value (calculated at compile time), which is then

At compile time? But the same thing happens at runtime (no constants):

my $s1; my $size = 200*1024*1024; my $char = "x"; $s1 = $char x $size; print "ok\n"; sleep 1000;

p5pRT avatar May 02 '14 08:05 p5pRT

From @abigail

On Fri, May 02, 2014 at 12​:45​:05PM +0400, Victor Efimov wrote​:

2014-05-02 12​:38 GMT+04​:00 Abigail <abigail@​abigail.be>​:

That's because the latter contains "x" x (200*1024*1024) twice. It exists as a RHS value (calculated at compile time), which is then

At compile time? But the same thing happens at runtime (no constants):

my $s1; my $size = 200*1024*1024; my $char = "x"; $s1 = $char x $size; print "ok\n"; sleep 1000;

Same thing happens, except it now does it at run time. It calculates the RHS, keeps the value, and puts a copy of the value in $s1.

Abigail

p5pRT avatar May 02 '14 08:05 p5pRT

From [email protected]

Indeed. This is _not_ a case where the RHS is copied to the LHS and then immediately discarded (with perl merely not releasing the memory back to the OS).

=== my $size = 200*1024*1024; my $char = "x"; my $s1; $s1 = $char x $size; my $s2; $s2 = $char x $size; print "ok\n";

(800 Mb)

Instead, the RHS of each statement is saved for later re-use (though I don't understand why that is needed or how it can be reused).

And that should take 600Mb then, but it takes 800 too​:

=== my $size = 200*1024*1024; my $char = "x"; {   my $s1;   $s1 = $char x $size; } {   my $s2;   $s2 = $char x $size; }

print "ok\n"; sleep 1000;

===

p5pRT avatar May 02 '14 09:05 p5pRT

From @abigail

On Fri, May 02, 2014 at 01​:09​:18PM +0400, Victor Efimov wrote​:

Indeed. This is _not_ a case where the RHS is copied to the LHS and then immediately discarded (with perl merely not releasing the memory back to the OS).

=== my $size = 200*1024*1024; my $char = "x"; my $s1; $s1 = $char x $size; my $s2; $s2 = $char x $size; print "ok\n";

(800 Mb)

Instead, the RHS of each statement is saved for later re-use (though I don't understand why that is needed or how it can be reused).

It keeps the internal structures around, so it later doesn't have to build them again.

And that should take 600Mb then, but it takes 800 too​:

=== my $size = 200*1024*1024; my $char = "x"; { my $s1; $s1 = $char x $size; } { my $s2; $s2 = $char x $size; }

print "ok\n"; sleep 1000;

Nothing is reused here, you end up with two pairs of two copies​: the first $char x $size, and $s1, and the second $char x $size and $s2.

This, however, will take about 600Mb​:

  my $size = 200*1024*1024;
  my $char = "x";
  my @s;
  for (1 .. 2) {
      push @s => $char x $size;
      print "ok\n";
  }
  sleep 1000;

Here we have three copies​: $char x $size, $s [0] and $s [1].

Abigail

p5pRT avatar May 02 '14 09:05 p5pRT

From [email protected]

2014-05-02 13​:23 GMT+04​:00 Abigail <abigail@​abigail.be>​:

Nothing is reused here, you end up with two pairs of two copies​: the first $char x $size, and $s1, and the second $char x $size and $s2.

But isn't that a memory leak? $s1 goes out of scope and should be discarded, and its memory re-used.

p5pRT avatar May 02 '14 09:05 p5pRT

From @Leont

On Fri, May 2, 2014 at 10​:45 AM, Victor Efimov <efimov@​reg.ru> wrote​:

2014-05-02 12​:38 GMT+04​:00 Abigail <abigail@​abigail.be>​:

That's because the latter contains "x" x (200*1024*1024) twice. It exists as a RHS value (calculated at compile time), which is then

At compile time? But the same thing happens at runtime (no constants):

my $s1; my $size = 200*1024*1024; my $char = "x"; $s1 = $char x $size; print "ok\n"; sleep 1000;

You are correct that that is a bug. It seems PADTMPs were not treated as temporaries, while TEMP SVs were (as is evidenced by the eval in Abigail's example). Fortunately, it seems Father C has already fixed this in 9ffd39ab7.

Leon

p5pRT avatar May 02 '14 11:05 p5pRT

From @iabyn

On Fri, May 02, 2014 at 10​:38​:16AM +0200, Abigail wrote​:

That's because the latter contains "x" x (200*1024*1024) twice. It exists as a RHS value (calculated at compile time),

I rather wonder whether we should not in fact be constant-folding the repeat operator. For example

  if ($unlikely_condition) {
      $x = 'x' x 200_000_000;
  }
  else {
      $x = '';
  }

will consume 200Mb even if that branch is never taken.
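One way to check whether a given repetition is folded is to deparse the compiled code. The sketch below uses the core B::Deparse module on a deliberately tiny expression. The variable names and the `/ababab/` check are illustrative, and the result for the constant case may differ between perl versions.

```perl
use strict;
use warnings;
use B::Deparse;

my $deparse = B::Deparse->new;

# A small constant repetition, so any folding is cheap to demonstrate.
my $constant_form =
    $deparse->coderef2text(sub { my $x = 'ab' x 3; return $x });

# The same expression with the count held in a variable.
my $count = 3;
my $variable_form =
    $deparse->coderef2text(sub { my $x = 'ab' x $count; return $x });

for my $pair ([constant => $constant_form], [variable => $variable_form]) {
    my ($label, $text) = @$pair;
    my $folded = $text =~ /ababab/
        ? 'folded at compile time'
        : 'left as a repeat op';
    print "$label count: $folded\n";
}
```

If the constant form deparses to the literal result, the full string already exists in the compiled code, whether or not that code is ever executed.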

-- O Unicef Clearasil! Gibberish and Drivel!   -- "Bored of the Rings"

p5pRT avatar May 02 '14 14:05 p5pRT

From @abigail

On Fri, May 02, 2014 at 03​:22​:17PM +0100, Dave Mitchell wrote​:

On Fri, May 02, 2014 at 10​:38​:16AM +0200, Abigail wrote​:

That's because the latter contains "x" x (200*1024*1024) twice. It exists as a RHS value (calculated at compile time),

I rather wonder whether we should not in fact be constant-folding the repeat operator. For example

if ($unlikely_condition) {
    $x = 'x' x 200_000_000;
}
else {
    $x = '';
}

will consume 200Mb even if that branch is never taken.

Yeah, that's what I meant by "calculated at compile time".

I'm very aware of the effects of constant-folding, once bringing down the website of $WORK, because it ran out of memory due to code that wasn't even executed (but constant folded).

Abigail

p5pRT avatar May 02 '14 14:05 p5pRT

From @lizmat

On 02 May 2014, at 16​:27, Abigail <abigail@​abigail.be> wrote​:

On Fri, May 02, 2014 at 03​:22​:17PM +0100, Dave Mitchell wrote​:

On Fri, May 02, 2014 at 10​:38​:16AM +0200, Abigail wrote​:

That's because the latter contains "x" x (200*1024*1024) twice. It exists as a RHS value (calculated at compile time),

I rather wonder whether we should not in fact be constant-folding the repeat operator. For example

if ($unlikely_condition) { $x = 'x' x 200_000_000; } else { $x = ''; }

will consume 200Mb even if that branch is never taken.

Yeah, that's what I meant by "calculated at compile time".

I'm very aware of the effects of constant-folding, once bringing down the website of $WORK, because it ran out of memory due to code that wasn't even executed (but constant folded).

Yeah. Talking about side-effects :-)

Liz

p5pRT avatar May 02 '14 14:05 p5pRT

From @tsee

On 05/02/2014 04​:22 PM, Dave Mitchell wrote​:

On Fri, May 02, 2014 at 10​:38​:16AM +0200, Abigail wrote​:

That's because the latter contains "x" x (200*1024*1024) twice. It exists as a RHS value (calculated at compile time),

I rather wonder whether we should not in fact be constant-folding the repeat operator. For example

 if ($unlikely_condition) {
     $x = 'x' x 200_000_000;
 }
 else {
     $x = '';
 }

will consume 200Mb even if that branch is never taken.

I think this is a corner case. Few real programs have string constants that large and if they do, they generally do for a reason.

This is always a trade-off. Advocatus diaboli​: Should we stop allocation of scratchpads/targets because it's additional memory in a branch we never take? (Obviously, that would be a bad call.)

Constant folding is not rocket science. People generally know it's going to happen and deal with the fallout. The case that Abigail described was vastly more intricate than the above and it's the only such issue I've ever heard of.

Maybe the proper solution (in an ideal world, I mean) would be constant folding being performed at run-time if the branch is first taken (or only after it's taken N times). Which gets kind of uncomfortably close to tracing JIT compilation. (I know about all the practical issues such as not allowing modification of OP trees at run time.)
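At the user level, a similar effect can be approximated by hand: build the large value only when the branch is first taken, and cache it. This is a sketch of that workaround, not of the interpreter change discussed above. `big_string`, `$builds` and the reduced size are all illustrative.

```perl
use strict;
use warnings;

my $builds = 0;          # counts how often the big string is actually built
my $count  = 200_000;    # runtime value, so the repetition cannot be folded

# Hypothetical helper: returns a large string, building it on first use only.
{
    my $cached;
    sub big_string {
        return $cached //= do { $builds++; 'x' x $count };
    }
}

my $unlikely_condition = 0;
my $x = $unlikely_condition ? big_string() : '';
printf "branch not taken: built %d time(s)\n", $builds;

$x = big_string() for 1 .. 3;
printf "branch taken 3 times: built %d time(s), length %d\n",
    $builds, length($x);
```

Unlike compile-time folding, no memory is spent on the string until the branch is actually taken. Each caller still receives its own copy of the cached value.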

--Steffen

p5pRT avatar May 03 '14 09:05 p5pRT

From @avar

On Sat, May 3, 2014 at 11​:14 AM, Steffen Mueller <smueller@​cpan.org> wrote​:

The case that Abigail described was vastly more intricate than the above and it's the only such issue I've ever heard of.

Are we remembering different issues? The one I remember came down to​:

  @​cache {1 .. 999_999} = ();

I.e. just perl sizing the variable at compile-time even though it might not be used. Hardly vastly more intricate, or were you thinking about some other case?

p5pRT avatar May 03 '14 10:05 p5pRT

From @abigail

On Sat, May 03, 2014 at 12​:44​:19PM +0200, Ævar Arnfjörð Bjarmason wrote​:

On Sat, May 3, 2014 at 11​:14 AM, Steffen Mueller <smueller@​cpan.org> wrote​:

The case that Abigail described was vastly more intricate than the above and it's the only such issue I've ever heard of.

Are we remembering different issues? The one I remember came down to​:

@cache {1 .. 999_999} = ();

I.e. just perl sizing the variable at compile-time even though it might not be used. Hardly vastly more intricate, or were you thinking about some other case?

That was the one.

Abigail

p5pRT avatar May 03 '14 19:05 p5pRT

From @lizmat

On 03 May 2014, at 21​:30, Abigail <abigail@​abigail.be> wrote​:

On Sat, May 03, 2014 at 12​:44​:19PM +0200, Ævar Arnfjörð Bjarmason wrote​:

On Sat, May 3, 2014 at 11​:14 AM, Steffen Mueller <smueller@​cpan.org> wrote​:

The case that Abigail described was vastly more intricate than the above and it's the only such issue I've ever heard of.

Are we remembering different issues? The one I remember came down to​:

@​cache {1 .. 999_999} = ();

I.e. just perl sizing the variable at compile-time even though it might not be used. Hardly vastly more intricate, or were you thinking about some other case?

That was the one.

Yup, the one.

Although the hash was *not* sized at compile time​: that happened at runtime using the array created by 1..999999 at compile time. :-)

So the idea behind this was doing an optimisation at runtime, with unfortunate severe side-effects at compile time.

Liz

p5pRT avatar May 03 '14 19:05 p5pRT

From @lizmat

On 03 May 2014, at 21​:53, Elizabeth Mattijsen <liz@​dijkmat.nl> wrote​:

On 03 May 2014, at 21​:30, Abigail <abigail@​abigail.be> wrote​:

On Sat, May 03, 2014 at 12​:44​:19PM +0200, Ævar Arnfjörð Bjarmason wrote​:

On Sat, May 3, 2014 at 11​:14 AM, Steffen Mueller <smueller@​cpan.org> wrote​:

The case that Abigail described was vastly more intricate than the above and it's the only such issue I've ever heard of.

Are we remembering different issues? The one I remember came down to​:

@​cache {1 .. 999_999} = ();

I.e. just perl sizing the variable at compile-time even though it might not be used. Hardly vastly more intricate, or were you thinking about some other case?

That was the one.

Yup, the one.

Although the hash was *not* sized at compile time​: that happened at runtime using the array created by 1..999999 at compile time. :-)

So the idea behind this was doing an optimisation at runtime, with unfortunate severe side-effects at compile time.

Actually, going down memory lane, the very quick fix was​:

  eval '@cache {1 .. 999_999} = ()';

Liz

p5pRT avatar May 03 '14 20:05 p5pRT

From @perhunter

On 05/03/2014 04​:14 PM, Elizabeth Mattijsen wrote​:

Actually, going down memory lane, the very quick fix was​:

eval '@cache {1 .. 999_999} = ()';

why not a quick fix of​:

  $max = 999_999 ;
  @cache{ 1 .. $max } = () ;

that shouldn't be able to trigger any compile time 'optimizations' of the generated list.
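Both quick fixes can be tried side by side. This is a sketch with a smaller upper bound (9_999 instead of 999_999) and illustrative hash names. It only confirms that the two forms populate the same number of keys, and does not measure compile-time memory.

```perl
use strict;
use warnings;

my $max = 9_999;    # smaller than the 999_999 above, to keep the demo light

# Variable upper bound: the range cannot be expanded at compile time,
# because $max is only known at run time.
my %cache;
@cache{ 1 .. $max } = ();

# String eval, as in the earlier quick fix: the code is compiled only
# if and when this statement runs.
my %cache_eval;
eval q{ @cache_eval{ 1 .. 9_999 } = (); 1 } or die "eval failed: $@";

printf "keys: %d and %d\n", scalar(keys %cache), scalar(keys %cache_eval);
```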

uri

-- Uri Guttman - The Perl Hunter The Best Perl Jobs, The Best Perl Hackers http​://PerlHunter.com

p5pRT avatar May 03 '14 23:05 p5pRT

From @tsee

On 05/03/2014 12​:44 PM, Ævar Arnfjörð Bjarmason wrote​:

On Sat, May 3, 2014 at 11​:14 AM, Steffen Mueller <smueller@​cpan.org> wrote​:

The case that Abigail described was vastly more intricate than the above and it's the only such issue I've ever heard of.

Are we remembering different issues? The one I remember came down to​:

 @cache {1 .. 999_999} = ();

I.e. just perl sizing the variable at compile-time even though it might not be used. Hardly vastly more intricate, or were you thinking about some other case?

Then I misremembered. But again, this is just a case of "don't do that" if you ask me. And the above could rather easily be spotted.

--Steffen

p5pRT avatar May 04 '14 07:05 p5pRT

From @abigail

On Sun, May 04, 2014 at 09​:28​:48AM +0200, Steffen Mueller wrote​:

On 05/03/2014 12​:44 PM, Ævar Arnfjörð Bjarmason wrote​:

On Sat, May 3, 2014 at 11​:14 AM, Steffen Mueller <smueller@​cpan.org> wrote​:

The case that Abigail described was vastly more intricate than the above and it's the only such issue I've ever heard of.

Are we remembering different issues? The one I remember came down to​:

 @cache {1 .. 999_999} = ();

I.e. just perl sizing the variable at compile-time even though it might not be used. Hardly vastly more intricate, or were you thinking about some other case?

Then I misremembered. But again, this is just a case of "don't do that" if you ask me. And the above could rather easily be spotted.

How many Perl programmers outside of p5p have any idea what perl is doing at compile time? I'd wager most Perl programmers have no idea what constant folding is.

It's also rarely documented. Perl, the language, is documented well, in man pages, books and online. But the runtime, its optimizations, and its actions at compile time? Far less.

Now, I'm not arguing things should change. But compile time actions will keep biting people. It certainly bit me.

Abigail

p5pRT avatar May 04 '14 14:05 p5pRT

From @bulk88

On Sat May 03 02​:14​:59 2014, smueller@​cpan.org wrote​:

I think this is a corner case. Few real programs have string constants that large and if they do, they generally do for a reason.

I agree with that. If someone writes something like that, it must have some legitimate purpose. If you write something like that, make sure your process has the free memory available to do it. If you don't have the free memory, you're accepting the risk of a random process termination, and P5P can't help you with that.

This is always a trade-off. Advocatus diaboli​: Should we stop allocation of scratchpads/targets because it's additional memory in a branch we never take? (Obviously, that would be a bad call.)

Constant folding is not rocket science. People generally know it's going to happen and deal with the fallout. The case that Abigail described was vastly more intricate than the above and it's the only such issue I've ever heard of.

Maybe the proper solution (in an ideal world, I mean) would be constant folding being performed at run-time if the branch is first taken (or only after it's taken N times). Which gets kind of uncomfortably close to tracing JIT compilation. (I know about all the practical issues such as not allowing modification of OP trees at run time.)

My recommendation on this ticket is to do nothing. I do have a different idea, though: if constant folding results in an SV with a value larger than 10 MB, reinstate the original optree.

-- bulk88 ~ bulk88 at hotmail.com

p5pRT avatar May 10 '14 21:05 p5pRT