Part 1 of implementing RFC0013 (join)
General comments: even though these are static functions, they really should be registered in embed.fnc, and you should use the generated wrapper macros for accessing the sub, not explicit calls to the fully prefixed function. This will make people's lives easier if the subs need to be changed to be exported as the code gets refactored, and it saves adding the annoying aTHX_ bumf at the start of the function calls. Also, you should be providing pod for the new functions, and using the generated PERL_ARGS_ macros to validate that your arguments are correct. (For instance, they will generate the appropriate code to validate that your arguments are non-null if so specified, etc.)
Also, you should use the standard symbols perl uses internally for specifying static functions, and in modern code we normally don't use 'int' where 'bool' will do.
So basically the code looks OK, aside from some minor nits and the fact that it is not using the proper build machinery for declaring subs.
General comments from me too (I haven't reviewed the correctness of the code).
I see no point in splitting most of this into separate commits - for example, commit N adds a function S_foo(), then commit N+1 makes use of S_foo() in one place. Commits N and N+1 are semantically part of the same commit - neither is any use without the other. Splitting them just makes it harder to understand what's being changed.
Also, the new functions do not have any comments or pod describing what they are for. In particular, S_sv_has_magic_method() is badly named: magic encompasses a lot more things than just overloading - e.g. tie magic has methods too (like FETCH). Perhaps it should be called S_sv_has_overload_method() or similar instead? And the docs should make clear what the method arg is. I'm surprised that there aren't already API functions to do this sort of thing?
Also it would be useful to use the perf infrastructure to get some before/after comparison benchmarks on a variety of join test cases, to get a feel for what kind of impact this will have on performance.
It's much slower.
My test code:
./perl -e 'my @x = ("a" x 100) x 1000; for my $i (0 .. 10000) { my $z = join(",", @x); }'
Summary, minimums of three runs:
blead    #20503   Unit  Measure
148.00   234.29   ms    task-clock
2.031    3.314    G     instructions
472.4    843.1    M     branches
Making S_concat_maybe_overloaded() inline and applying the suggested flags optimization in that function improved these to 189.09 ms / 2.794 G / 663.1 M.
Raw results from perf stat:
#20503 (rebased on blead @99d24a03f4f65d155416e04274b943ae7245dc21):
238.42 msec task-clock # 0.999 CPUs utilized
1 context-switches # 0.004 K/sec
0 cpu-migrations # 0.000 K/sec
291 page-faults # 0.001 M/sec
1,101,882,222 cycles # 4.622 GHz
3,314,272,902 instructions # 3.01 insn per cycle
843,052,837 branches # 3536.025 M/sec
75,232 branch-misses # 0.01% of all branches
0.238678574 seconds time elapsed
0.238690000 seconds user
0.000000000 seconds sys
239.34 msec task-clock # 0.999 CPUs utilized
0 context-switches # 0.000 K/sec
0 cpu-migrations # 0.000 K/sec
294 page-faults # 0.001 M/sec
1,110,324,651 cycles # 4.639 GHz
3,314,323,674 instructions # 2.99 insn per cycle
843,062,902 branches # 3522.382 M/sec
92,861 branch-misses # 0.01% of all branches
0.239640817 seconds time elapsed
0.235731000 seconds user
0.003995000 seconds sys
234.29 msec task-clock # 0.999 CPUs utilized
2 context-switches # 0.009 K/sec
0 cpu-migrations # 0.000 K/sec
291 page-faults # 0.001 M/sec
1,098,648,396 cycles # 4.689 GHz
3,314,298,427 instructions # 3.02 insn per cycle
843,058,695 branches # 3598.403 M/sec
90,207 branch-misses # 0.01% of all branches
0.234566521 seconds time elapsed
0.234606000 seconds user
0.000000000 seconds sys
#20503 (-g -O2; a build with -O2 -g was slower):
237.04 msec task-clock # 0.996 CPUs utilized
1 context-switches # 0.004 K/sec
0 cpu-migrations # 0.000 K/sec
291 page-faults # 0.001 M/sec
1,095,326,580 cycles # 4.621 GHz
3,314,319,311 instructions # 3.03 insn per cycle
843,061,149 branches # 3556.615 M/sec
90,923 branch-misses # 0.01% of all branches
0.238089997 seconds time elapsed
0.238116000 seconds user
0.000000000 seconds sys
blead@99d24a03f4f65d155416e04274b943ae7245dc21:
153.64 msec task-clock # 0.998 CPUs utilized
1 context-switches # 0.007 K/sec
0 cpu-migrations # 0.000 K/sec
294 page-faults # 0.002 M/sec
694,223,108 cycles # 4.518 GHz
2,031,365,472 instructions # 2.93 insn per cycle
472,374,561 branches # 3074.462 M/sec
93,170 branch-misses # 0.02% of all branches
0.153877924 seconds time elapsed
0.153928000 seconds user
0.000000000 seconds sys
156.10 msec task-clock # 0.998 CPUs utilized
0 context-switches # 0.000 K/sec
0 cpu-migrations # 0.000 K/sec
291 page-faults # 0.002 M/sec
729,871,803 cycles # 4.676 GHz
2,031,331,946 instructions # 2.78 insn per cycle
472,367,612 branches # 3026.138 M/sec
93,624 branch-misses # 0.02% of all branches
0.156334566 seconds time elapsed
0.156392000 seconds user
0.000000000 seconds sys
148.00 msec task-clock # 0.998 CPUs utilized
0 context-switches # 0.000 K/sec
0 cpu-migrations # 0.000 K/sec
291 page-faults # 0.002 M/sec
692,560,476 cycles # 4.680 GHz
2,031,339,986 instructions # 2.93 insn per cycle
472,369,311 branches # 3191.737 M/sec
126,137 branch-misses # 0.03% of all branches
0.148277307 seconds time elapsed
0.148306000 seconds user
0.000000000 seconds sys
With inline and flags optimization:
189.09 msec task-clock # 0.999 CPUs utilized
0 context-switches # 0.000 K/sec
0 cpu-migrations # 0.000 K/sec
294 page-faults # 0.002 M/sec
880,888,042 cycles # 4.659 GHz
2,794,500,090 instructions # 3.17 insn per cycle
663,114,388 branches # 3506.911 M/sec
80,173 branch-misses # 0.01% of all branches
0.189343391 seconds time elapsed
0.189393000 seconds user
0.000000000 seconds sys
On Mon, 28 Nov 2022 at 01:48, Tony Cook @.***> wrote:
Also it would be useful to use the perf infrastructure to get some before/after comparison benchmarks on a variety of join test cases, to get a feel for what kind of impact this will have on performance.
It's much slower.
FWIW, join is not very efficient. We did a bunch of analysis on this at Booking at one point when I was there, and we found that .= is more performant than join is. Code-generating code that does something like
(((((($x.=$ar[0]).=$sep).=$ar[1]).=$sep).=$ar[2])
etc. runs faster than calling join on the same array. One would assume it would be the other way around, but it isn't.
When we looked closely at the code for join we were a bit surprised at what we found. It didn't look like what you would expect.
E.g., I'd expect that the code would loop over the array, compute how big the chunks are, then allocate a buffer big enough for all of them, and then loop again and copy them one by one into the result. join doesn't work like that. :-(
It's funny, actually, how many places perl's hyper-dynamic nature makes life ultra painful. Consider what happens if you tie the separator value for join: you can't compute its size once and then join using that string; you need to fetch it for every element, and deal with the fact that it can change size every element. It is as though perl were designed to be slow. (Sadly.)
Yves
-- perl -Mre=debug -e "/just|another|perl|hacker/"
On Mon, 28 Nov 2022 at 10:01, demerphq @.***> wrote:
Based on the benchmarks for a recent perl, whatever it was join was doing wrong when we did these benchmarks back in the day doesn't seem to be an issue anymore. Sorry for the noise.
Yves
-- perl -Mre=debug -e "/just|another|perl|hacker/"
Thank you @tonycoz for suggesting perf and an example exercise script; that helped us take a more informed approach.
@demerphq, @leonerd - I believe we've taken on board the observations and suggestions; we'd appreciate it if you could take another look.
Will you write a perldelta.pod entry in this commit, or shortly after? Either seems fine.
Done, as part of this commit.
On Sat, 14 Jan 2023, 09:48 Eric Herman, @.***> wrote:
In doop.c https://github.com/Perl/perl5/pull/20503#discussion_r1070237569:
The old code:

    } else {
        for (; items > 0; items--, mark++) {
            STRLEN len;
            const char *s = SvPV_const(*mark, len);
            sv_catpvn_flags(sv, s, len, DO_UTF8(*mark) ? SV_CATUTF8 : SV_CATBYTES);

becomes:

        for (; items > 0; items--, mark++) {
            sv = do_join_inner(sv, *mark);

Isn't the reason we assign the sv each time that the return value may not be the same sv, especially in the case where the first element didn't have an overloaded concat, but the second element does, after which the lhs gets the concat magic? Checking once would break that, right?
The lhs doesn't become magic by concatenation of something that is. That would be very weird. Strings don't turn into objects by concatenation, nor do they somehow become tied without calling the tie operator. :-) magic isn't viral like that.*
So no, checking once should purely improve performance without altering behaviour.
Yves
* ruud has proposed a new magic that might work like that, but it is no more than an idea, and I wouldn't worry about it for this patch. Assuming his idea does come to fruition, it will be a long time and a lot of patches before it gets merged, and in the meantime you should forget about it, especially as many are doubtful about the idea.
The lhs doesn't become magic by concatenation of something that is. That would be very weird. Strings don't turn into objects by concatenation, nor do they somehow become tied without calling the tie operator. :-) magic isn't viral like that.
I might be misunderstanding your point. The lhs in the S_do_join_inner function will be the result of the overall join(). It starts as the empty string, and we keep concatenating to it from left to right.
In RFC0013, we want join( $delim, @list ) to behave exactly like reduce { ( $a . $delim ) . $b } @list. If one of the arguments to join() is an object with a . overload that returns another object with an overloaded ., then the overload will propagate, and the result of join() will be an overloaded object.
This is the behavior we've put in the BoldStars class in our test script (using . on a BoldStars object produces a BoldStars object). If one of the arguments to join() is one such object, then it will "assimilate" the other arguments with each concatenation, and the result of join() will be an overloaded object.
On Mon, 16 Jan 2023 at 08:59, Philippe Bruhat (BooK) < @.***> wrote:
Hmm. I sit corrected; usually in core code when you ask for a PV (a string) you get it and copy it into something else. It never occurred to me that sv_catsv(a, b) might potentially change a. But obviously the logic you describe at the Perl level is sound. Still, my point stands really: the number of times we check the lhs should be at most the number of elements with magic on the right-hand side. We shouldn't keep checking the lhs over and over. If the rhs has magic, then we do the concatenation, and then we check if the result has magic, etc.
I mean, consider: this code could scan the entire array, check whether each element has the required magic, and if none does, do something optimal to build the result. And it would only check magic on each element once. So we know we can solve this problem without checking twice as many times as we need to.
Also, I checked: currently the delimiter appears to be stringified once, on its first use, not each time. So changing this to call concatenate instead means the stringify overload would potentially not be called.
I have to admit I find the idea that join might return an object kinda concerning. You could just as easily argue that join should "stringify each element on the stack in turn, placing delimiters between elements" (which is actually what happens now), as opposed to doing concatenation at the operator-overload level.
Now that I understand this more, I have to say I kinda question this RFC. Currently these:
join("", $thing) and "$thing"
are equivalent. This RFC would change that equivalence. That feels like a super big change. Should this maybe be feature-flagged?
I feel like there is a conceptual clash here between overloaded "concatenation" and "stringification".
Especially in implicit contexts. For instance, I checked, and "$x$y$p$q" is processed as $t = ((($x.$y).$p).$q); "$t".
But you could argue that is an implementation detail, and if we chose to, we should be able to evaluate it as $t = (($x.$y).($p.$q)); "$t".
You could also argue that since "$x" is the stringification operator, it should actually be processed as (("$x" . "$y").("$p"."$q")), or whatever order of operations we choose. We shouldn't even guarantee the order in which we choose to stringify the arguments.
I feel like maybe the concatenation overload and this RFC need to be reconsidered. The RFC imposes a very specific execution order on an operation, one that isn't strictly necessary to accomplish the operation and is the worst order possible from the point of view of parallelizing it. And it seems to break a fundamental expectation that the return from join will be a string, just as the return from a double-quoted string is a string. It also kinda breaks the mathematics of strings: shouldn't $x . "" eq "$x" eq join "", $x? With this RFC they won't be, right?
It seems like a bug that "$x$y" is parsed as $t = $x . $y; "$t" instead of "$x" . "$y".
This is the behavior we've put in the BoldStars class in our test script (using . on a BoldStars object produces a BoldStars object). If one of the arguments to join() is one such object, then it will "assimilate" the other arguments with each concatenation, and the result of join() will be an overloaded object.
Yes I got it now. See above.
Yves
-- perl -Mre=debug -e "/just|another|perl|hacker/"
If it would be considered too surprising for join to suddenly be implemented in terms of an overloaded concat operator, what about adding a new overload key whereby classes can opt in to such behaviour?
class My::Stringlike::Class;
use overload
'.' => "concat",
join_uses_concat => 1;
and then pp_join would only follow the concat overloading on any object that additionally sets that key, indicating it is opting in to such a mechanism.
and then pp_join would only follow the concat overloading on any object that additionally sets that key, indicating it is opting in to such a mechanism.
That feels backwards, or maybe I don't understand the suggestion properly. If I call join with a list of items, and only one of them has this new type of overloading, what would happen?
FWIW: I have no problem with this RFC if it is lexically scoped behavior.
{
use feature "join_with_concat";
my $joined = join $sep, @things;
}
But it has fairly deep implications, some of which IMO are negative in terms of performance, and it definitely would change which existing overload methods are used, so I think it needs to be guarded. I can totally understand that two reasonable people might see what join does in different ways, and I think we can offer users the ability to choose which they get, but I do think it has to be opt-in.
That feels backwards, or maybe I don't understand the suggestion properly. If I call join with a list of items, and only one of them has this new type of overloading, what would happen?
Per instance. The pp_join function has to individually query every SV to ask "hey, do you have concat magic? if so I'll use it". The suggestion is that question becomes "Hey, do you have concat magic and is join permitted to use it?". That way, any class that doesn't specify the flag will continue in its current behaviour and not cause any surprises, meanwhile any class that does want to be join-aware can set the flag and all works lovely.
Perhaps I need to remind people of my use-case here. My use-case is that I have an entire ecosystem of object classes ("String::Tagged") that are objects that behave like strings. Or at least, they try to. But currently they can't do a very good job of it, because many core operators (such as join() and substr()) don't permit overloading, so I lose my "being a special object" nature going via those operators. It means I have to write an entire second set of object methods - like the String::Tagged->join($sep, @values) constructor, or the $str->substr(...) accessor - which code should call instead. But then that means that only code that knows about them actually works. It means that I can't pass these strings into the huuuuuge amount of existing code already on CPAN and have it work. One random example module would be Text::Wrap. I tried years ago to use that with String::Tagged and was very disappointed at how badly we handle that kind of thing. It seems I'm not the only person - https://rt.cpan.org/Ticket/Display.html?id=145864
My eventual goal here is to end up in a situation where an object class such as String::Tagged gets as fully supported by all the core ops (and any CPAN module that uses them) as a number class like Math::BigRat. Observe that nobody has to call $x->add($y) to make those behave - you can just do $x + $y. I'd like our stringy ops to do the same.
Per instance. The pp_join function has to individually query every SV to ask "hey, do you have concat magic? if so I'll use it". The suggestion is that question becomes "Hey, do you have concat magic and is join permitted to use it?". That way, any class that doesn't specify the flag will continue in its current behaviour and not cause any surprises, meanwhile any class that does want to be join-aware can set the flag and all works lovely.
So in the following code what would happen?
my $s1 = Object::With::Normal::Concat::Overload->new(..);
my $s2 = Object::With::Join::Concat::Overload->new(...);
my $joined = join("", qw(a b c d e), $s1, qw( p q r s t u ), $s2, qw( w x y z));
Would $s2 being in the arg list cause $s1 to also use concat overloads? Would it be acceptable if the internals turned this into:
my $joined = join("", join("",qw(a b c d e), "$s1", qw( p q r s t u )), $s2, join("",qw( w x y z)));
for example?
I'd feel more comfortable reasoning about this if we weren't discussing the "concat" overload. Let's say we had a "join" overload.
Would it be acceptable that a join with K items, only 1 of which had the join overload, might call that overload at most twice, and perhaps only once (if the item was at the head or tail of the list), regardless of what K was?
If that was acceptable, then I would be a lot more comfortable about this. My concern is that adding this has a deep impact on the expected order of the joining; if we eliminate that, then most of my concern is resolved.
@book: I will resolve the embed.fnc conflict in this PR for you unless you leave a note requesting that I do not. Sorry for the inconvenience.
Rebased as promised.
There was extensive discussion in this PR, but it petered out a year and a half ago. Consequently, the code has acquired merge conflicts. @book et al., can we get an update on the status of this ticket? Thanks.