perl5
perl5 copied to clipboard
Using signatures makes the debugger harder to use
Description
When using the debugger (perl -d $program), you can use the s command to step through lines of code, one at a time. However, this is not true for subroutine signatures. When you step through one of these subs, you have to step through the signature at N+1 times, where N is the number of variables in the signature. These extra steps are not obvious, because nothing visible is happening, and they can be confusing. More than once I've accidentally stepped over the first line of a sub/method because I hit "return" too many times on a signatures.
This makes using the already cumbersome debugger CLI even more cumbersone.
It does not affect subroutines without signatures, even if they have a prototype (since the latter is compile-time, not run time).
This appears to affect all versions of Perl which allow signatures, whether by default, or when using the experimental signatures feature.
Steps to Reproduce
Run the following program with perl -d:
#!/usr/bin/env perl
use v5.40.0;
sub no_signature { # we will not step into this line
say "No signature was used.";
}
sub with_prototype :prototype($$;$) { # we will not step into this line
say "This is a subroutine with a prototype.";
}
sub no_args() { # we will step into this line 1 time
say "No arguments were passed.";
}
sub one_arg($one) { # we will step into this line 2 times
say "One argument was passed: $one";
}
sub two_args ( $one, $two ) { # we will step into this line 3 times
say "Two arguments were passed: $one and $two";
}
sub three_args ( $one, $two, $three ) { # we will step into this line 4 times
say "Three arguments were passed: $one, $two, and $three";
}
sub four_args ( $one, $two, $three, $four ) { # we will step into this line 5 times
say
"Four arguments were passed: $one, $two, $three, and $four";
}
no_signature();
with_prototype( 'one', 'two' );
no_args();
one_arg('one');
two_args( 'one', 'two' );
three_args( 'one', 'two', 'three' );
four_args( 'one', 'two', 'three', 'four' );
Expected behavior
When running the above program with the debugger, when you try to step into the first sub, you immediately get into the body of the sub:
5 sub no_signature {
6==> say "No signature was used.";
7 }
I expect the debugger to never stop on the signature line, or at minimum, to only stop once (because we might need to see the defaults).
However, for all subs with signatures, you must repeat the s (or just hit "return" to reuse the last step command) N+1 times for every variable in the signature, even though it's not obvious to the programmer that runtime behavior is occurring here:
9==> sub empty_signature () {
10: say "Empty signature was used.";
11 }
However, you can verify runtime behavior if you print out the value of those variables after every step:
main::(subs.pl:43): two_args( 'one', 'two' );
DB<1> s
main::two_args(subs.pl:25): sub two_args ( $one, $two ) {
DB<1> x [$one,$two]
0 ARRAY(0x15385afe0)
0 undef
1 undef
DB<2> s
main::two_args(subs.pl:25): sub two_args ( $one, $two ) {
DB<2> x [$one,$two]
0 ARRAY(0x153074888)
0 undef
1 undef
DB<3> s
main::two_args(subs.pl:25): sub two_args ( $one, $two ) {
DB<3> x [$one,$two]
0 ARRAY(0x1530748b8)
0 'one'
1 undef
DB<4> s
main::two_args(subs.pl:25): sub two_args ( $one, $two ) {
DB<4> x [$one,$two]
0 ARRAY(0x15385b3e8)
0 'one'
1 'two'
DB<5> s
main::two_args(subs.pl:26): say "Two arguments were passed: $one and $two";
That being said, I've never felt the need to dump signature variables while they're being individually assigned.
Perl configuration
Summary of my perl5 (revision 5 version 40 subversion 0) configuration:
Platform:
osname=darwin
osvers=23.5.0
archname=darwin-2level
uname='darwin ovids-m1-laptop.local 23.5.0 darwin kernel version 23.5.0: wed may 1 20:12:58 pdt 2024; root:xnu-10063.121.3~5release_arm64_t6000 arm64 '
config_args='-de -Dprefix=/Users/ovid/perl5/perlbrew/perls/perl-5.40.0 -Aeval:scriptdir=/Users/ovid/perl5/perlbrew/perls/perl-5.40.0/bin'
hint=recommended
useposix=true
d_sigaction=define
useithreads=undef
usemultiplicity=undef
use64bitint=define
use64bitall=define
uselongdouble=undef
usemymalloc=n
default_inc_excludes_dot=define
Compiler:
cc='cc'
ccflags ='-fno-common -DPERL_DARWIN -mmacosx-version-min=14.5 -DNO_THREAD_SAFE_QUERYLOCALE -DNO_POSIX_2008_LOCALE -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include -I/opt/local/include'
optimize='-O3'
cppflags='-fno-common -DPERL_DARWIN -mmacosx-version-min=14.5 -DNO_THREAD_SAFE_QUERYLOCALE -DNO_POSIX_2008_LOCALE -fno-strict-aliasing -pipe -fstack-protector-strong -I/usr/local/include -I/opt/local/include'
ccversion=''
gccversion='Apple LLVM 15.0.0 (clang-1500.3.9.4)'
gccosandvers=''
intsize=4
longsize=8
ptrsize=8
doublesize=8
byteorder=12345678
doublekind=3
d_longlong=define
longlongsize=8
d_longdbl=define
longdblsize=8
longdblkind=0
ivtype='long'
ivsize=8
nvtype='double'
nvsize=8
Off_t='off_t'
lseeksize=8
alignbytes=8
prototype=define
Linker and Libraries:
ld='cc'
ldflags =' -mmacosx-version-min=14.5 -fstack-protector-strong -L/usr/local/lib -L/opt/local/lib'
libpth=/usr/local/lib /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib/clang/15.0.0/lib /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/lib /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/lib /opt/local/lib /usr/lib
libs=-lgdbm
perllibs=
libc=
so=dylib
useshrplib=false
libperl=libperl.a
gnulibc_version=''
Dynamic Linking:
dlsrc=dl_dlopen.xs
dlext=bundle
d_dlsymun=undef
ccdlflags=' '
cccdlflags=' '
lddlflags=' -mmacosx-version-min=14.5 -bundle -undefined dynamic_lookup -L/usr/local/lib -L/opt/local/lib -fstack-protector-strong'
Characteristics of this binary (from libperl):
Compile-time options:
HAS_LONG_DOUBLE
HAS_STRTOLD
HAS_TIMES
PERLIO_LAYERS
PERL_COPY_ON_WRITE
PERL_DONT_CREATE_GVSV
PERL_HASH_FUNC_SIPHASH13
PERL_HASH_USE_SBOX32
PERL_MALLOC_WRAP
PERL_OP_PARENT
PERL_PRESERVE_IVUV
PERL_USE_SAFE_PUTENV
USE_64_BIT_ALL
USE_64_BIT_INT
USE_LARGE_FILES
USE_LOCALE
USE_LOCALE_COLLATE
USE_LOCALE_CTYPE
USE_LOCALE_NUMERIC
USE_LOCALE_TIME
USE_PERLIO
USE_PERL_ATOF
Built under darwin
Compiled at Jul 2 2024 09:26:02
%ENV:
PERL5LIB="/Users/ovid/perlcustom/lib"
PERLBREW_HOME="/Users/ovid/.perlbrew"
PERLBREW_MANPATH="/Users/ovid/perl5/perlbrew/perls/perl-5.40.0/man"
PERLBREW_PATH="/Users/ovid/perl5/perlbrew/bin:/Users/ovid/perl5/perlbrew/perls/perl-5.40.0/bin"
PERLBREW_PERL="perl-5.40.0"
PERLBREW_ROOT="/Users/ovid/perl5/perlbrew"
PERLBREW_SHELLRC_VERSION="0.98"
PERLBREW_VERSION="0.98"
@INC:
/Users/ovid/perlcustom/lib
/Users/ovid/perl5/perlbrew/perls/perl-5.40.0/lib/site_perl/5.40.0/darwin-2level
/Users/ovid/perl5/perlbrew/perls/perl-5.40.0/lib/site_perl/5.40.0
/Users/ovid/perl5/perlbrew/perls/perl-5.40.0/lib/5.40.0/darwin-2level
/Users/ovid/perl5/perlbrew/perls/perl-5.40.0/lib/5.40.0
state ops (nextstate, dbstate) are generated for each argument, which is what the debugger uses for stepping:
$ ./perl -Ilib -MO=Concise,f -e 'use v5.40.0; sub f ($x, $y) { print $x, $y }'
main::f:
a <1> leavesub[1 ref] K/REFC,1 ->(end)
- <@> lineseq KP ->a
- <1> ex-argcheck vK/1 ->7
- <@> lineseq vK ->-
1 <;> nextstate(main 6 -e:1) v:us,*,&,$,fea=8 ->2
2 <+> argcheck(2,0) v ->3
3 <;> nextstate(main 4 -e:1) v:us,*,&,$,fea=8 ->4
4 <+> argelem(0)[$x:4,6] v/SV ->5
5 <;> nextstate(main 5 -e:1) v:us,*,&,$,fea=8 ->6
6 <+> argelem(1)[$y:5,6] v/SV ->7
- <;> ex-nextstate(main 6 -e:1) v:us,*,&,$,fea=8 ->-
7 <;> nextstate(main 6 -e:1) v:us,*,&,$,fea=8 ->8
9 <@> print sK ->a
8 <0> padrange[$x:4,6; $y:5,6] /range=2 ->9
- <0> padsv[$x:4,6] s ->-
- <0> padsv[$y:5,6] s ->9
-e syntax OK
These appear to be deliberate, both sigscalarelem and sigslurpsigil in perly.y both add state ops.
@iabyn might know why.
On Tue, Sep 17, 2024 at 06:53:04PM -0700, Tony Cook wrote:
state ops (nextstate, dbstate) are generated for each argument, which is what the debugger uses for stepping:
These appear to be deliberate, both sigscalarelem and sigslurpsigil in perly.y both add state ops.
@iabyn might know why.
The original signature implementation (largely by Zefram) just planted a lot of standard perl ops to process signatures as if the signature handling had actually been written in perl. So for example:
sub f ($a, $b = 1) {}
deparsed as:
sub f { die sprintf("Too many arguments for subroutine at %s line %d.\n", (caller)[1, 2]) unless @_ <= 2; die sprintf("Too few arguments for subroutine at %s line %d.\n", (caller)[1, 2]) unless @_ >= 1; my $a = $[0]; my $b = @ >= 2 ? $_[1] : 1;
In that implementation, each item of parameter processing was separated by a nextstate op, which (among other things) ensured that the taint state was reset after each assignment (so one tainted arg wouldn't taint subsequent parameter values), and line numbers for warnings etc would be correct(ish).
When I added the sigFOO ops, I kept the nextstate ops: mainly in an attempt to preserve current behaviour as much as possible. I always intended that such sigFOO ops would be an intermediate step, and they would all be replaced by a single OP_SIGNATURE op in the peephole optimiser which would (via a mini state machine) do all the sig processing in a single op, with any taint resetting etc done by the OP_SIGNATURE, with no OP_NEXTSTATEs left.
I don't think I ever considered how the debugger would be affected.
-- Standards (n). Battle insignia or tribal totems.
While I was looking at how to implement named params, I had vague thoughts in mind for neatening up the signature things overall; much as Dave suggests there. In the longterm I don't imagine there'll be as many OP_NEXTSTATE (or DBSTATE) ops involved.
In the short-term, I wonder if we could try simply not emitting so many of them. Since conceptually every defaulting expression for optional params does execute like a statement, those ought to be retained, but there's probably no need to have one on the OP_ARGCHECK nor the mandatory params without a defaulting expression.
Failing that, there's likely a spare private bit on the OP_DBSTATE that we can use to signal to the debugger not to count that one, so as to preserve the tree-shaped structure for code that wants to inspect the optree, but allowing the debugger to skip over those.
There was some discussion in IRC covering this earlier:
- we need the state ops to make the lexicals visible as they're introduced
- the debugger only steps/breaks on dbstate ops, which are normally generated when perl is invoked with
-d, we can make the state op non-steppable/breakable by making then nextstate ops instead, rather than adding a private bit