dbdpg icon indicating copy to clipboard operation
dbdpg copied to clipboard

Hard to trigger crash with async queries and InactiveDestroy

Open johto opened this issue 7 years ago • 1 comments

Hi,

I unfortunately can't share the test case since it uses thousands of lines of proprietary code, but it does roughly this:

use DBI;
use DBD::Pg ':async';
use Time::HiRes;

sub mayhem {
    my $dbh = DBI->connect("dbi:Pg:", '', '', {pg_enable_utf8 => 1, PrintError => 0, RaiseError => 1, pg_server_prepare => 0});
    my $sth = $dbh->prepare(q{SELECT CASE WHEN pg_sleep(random() * 0.25) IS NOT NULL THEN random()::text::uuid END}, {pg_async => PG_ASYNC});
    $dbh->{InactiveDestroy} = 1;
    $sth->execute();
    for (my $i = 0; ; ++$i) {
        if ($sth->pg_ready) {
            last;
        }
        if ($i >= 100) {
            print STDERR "wait timeout\n";
            $sth->cancel();
            exit(1);
        }
        Time::HiRes::sleep(0.05);
    }
    eval {
        $sth->pg_result();
    };
}
mayhem();

which sometimes crashes with a backtrace like this:

(gdb) bt
#0  0x00007ffff61c40fb in PQclear () from /usr/lib/x86_64-linux-gnu/libpq.so.5
#1  0x00007ffff64107f8 in pg_db_destroy (dbh=0x1d1bc40, imp_dbh=0x1d22570) at dbdimp.c:688
#2  0x00007ffff6407b54 in XS_DBD__Pg__db_DESTROY (my_perl=0x603010, cv=<optimized out>) at ./Pg.xsi:390
#3  0x00007ffff664a656 in XS_DBI_dispatch (my_perl=<optimized out>, cv=0x8c0268) at DBI.xs:3797
#4  0x00007ffff7b0ad26 in Perl_pp_entersub () from /usr/lib/libperl.so.5.18
#5  0x00007ffff7a94846 in Perl_call_sv () from /usr/lib/libperl.so.5.18
#6  0x00007ffff7b13d71 in ?? () from /usr/lib/libperl.so.5.18
#7  0x00007ffff7b14500 in Perl_sv_clear () from /usr/lib/libperl.so.5.18
#8  0x00007ffff7b14b8d in Perl_sv_free2 () from /usr/lib/libperl.so.5.18
#9  0x00007ffff7aecf48 in ?? () from /usr/lib/libperl.so.5.18
#10 0x00007ffff7aed731 in Perl_mg_free () from /usr/lib/libperl.so.5.18
#11 0x00007ffff7b1479b in Perl_sv_clear () from /usr/lib/libperl.so.5.18
#12 0x00007ffff7b14b8d in Perl_sv_free2 () from /usr/lib/libperl.so.5.18
#13 0x00007ffff7ac2938 in Perl_cv_undef () from /usr/lib/libperl.so.5.18
#14 0x00007ffff7b1473c in Perl_sv_clear () from /usr/lib/libperl.so.5.18
#15 0x00007ffff7b14b8d in Perl_sv_free2 () from /usr/lib/libperl.so.5.18
#16 0x00007ffff7b36698 in Perl_free_tmps () from /usr/lib/libperl.so.5.18
#17 0x00007ffff7b03a35 in Perl_pp_nextstate () from /usr/lib/libperl.so.5.18
#18 0x00007ffff7b03346 in Perl_runops_standard () from /usr/lib/libperl.so.5.18
#19 0x00007ffff7a9bb24 in perl_run () from /usr/lib/libperl.so.5.18
#20 0x0000000000400dd9 in main ()

If I reset InactiveDestroy back to 0 before exiting the subroutine, the problem goes away.

My hypothesis is that the culprit is this code in dbd_st_destroy:

   /* If the InactiveDestroy flag has been set, we go no further */
   if (DBIc_IADESTROY(imp_dbh)) {
       if (TRACE4_slow) {
           TRC(DBILOGFP, "%sskipping sth destroy due to InactiveDestroy\n", THEADER_slow);
       }
       fprintf(stderr, "%x\n", imp_sth);
       DBIc_IMPSET_off(imp_sth); /* let DBI know we've done it */
       if (TEND_slow) TRC(DBILOGFP, "%sEnd dbd_st_destroy (InactiveDestroy set)\n", THEADER_slow);
       return;
   }

which DBIc_IMPSET_off()s the imp_sth still assigned to imp_dbh->async_sth. Later on, when dbd_db_destroy is called, its imp_dbh->async_sth points to freed memory containing garbage and the PQclear() call crashes. My digging around with gdb supports this hypothesis: imp_dbh->async_sth still has a pointer to the same location that previously held a valid sth, but *imp_dbh->async_sth is just garbage.

I've verified this against the master branches of both DBI and DBD::Pg, but this problem seems to have existed for a long time.

johto avatar Apr 20 '18 16:04 johto

We recently cleared up some problems with PQclear - could you see if a newer DBD::Pg clears it up for you? Anything >= version 3.10.3

turnstep avatar Jan 21 '20 20:01 turnstep