perl5 add POSIX::atexit()

I would like to have a simple atexit() in POSIX because, compared to END blocks, atexit() works better with fork().

POSIX::atexit() has stayed unimplemented because END {} blocks are the direct replacement. However, when you want to register different, child-only exit handlers after fork(), the behaviour of atexit() seems more intuitive and friendlier.

To elaborate on the difference, consider this example C code:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

void
bye(void)
{
  printf("bye\n");
}

void
bye2(void)
{
  printf("bye2\n");
}

void
bye3(void)
{
  printf("bye3\n");
}

void
register_atexit(void (*f)(void)) {
  if (atexit(f) != 0) {
    fprintf(stderr, "cannot set exit function\n");
    exit(EXIT_FAILURE);
  }
}

int
main(void)
{
  register_atexit(bye);

  if (fork() == 0) {
    register_atexit(bye2);
    register_atexit(bye3);
    printf("before exit (child)...\n");
  } else {
    sleep(1); /* just to let child print first */
    printf("before exit (parent)...\n");
  }

  return 0;
}

The output should be:

# ./a.out 
before exit (child)...
bye3
bye2
bye
before exit (parent)...
bye

The line "bye\n" is printed by both the parent and the child process because bye() is registered with atexit() before fork(), while bye2() and bye3() are registered only in the child process.

Now, a novice Perl programmer who does not yet know how END blocks behave might translate that C code into this incorrect Perl version:

use v5.38;

sub atexit_bye {
    END { say "bye" }
}

sub atexit_bye2 {
    END { say "bye2" }
}

sub atexit_bye3 {
    END { say "bye3" }
}

atexit_bye();

if (fork() == 0) {
    atexit_bye2();
    atexit_bye3();
    say "before exit (child)...";
} else {
    sleep 1;
    say "before exit (parent)...";
}

... it is incorrect because the output would be:

before exit (child)...
bye3
bye2
bye
before exit (parent)...
bye3
bye2
bye
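
The underlying reason, as far as I understand it, is that an END block is registered when it is compiled, not when the enclosing sub is called. A minimal sketch (the sub name never_called is made up for illustration):

```perl
use v5.36;

# Hypothetical sub that is deliberately never called.
sub never_called {
    # This END block is registered at compile time,
    # so it runs at exit whether or not the sub is invoked.
    END { say "END inside never_called() still runs" }
}

say "main body done";
```

This should print "main body done" followed by the END message, even though never_called() is never invoked.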

To implement the matching behaviour, the novice Perl programmer would need to realize that every END block is registered at compile time, whether or not its enclosing sub is ever called, so all of them are inherited across fork(). They would need to use just one END block and manage a stack of exit handlers at run time instead.
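
A minimal sketch of that single-END-block approach, in case it helps the discussion. The names register_atexit and @exit_handlers are hypothetical and are not what this PR implements; waitpid() stands in for the sleep used above so that the child reliably prints first.

```perl
use v5.36;

# Hypothetical run-time registry of exit handlers (illustration only).
my @exit_handlers;

sub register_atexit ($handler) {
    push @exit_handlers, $handler;
    return;
}

# The only END block: run whatever handlers this process has registered,
# most recently registered first, like atexit() does in C.
END {
    while (my $handler = pop @exit_handlers) {
        $handler->();
    }
}

# Registered before fork(), so both parent and child carry it.
register_atexit(sub { say "bye" });

my $pid = fork() // die "fork failed: $!";

if ($pid == 0) {
    # Registered after fork(), so only the child's copy of the stack has these.
    register_atexit(sub { say "bye2" });
    register_atexit(sub { say "bye3" });
    say "before exit (child)...";
} else {
    waitpid($pid, 0);    # let the child finish first
    say "before exit (parent)...";
}
```

Under these assumptions the output should match the C program above: the child prints bye3, bye2, bye, and the parent prints only bye.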

While END blocks are easy enough for a non-forking program, atexit() can be useful for writing fork-ing programs. Hence this PR.

I'm pretty sure this implementation is too simple to be comprehensive, and the same goes for the test cases. I would appreciate reviews and suggestions.

Previously: https://github.com/Perl/perl5/issues/21253#issuecomment-1716925190

Jul 07 '24 15:07 gugod

To make the version tests pass, you'll need to bump the version of POSIX.pm.

You should also document the new atexit in POSIX.pod.

Jul 08 '24 17:07 mauke

I can see the use of this, but I'm not sure if POSIX is the right place for it.

Jul 08 '24 20:07 Leont

> I can see the use of this, but I'm not sure if POSIX is the right place for it.

Why not? It seems like the natural place to me:

- atexit is a POSIX function
- POSIX.pm already exports atexit, albeit as an error-throwing stub
- POSIX.pm has documentation for atexit (which currently recommends using END {} instead)
- POSIX.pm has tests for atexit (which currently verify that it throws the expected error)

There is a natural extension point here. We just have to supply a real implementation of atexit.

And wouldn't it be weird if we provided atexit elsewhere while POSIX.pm still claims atexit is not implemented and END {} must be used instead?

Jul 08 '24 22:07 mauke

I've briefly described atexit() in POSIX.pod with commit https://github.com/Perl/perl5/pull/22383/commits/6bcb036c86e55843965cce364cef41f13c9895d3

There is another mention of atexit() in cpan/perlfaq/lib/perlfaq8.pod, but since that's under cpan/, I would not change it in this PR.

Jul 10 '24 14:07 gugod

On Mon, Jul 08, 2024 at 03:51:26PM -0700, mauke wrote:

> > I can see the use of this, but I'm not sure if POSIX is the right place for it.
>
> Why not? It seems like the natural place to me:

Because the purpose of the POSIX module is to provide direct(ish) access to POSIX C library functions. But in this case, it is making no use of the actual atexit() C library function.

Calling it (for argument's sake) builtin::atexit() and explaining in the docs that it does for perl what atexit() does for C makes sense.

Implying that the POSIX module is making the C library atexit() function callable when it actually isn't is confusing. To be clear, C-level atexit() functions are called extremely late in the process's lifecycle - at the point where, in the perl interpreter, nearly everything has been freed and it's no longer possible to call perl-level subroutines.

> And wouldn't it be weird if we provided atexit elsewhere while POSIX.pm still claims atexit is not implemented and END {} must be used instead?

No, the document would just be updated to state that "you're probably looking for builtin::atexit()" and continue to croak if used.

Jul 12 '24 09:07 iabyn

> Because the purpose of the POSIX module is to provide direct(ish) access to POSIX C library functions. But in this case, it is making no use of the actual atexit() C library function.

Thanks for the insight. With that hint in mind, the documentation in POSIX.pod actually makes more sense to me, as does the fact that a large portion of the subroutines defined in POSIX are just stubs.

> Implying that the POSIX module is making the C library atexit() function callable when it actually isn't is confusing. To be clear, C-level atexit() functions are called extremely late in the process's lifecycle - at the point where, in the perl interpreter, nearly everything has been freed and it's no longer possible to call perl-level subroutines.

(In the following text I'll try to be less confusing by using the POSIX:: prefix for subroutines defined in POSIX.pm. Function names written without this prefix refer to C functions.)

Right. It wouldn't make sense for a Perl subroutine to be just a simple proxy to the underlying C function atexit(), and that's why a function implementing a mechanism similar to the C atexit() should probably use an END {} block.

Although I wasn't actually implying that POSIX::atexit() calls the C function atexit(), perhaps the context, the POSIX namespace, and the fact that most other subroutines in this namespace are proxies to their C counterparts do imply such a relation.

I would still like to defend this PR and argue that POSIX::atexit(), as implemented here, is worth considering, because:

- All other non-stub subroutines in POSIX.pm take arguments and return values following Perl's conventions. E.g., POSIX::cos(30) and POSIX::cos("30") are the same thing, unlike with the C function cos().
  - It is the Perl way, not the C way, to make no distinction between the integer 30 and the string "30", so when designing POSIX::cos() it makes sense to follow the Perl way and treat 30 and "30" alike.
- POSIX::atexit() takes one argument that is a Perl subroutine, not a C function pointer; for obvious reasons it cannot take the latter. This means it is actually not that different from most other subroutines in POSIX.pm, and so POSIX::atexit() should likewise do its job the Perl way.
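
To illustrate the first point with a small sketch (assuming only the stock POSIX module, loaded without importing anything):

```perl
use v5.36;
use POSIX ();

# A number and a numeric string are treated alike, as Perl callers expect.
my $from_number = POSIX::cos(30);
my $from_string = POSIX::cos("30");

say $from_number == $from_string ? "same" : "different";
```

This should print "same", since Perl converts the string to a number before the cosine is computed.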

So... that's my 2c about why, although not using atexit() the C function internally, we should consider adding this POSIX::atexit() subroutine.

Jul 12 '24 15:07 gugod

@gugod,

I reviewed this pull request today. While you've obviously put a lot of thought into it, I tend to agree with @iabyn that our POSIX module is not the location for atexit() functionality that is not based on the POSIX C library.

> Because the purpose of the POSIX module is to provide direct(ish) access to POSIX C library functions. But in this case, it is making no use of the actual atexit() C library function.
>
> ...
>
> Implying that the POSIX module is making the C library atexit() function callable when it actually isn't is confusing.

Would you consider creating an Issue (but not yet a pull request) calling for the creation of builtin::atexit() and let people comment therein?

Aug 27 '24 19:08 jkeenan

As a meta-comment, I think Perlish wrappers of POSIX-like functionality are, in general, more valuable to Perlish programs than direct POSIX function calls, and such things should not involve the POSIX module (except for constants).

Aug 27 '24 20:08 Grinnz

Thanks @jkeenan and @iabyn for your kind comments.

I'll close this PR for now and open an issue to first gather comments about a builtin::atexit() instead.

Aug 31 '24 07:08 gugod
