perl5 icon indicating copy to clipboard operation
perl5 copied to clipboard

Consistent I/O function error return values

Open ap opened this issue 7 months ago • 7 comments

We (PSC) have been thinking about the situation created by #20103, with problems reported in #22883 and #23026. A small improvement was made in #22907, and @tonycoz also suggested what he calls an alternative fix (I’ll come to that in a moment) in #23314.

Initially we had thoughts that I posted about, but we now think we were wrong at the time.

We now believe the situation is this:

  • The core problem is the fact that I/O functions derive their return code from the filehandle error flag. Instead, they should signal an error only based on the I/O operation they themselves performed, ignoring whether the error flag was already set.
  • Back when readline was clearing the error flag indiscriminately, the resulting overall behavior tended to incidentally match what would happen if this is how the I/O functions worked, even though they didn’t.
  • When we changed readline to actually be able to signal failures, this charade broke down, and now we are dealing with the fallout from that. The design was broken but held together with chewing gum, and we removed the chewing gum.

As for the error flag, clearly the point of the flag is to preserve error states until they have been dealt with, which that proposal contradicts. Effectively, the error flag exists to prevent data loss in code that is somewhat sloppy (but not too sloppy) in dealing with I/O errors. And because it is a documented interface, we cannot remove it or alter its behavior. Therefore, I/O functions should continue to set and clear it exactly as before, so that user code that was written to examine the flag will continue to behave as it always has. But the functions should entirely ignore the state of the flag in their own operation.

This means we want both #22907 and #23314, plus of course a number of addtional yet-to-be-written changes.

The question now is whether the design of PerlIO even allows for implementing this.

As @tonycoz said in #22909,

Writing tests for this encountered a problem: read() uses the stream error flag to detect whether an error occurred, the underlying PerlIO_read() behaves like the stdio fread() in that you tell the difference between an error and a short read/eof by checking the stream error or eof flag.

So read() cannot return undef on error unless we do set the error flag. 🙁

I’ll ping @Leont here for another opinion, given he replied that

PerlIO was designed to be an drop-in stdio replacement, instead of being a sensible buffered IO implementation. This is the source of almost all issues in PerlIO.

(Not knowing anything much about the area, I wonder if the envisaged abolition of stdio is useful or even necessary or maybe irrelevant to a bid to do this.)

ap avatar May 30 '25 11:05 ap

@haarg In writing this down, it occurs to me that this proposal would amount to undoing the effect of the change to readline, since readline has no other way of signaling errors besides the error flag, and the error flag would no longer be paid any attention to by core functions.

I am still convinced this is the right general direction, but maybe further thought is still needed.

ap avatar May 30 '25 11:05 ap

The core problem is the fact that I/O functions derive their return code from the filehandle error flag. Instead, they should signal an error only based on the I/O operation they themselves performed, ignoring whether the error flag was already set.

The problem is that a buffered I/O doesn't necessarily map to an actual system I/O. It may map to 0, 1, or more syscalls. We can't get rid of the concept of error flags, even if we can change some of the details.

Leont avatar May 30 '25 11:05 Leont

This means we want both #22907 and #23314, plus of course a number of addtional yet-to-be-written changes.

Could I ask what the implications of this are for the scheduling of the release of perl-5.42.0?

jkeenan avatar May 30 '25 11:05 jkeenan

We aren't intending on making any additional changes before 5.42.

The problem is that a buffered I/O doesn't necessarily map to an actual system I/O. It may map to 0, 1, or more syscalls. We can't get rid of the concept of error flags, even if we can change some of the details.

The functions may need to track an error flag, but that would only be for the lifetime of the single operation. I don't see why it would need to be exposed or persist between calls. This may be where the details of PerlIO make this impossible.

@haarg In writing this down, it occurs to me that this proposal would amount to undoing the effect of the change to readline, since readline has no other way of signaling errors besides the error flag, and the error flag would no longer be paid any attention to by core functions.

I'm not sure that I follow. Isn't our proposal here that readline should be able to signal failures without relying on a per-handle error flag?

haarg avatar May 30 '25 13:05 haarg

The core problem is the fact that I/O functions derive their return code from the filehandle error flag. Instead, they should signal an error only based on the I/O operation they themselves performed, ignoring whether the error flag was already set.

The problem is that a buffered I/O doesn't necessarily map to an actual system I/O. It may map to 0, 1, or more syscalls. We can't get rid of the concept of error flags, even if we can change some of the details.

PerlIO is bug compatible with LibC's stdio API falls apart super quick with PerlIO objects whose ultimate backend is $SCALAR or mmap().

What is the fileno() of a SV* of type SVt_PVNV with only NOK_on()?

Rewriting unit tests is not allowed.

bulk88 avatar May 31 '25 13:05 bulk88

@haarg In writing this down, it occurs to me that this proposal would amount to undoing the effect of the change to readline, since readline has no other way of signaling errors besides the error flag, and the error flag would no longer be paid any attention to by core functions.

I'm not sure that I follow. Isn't our proposal here that readline should be able to signal failures without relying on a per-handle error flag?

I mean that #20103 made sloppy code more robust, in the sense that code which doesn’t carefully handle $! while calling readline will now at least still encounter any readline errors when it calls close (whose errors are checked by a lot more code than errors from readline).

On one hand, if we now make the return value of close ignore the error flag, then this improvement in robustness for such sloppy code is lost again.

OTOH, this is balanced against an improvement of the situation for careful code that was already trying to do everything correctly.

I’m not bringing this up as a definite argument against pursuing this plan, only a potential one – mainly as an observation to be conscious about as we make any further calls about what to do.

ap avatar May 31 '25 14:05 ap

How about adding @! to https://perldoc.perl.org/perlvar ?

or maybe local( $! ) = sub { $self = shift; foreach(@_) { warn $_; } }; # sub handler ?

or maybe $!->() = sub { $self = shift; foreach(@_) { warn $_; } }; # sub handler ?

or this, the CV* sticks around!

C:\sources>perl -E "$! = sub { return 'abc'; }; say( ${\$!}+0 );  say( ${\$!}+0 );"
30611444
30611444

or maybe

C:\sources>perl -E "*{!} = sub { return 'abc_err'; }; say( &! );"
abc_err

bulk88 avatar Jun 06 '25 19:06 bulk88

I think there's a base problem: all errors set the error flag, even if they have no effect on data integrity.

(this all applies to buffered I/O: readline, read, getc, print, write, not to recv, sysread, syswrite)

Examples of such "harmless" errors include reading from a write-only handle (#23026), reading from a non-blocking handle that isn't ready for read (EAGAIN), reading from a directory handle, writing to a read-only handle. These are all arguably programming errors, though we could argue about read EAGAIN[1].

Errors that can result in loss of data include out of space when writing, writing to a non-blocking handle that isn't ready for write, writing to a pipe where the remote has closed, read/write I/O errors (such as a failed disk I/O or failed network filesystem).

The various write errors can lose data since print() doesn't return a written count, so if only a partial write was done the caller has no way to tell how much was successfully written.

One problem I see with trying to detect the difference between such errors is inconsistent error reporting between platforms, we can look at the Linux man pages or POSIX to see which errors are "harmless" on Linux and POSIX-alikes, but on non-POSIX systems (hello Windows) the errors reported are going to be less consistent.

So we could stop setting the error flag for such harmless errors, of course CPAN/DarkPAN code that tests for such errors (perhaps wrapping handles) may break.

The other side of this is that some ops use the error flag to decide whether to report an error, but I'll write about that next week.

[1] mostly because select() and buffered I/O aren't friends, as perldoc -f select says:

WARNING: One should not attempt to mix buffered I/O (like read or readline) with select, except as permitted by POSIX, and even then only on POSIX systems. You have to use sysread instead.

tonycoz avatar Jul 03 '25 06:07 tonycoz

It's been a bit more than a week (yay colds), but that time has changed my thinking a bit.

My original idea was to modify the PerlIO API, adding an alternative PerlIO_read()* entry point that would return -1 on error instead of zero, but this would silently change the behaviour for all CPAN users of PerlIO_read()† possibly leading to bugs.

So I think if we're going to change this it needs to be done at the level of the readline(), read(), getc(), print(), write() ops by checking for a change in the error flag and if it goes from unset to set, clear the error flag if errno happens to match one of the errors we consider harmless.

Note the ops would still indicate an error, but the error flag would no longer be set after the op returns.

* horribly enough, perlapio claims PerlIO_read() returns a negative number on error, but the implementation of the read() op in pp_sysread notes that it doesn't, and the implementation doesn't appear to (except for PerlIOUnix_read(), yay).

† a quick scan through a CPAN grep shows most code checks the result against the number of bytes expected, or < 1 or > 0 which should be safe, but I didn't check very hard

tonycoz avatar Jul 24 '25 05:07 tonycoz