io: add an `Err` field to `LimitedReader`
New Proposal
As per @ianlancetaylor's suggestion, this proposal is now to add an Err field to io.LimitedReader. If not nil, this field's value will be returned when trying to read past the limit imposed by the reader instead of the default io.EOF.
Original Proposal
For various reasons, I had a need to use MaxBytesReader() in a situation where a ResponseWriter wasn't easily available, so I looked through the code to see if it was safe to pass nil and found that all it does is check a type assertion and then call an unexported method on it:
type requestTooLarger interface {
requestTooLarge()
}
if res, ok := l.w.(requestTooLarger); ok {
res.requestTooLarge()
}
While it's true that because of this it's safe to pass it a nil ResponseWriter, it feels quite odd to rely on undocumentated behavior of this kind.
Proposal
Several ways to clean this situation up come to mind:
- Add an alternative in the
httppackage that doesn't need aResponseWriter. Because of how it works, it could actually just return the same implementation, but the function will be more obvious in its usage. - Document how the implementation uses the
ResponseWriterand explicitly state that anilResponseWriteris valid. - Add a reimplementation of
MaxBytesReader()iniothat doesn't use aResponseWriterat all and isn't tied tohttpbehavior, but includes the other differences. This could actually be done, for the most part, by just adding anErr errorfield toio.LimitedReaderthat, if non-nil, is the error returned when the limit is reached instead ofio.EOF.
In whichever case, the documentation for MaxBytesReader() should probably be filled in a bit. It is quite odd that something called Reader requires a type that is primarily an io.Writer implementation for non-obvious reasons and doesn't mention it anywhere in the documentation.
I have forked io.LimitedReader to return a custom error at least three times.
If the proposed function doesn't use http.ResponseWriter, then I can't see any reason that it should live in the net/http package. That seems to reduce this to
type MyLimitedReader struct {
r io.LimitedReader
err error
}
func (mlr *MyLimitedReader) Read(p []byte) (int, error) {
n, err := mlr.Read(p)
if err == io.EOF {
err = mlr.err
}
return n, er
}
Is that correct?
If so, perhaps this proposal should be rewritten to add an Err field io.LimitedReader.
I like the new proposal. Returning io.EOF makes io.LimitedReader pretty unhelpful. But I think it would also be good to document that http.MaxBytesReader accepts nil ResponseWriter.
I have forked
io.LimitedReaderto return a custom error at least three times.
FWIW, my moreio.LimitedWriter has an Err field too:
Can someone tag this proposal as under review?
As an alternative to defining your own error, I just created a package and export the error so the caller can check against it.
// Throws an error if the reader is bigger than limit.
var ErrSizeExceeded = errors.New("stream size exceeded")
type MaxBytesReader struct {
io.ReadCloser // reader object
N int64 // max bytes remaining.
}
func NewMaxBytesReader(r io.ReadCloser, limit int64) *MaxBytesReader {
return &MaxBytesReader{r, limit}
}
func (b *MaxBytesReader) Read(p []byte) (n int, err error) {
if b.N <= 0 {
return 0, ErrSizeExceeded
}
if int64(len(p)) > b.N {
p = p[0:b.N]
}
n, err = b.ReadCloser.Read(p)
b.N -= int64(n)
return
}
~That would break current users of io.LimitReader/LimitedReader.~
I misunderstood the proposal.
@carlmjohnson It is already so tagged. We'll get to it soon, I think.
Change https://go.dev/cl/396215 mentions this issue: io: add an Err field to LimitedReader
This seems reasonable. Thanks for the proposal.
This proposal has been added to the active column of the proposals project and will now be reviewed at the weekly proposal review meetings. — rsc for the proposal review group
Does anyone object to adding this?
Based on the discussion above, this proposal seems like a likely accept. — rsc for the proposal review group
No change in consensus, so accepted. 🎉 This issue now tracks the work of implementing the proposal. — rsc for the proposal review group
@carlmjohnson @ianlancetaylor
A problem with the solution that landed (CL 538060) is that the non-EOF error gets triggered when the amount of data is exactly the same as the limit.
https://go.dev/play/p/-EBfn2nF5FT?v=gotip
If I want to impose a limit of 1KB, I think it's fine to be able to copy a 1KB payload without encountering a "limit exceeded" error, but that's not the case with the current code.
Thanks. Working on it.
Seemed like a natural job for fuzzing, but I couldn't figure out how to make fuzzing work with the standard library, so I did it in a separate module and just copied some interesting cases back.
Change https://go.dev/cl/405854 mentions this issue: io: fix for LimitedReader sentinel when N == stream length
@rogpeppe, I'm looking at your example again, but I don't really understand why it's a problem with LimitedReader, given that:
io.Readerdoesn't provide a general mechanism for indicating "at EOF”, andstrings.Readerseems to prefer not to returnio.EOFwhen it technically could (https://go.dev/play/p/6F5tnjDaQre?v=gotip).
I think if the caller really wants to get EOF instead of the LimtedReader.Err when either error is valid, they ought to be using a Reader that returns the EOF as soon as it is reached instead of deferring it for the next call.
(Separately, perhaps we should file a proposal to change strings.Reader to do so, since its documentation promises only that it “implements the io.Reader interface” which allows either.)
@bcmills the point I'm trying to make is that when the limit isn't exceeded, we shouldn't see an error from io.Copy at all.
In my eyes, it makes most sense for the error to be returned when the limit is exceeded, not when it's reached. Of course, with the old LimitReader, there is no distinction between the two cases when the number of bytes exactly equals the limit, but with the new error there is.
Options:
- Drop the new sentinel behavior as unworkable.
- Leave things as is, document weirdness around N == Reader len.
- New behavior (in the CL) of giving the Reader one last chance to return EOF.
Any other ideas?
The more I think about it, the more I think that the original goal isn't nicely attainable just by adding an Err field to the LimitedReader struct.
LimitReader has the property that it will read at most n bytes from the underlying reader. The originally stated goal of this proposal is to return a special error (requestTooLarge) when the limit is exceeded, implying AFAICS that when we're exactly at the limit, that error would not be returned. Unfortunately those two cases aren't possible to distinguish without reading an extra byte beyond the limit to find out whether the data limit really has been exceeded, which will break the above-stated property.
One possibility might be to document that an extra byte will be read when Err is non-nil, precisely to distinguish the two cases.
Another might be to add another entry point rather than modifying LimitedReader. For the record, here's the code for a LimitReader version I've been involved with in the past: https://go.dev/play/p/xqH4fdMPONa. We could add a LimitReaderWithError function, for example.
@rogpeppe, when the limit isn't exceeded, we don't see an error from io.CopyN:
https://go.dev/play/p/gqEn3aREOJE?v=gotip
That is: if you don't try to read past the limit at all, then you don't get the “limit exceeded” error.
There is no general way to force a reader to tell you that it is at EOF other than trying to read a nonzero number of bytes. (A 0-length read could legitimately return 0, nil, or perhaps 0, io.ErrShortBuffer, even if the Reader is at EOF.)
But trying to read a nonzero number of bytes beyond the limit is exactly what LimitedReader prevents.
That is: if you don't try to read past the limit at all, then you don't get the “limit exceeded” error.
That's fine AFAICS. In that case, the limit hasn't been exceeded so (to use the original use case) returning a "request too large" error would not be appropriate.
I think for that use-case, if you want to accept a request of up to N bytes then you'd want to set the LimitedReader's initial limit to N+1. Then you check the n returned from io.Copy against the actual limit.
(But I guess at that point it doesn't really matter what you set the Err field to, because all the information you need comes from the Copy call. 🤔)
Then you check the n returned from io.Copy against the actual limit
But that means that you can exceed the limit when copying, which surely isn't great?
FWIW, I find the naming around io.LimitReader vs io.LimitedReader confusing (it makes sense if I think about it, verb vs noun, but I have to think), so I would be happy with a new name. MaxBytesReader(r Reader, limit int64, err error) Reader?
Then you check the n returned from io.Copy against the actual limit
But that means that you can exceed the limit when copying, which surely isn't great?
...and that is exactly why I have a LimitedWriter. 😉
MaxBytesReader sounds exactly like the behaviour that LimitReader currently implements, I think (the whole point would be that it doesn't restrict the number of bytes read from the underlying reader to exactly limit).
LimitErrorReader ?
I don't see how it makes sense to use io.Copy with io.LimitedReader with a specified error. io.Copy copies until io.EOF is returned. io.LimitedReader with a specified error does not return io.EOF.