go icon indicating copy to clipboard operation
go copied to clipboard

io: add an `Err` field to `LimitedReader`

Open DeedleFake opened this issue 3 years ago • 64 comments

New Proposal

As per @ianlancetaylor's suggestion, this proposal is now to add an Err field to io.LimitedReader. If not nil, this field's value will be returned when trying to read past the limit imposed by the reader instead of the default io.EOF.

Original Proposal

For various reasons, I had a need to use MaxBytesReader() in a situation where a ResponseWriter wasn't easily available, so I looked through the code to see if it was safe to pass nil and found that all it does is check a type assertion and then call an unexported method on it:

type requestTooLarger interface {
	requestTooLarge()
}
if res, ok := l.w.(requestTooLarger); ok {
	res.requestTooLarge()
}

While it's true that because of this it's safe to pass it a nil ResponseWriter, it feels quite odd to rely on undocumentated behavior of this kind.

Proposal

Several ways to clean this situation up come to mind:

  1. Add an alternative in the http package that doesn't need a ResponseWriter. Because of how it works, it could actually just return the same implementation, but the function will be more obvious in its usage.
  2. Document how the implementation uses the ResponseWriter and explicitly state that a nil ResponseWriter is valid.
  3. Add a reimplementation of MaxBytesReader() in io that doesn't use a ResponseWriter at all and isn't tied to http behavior, but includes the other differences. This could actually be done, for the most part, by just adding an Err error field to io.LimitedReader that, if non-nil, is the error returned when the limit is reached instead of io.EOF.

In whichever case, the documentation for MaxBytesReader() should probably be filled in a bit. It is quite odd that something called Reader requires a type that is primarily an io.Writer implementation for non-obvious reasons and doesn't mention it anywhere in the documentation.

DeedleFake avatar Feb 09 '22 20:02 DeedleFake

I have forked io.LimitedReader to return a custom error at least three times.

jimmyfrasche avatar Feb 09 '22 20:02 jimmyfrasche

If the proposed function doesn't use http.ResponseWriter, then I can't see any reason that it should live in the net/http package. That seems to reduce this to

type MyLimitedReader struct {
    r   io.LimitedReader
    err error
}

func (mlr *MyLimitedReader) Read(p []byte) (int, error) {
    n, err := mlr.Read(p)
    if err == io.EOF {
        err = mlr.err
    }
    return n, er
}

Is that correct?

If so, perhaps this proposal should be rewritten to add an Err field io.LimitedReader.

ianlancetaylor avatar Feb 09 '22 21:02 ianlancetaylor

I like the new proposal. Returning io.EOF makes io.LimitedReader pretty unhelpful. But I think it would also be good to document that http.MaxBytesReader accepts nil ResponseWriter.

earthboundkid avatar Feb 10 '22 16:02 earthboundkid

I have forked io.LimitedReader to return a custom error at least three times.

FWIW, my moreio.LimitedWriter has an Err field too:

bcmills avatar Feb 11 '22 15:02 bcmills

Can someone tag this proposal as under review?

earthboundkid avatar Mar 26 '22 19:03 earthboundkid

As an alternative to defining your own error, I just created a package and export the error so the caller can check against it.

// Throws an error if the reader is bigger than limit.
var ErrSizeExceeded = errors.New("stream size exceeded")

type MaxBytesReader struct {
	io.ReadCloser       // reader object
	N             int64 // max bytes remaining.
}

func NewMaxBytesReader(r io.ReadCloser, limit int64) *MaxBytesReader {
	return &MaxBytesReader{r, limit}
}

func (b *MaxBytesReader) Read(p []byte) (n int, err error) {
	if b.N <= 0 {
		return 0, ErrSizeExceeded
	}

	if int64(len(p)) > b.N {
		p = p[0:b.N]
	}

	n, err = b.ReadCloser.Read(p)
	b.N -= int64(n)
	return
}

xeoncross avatar Mar 26 '22 19:03 xeoncross

~That would break current users of io.LimitReader/LimitedReader.~

I misunderstood the proposal.

earthboundkid avatar Mar 26 '22 23:03 earthboundkid

@carlmjohnson It is already so tagged. We'll get to it soon, I think.

ianlancetaylor avatar Mar 26 '22 23:03 ianlancetaylor

Change https://go.dev/cl/396215 mentions this issue: io: add an Err field to LimitedReader

gopherbot avatar Mar 28 '22 18:03 gopherbot

This seems reasonable. Thanks for the proposal.

rsc avatar Mar 30 '22 17:03 rsc

This proposal has been added to the active column of the proposals project and will now be reviewed at the weekly proposal review meetings. — rsc for the proposal review group

rsc avatar Mar 30 '22 18:03 rsc

Does anyone object to adding this?

rsc avatar Apr 06 '22 17:04 rsc

Based on the discussion above, this proposal seems like a likely accept. — rsc for the proposal review group

rsc avatar Apr 13 '22 18:04 rsc

No change in consensus, so accepted. 🎉 This issue now tracks the work of implementing the proposal. — rsc for the proposal review group

rsc avatar May 04 '22 18:05 rsc

@carlmjohnson @ianlancetaylor

A problem with the solution that landed (CL 538060) is that the non-EOF error gets triggered when the amount of data is exactly the same as the limit.

https://go.dev/play/p/-EBfn2nF5FT?v=gotip

If I want to impose a limit of 1KB, I think it's fine to be able to copy a 1KB payload without encountering a "limit exceeded" error, but that's not the case with the current code.

rogpeppe avatar May 12 '22 13:05 rogpeppe

Thanks. Working on it.

earthboundkid avatar May 12 '22 13:05 earthboundkid

Seemed like a natural job for fuzzing, but I couldn't figure out how to make fuzzing work with the standard library, so I did it in a separate module and just copied some interesting cases back.

earthboundkid avatar May 12 '22 14:05 earthboundkid

Change https://go.dev/cl/405854 mentions this issue: io: fix for LimitedReader sentinel when N == stream length

gopherbot avatar May 12 '22 14:05 gopherbot

@rogpeppe, I'm looking at your example again, but I don't really understand why it's a problem with LimitedReader, given that:

  1. io.Reader doesn't provide a general mechanism for indicating "at EOF”, and
  2. strings.Reader seems to prefer not to return io.EOF when it technically could (https://go.dev/play/p/6F5tnjDaQre?v=gotip).

I think if the caller really wants to get EOF instead of the LimtedReader.Err when either error is valid, they ought to be using a Reader that returns the EOF as soon as it is reached instead of deferring it for the next call. (Separately, perhaps we should file a proposal to change strings.Reader to do so, since its documentation promises only that it “implements the io.Reader interface” which allows either.)

bcmills avatar May 12 '22 15:05 bcmills

@bcmills the point I'm trying to make is that when the limit isn't exceeded, we shouldn't see an error from io.Copy at all.

In my eyes, it makes most sense for the error to be returned when the limit is exceeded, not when it's reached. Of course, with the old LimitReader, there is no distinction between the two cases when the number of bytes exactly equals the limit, but with the new error there is.

rogpeppe avatar May 12 '22 15:05 rogpeppe

Options:

  1. Drop the new sentinel behavior as unworkable.
  2. Leave things as is, document weirdness around N == Reader len.
  3. New behavior (in the CL) of giving the Reader one last chance to return EOF.

Any other ideas?

earthboundkid avatar May 12 '22 15:05 earthboundkid

The more I think about it, the more I think that the original goal isn't nicely attainable just by adding an Err field to the LimitedReader struct.

LimitReader has the property that it will read at most n bytes from the underlying reader. The originally stated goal of this proposal is to return a special error (requestTooLarge) when the limit is exceeded, implying AFAICS that when we're exactly at the limit, that error would not be returned. Unfortunately those two cases aren't possible to distinguish without reading an extra byte beyond the limit to find out whether the data limit really has been exceeded, which will break the above-stated property.

One possibility might be to document that an extra byte will be read when Err is non-nil, precisely to distinguish the two cases.

Another might be to add another entry point rather than modifying LimitedReader. For the record, here's the code for a LimitReader version I've been involved with in the past: https://go.dev/play/p/xqH4fdMPONa. We could add a LimitReaderWithError function, for example.

rogpeppe avatar May 12 '22 15:05 rogpeppe

@rogpeppe, when the limit isn't exceeded, we don't see an error from io.CopyN: https://go.dev/play/p/gqEn3aREOJE?v=gotip That is: if you don't try to read past the limit at all, then you don't get the “limit exceeded” error.

There is no general way to force a reader to tell you that it is at EOF other than trying to read a nonzero number of bytes. (A 0-length read could legitimately return 0, nil, or perhaps 0, io.ErrShortBuffer, even if the Reader is at EOF.) But trying to read a nonzero number of bytes beyond the limit is exactly what LimitedReader prevents.

bcmills avatar May 12 '22 15:05 bcmills

That is: if you don't try to read past the limit at all, then you don't get the “limit exceeded” error.

That's fine AFAICS. In that case, the limit hasn't been exceeded so (to use the original use case) returning a "request too large" error would not be appropriate.

rogpeppe avatar May 12 '22 15:05 rogpeppe

I think for that use-case, if you want to accept a request of up to N bytes then you'd want to set the LimitedReader's initial limit to N+1. Then you check the n returned from io.Copy against the actual limit.

(But I guess at that point it doesn't really matter what you set the Err field to, because all the information you need comes from the Copy call. 🤔)

bcmills avatar May 12 '22 16:05 bcmills

Then you check the n returned from io.Copy against the actual limit

But that means that you can exceed the limit when copying, which surely isn't great?

rogpeppe avatar May 12 '22 16:05 rogpeppe

FWIW, I find the naming around io.LimitReader vs io.LimitedReader confusing (it makes sense if I think about it, verb vs noun, but I have to think), so I would be happy with a new name. MaxBytesReader(r Reader, limit int64, err error) Reader?

earthboundkid avatar May 12 '22 16:05 earthboundkid

Then you check the n returned from io.Copy against the actual limit

But that means that you can exceed the limit when copying, which surely isn't great?

...and that is exactly why I have a LimitedWriter. 😉

bcmills avatar May 12 '22 16:05 bcmills

MaxBytesReader sounds exactly like the behaviour that LimitReader currently implements, I think (the whole point would be that it doesn't restrict the number of bytes read from the underlying reader to exactly limit).

LimitErrorReader ?

rogpeppe avatar May 12 '22 16:05 rogpeppe

I don't see how it makes sense to use io.Copy with io.LimitedReader with a specified error. io.Copy copies until io.EOF is returned. io.LimitedReader with a specified error does not return io.EOF.

ianlancetaylor avatar May 12 '22 22:05 ianlancetaylor