streams
streams copied to clipboard
Add ReadableStreamBYOBReader.read(view, { min })
Implements #1143 and #1175.
In my first attempt, I did most of the heavy lifting in a ReadableStreamBYOBReaderReadFully() helper that calls read(view). If the first read didn't fill the entire view yet, it immediately calls read(view) again but with a special "addToFront" flag so that the new read-into request goes to the front of the queue rather than the back. This works fine, but it's not very pretty. It also means we're possibly calling read(view) re-entrantly from within ReadIntoRequest.chunkSteps(), which is somewhat pushing the boundaries of what you can reasonably expect from the spec. π¬ (I've kept this attempt in the commit history, so you can judge for yourselves.)
I then realized that we already have logic that keeps a read-into request in the queue after a respond(). When the user performs a BYOB read with a multi-byte typed array, the stream only fulfills the request when at least elementSize have been written into the pull-into descriptor. If not, the request stays pending and the pull-into descriptor remains at the head of the queue. Therefore, in my second attempt, I add a special type of pull-into descriptor which must stay in the queue as long as bytesFilled < byteLength, rather than the usual bytesFilled < elementSize. This also works very well, but requires changes across multiple parts of the code.
In #1175, we realized that it would be more powerful to allow reading at least a given number of bytes, rather than reading an exact number of bytes. Thus, in my third attempt, every pull-into descriptor now has a "minimum fill" slot that tracks how many bytes must be filled before this descriptor can fulfill its associated read(view) request. As a result, the bytesFilled < elementSize and bytesFilled < byteLength conditions from the previous solution are now replaced with a single bytesFilled < minimumFill condition. This greatly simplifies the logic.
To do:
- [ ] Decide upon the final API
- Current proposal:
byobReader.read(view, { min })(was:{ atLeast }) - Do we still want
byobReader.fill(view)too? This would basically be a shorthand forread()wheremin = view.length
- Current proposal:
- [x] Write (many more) tests to prove that this actually handles all possible edge cases
- [x] Write the specification text once we've decided on the approach
- [ ] At least two implementers are interested (and none opposed):
- [x] Tests are written and can be reviewed and commented upon at:
- web-platform-tests/wpt#29723
- [ ] Implementation bugs are filed:
- Chrome: β¦
- Firefox: β¦
- Safari: β¦
(See WHATWG Working Mode: Changes for more details.)
I looked at a few other languages to see what their API looks like:
- Java
java.io.DataInput.readFully- If end of file is detected before filling up to
b.length, then this throws anEOFException. In this case, it may be that some but not all bytes ofbhave been updated with data from the input stream.
- If end of file is detected before filling up to
- Rust
std::io::Read::read_exact- If this function encounters an βend of fileβ before completely filling the buffer, it returns an error of the kind
ErrorKind::UnexpectedEof. The contents ofbufare unspecified in this case.
- If this function encounters an βend of fileβ before completely filling the buffer, it returns an error of the kind
- .NET
System.IO.Stream.ReadAsync- Behaves like
ReadableStreamBYOBReader.read(view), in that it doesn't necessarily fill the entire buffer. I couldn't find a direct equivalent that does fill up the entire buffer. EvenBinaryReader.ReadBytes()may read less than the requested number of bytes. BinaryReader.ReadUint32()throws anEndOfStreamExceptionif it detects EOF before reading 4 bytes.
- Behaves like
It looks like most other languages throw an error if they encounter EOF before completely filling the buffer. That said, the Streams standard already takes a different approach for "regular" reads too: instead of returning the number of bytes read into the given buffer, we transfer the given buffer and return a { done, value } pair with a view on that buffer. So it seems appropriate to align closer with our existing read(view) and use the done flag to indicate if we encountered EOF earlier. We can also guarantee that the returned view contains all data up to EOF, rather than leaving the buffer contents unspecified.
If we want to bikeshed a bit on the name, here are some ideas:
readFully(view)readExact(view)
I've been thinking about whether we want to expose pullIntoDescriptor.[[readFully]] to the underlying byte source, e.g. something like ReadableStreamBYOBRequest.isReadFullyRequest.
- Pro: the underlying byte source might be able to optimize for large reads, instead of receiving repeated
pull()calls with continuations of the same BYOB request. For example, forReadableByteStreamTee, we could forward areadFully()request from the branch all the way to the original stream. - Con: we might end up with too much buffering. Again using the
ReadableByteStreamTeeexample: if the first branch doesreadFully(new Uint8Array(1024))and then the second branch doesread(new Uint8Array(1)), the second branch wouldn't receive any data until the underlying source of the original stream has filled up the entire view of the forwardedreadFully()request from the first branch. (This case probably deserves its own WPT test... π€)
So yeah, it might be an optimization opportunity, but it might also be a footgun. At least for now, I'll leave ReadableStreamBYOBRequest untouched. π
Sure, fill(view) also works. Luckily readFully() was a very distinct name, so I could easily find/replace it. π
If we still don't like it, it's pretty easy to change/revert that last commit. Bikeshed away!
I take it that means Chrome is interested in implementing this? (Gotta tick those checkboxes! π)
Yes, you can tick the checkbox for Chrome.
Updated to use read(view, { atLeast }) instead of fill(view), as suggested in #1175. This is a more general solution: users can now choose to only fulfill when the view is filled with at least 1 byte, or 2 bytes, or view.length / 2 bytes, or whatever. With fill(view), it's all or nothing.
We can still introduce fill(view) as a shorthand for read(view, { atLeast: view.length }) if we want to. Happy to discuss. π
Currently, the tests only cover the atLeast == view.length case (i.e. the fill() use case). I still need to write tests for the more general cases, i.e. where 0 < atLeast <= view.length.
Thanks! I am happy not to have fill() since it is not too hard for people to use the more general API.
Deno is interested in implementing this.
A little bikeshedding: how do we feel about the name "atLeast" for the new option? I'm thinking "min" might be better? (Still avoiding "minBytes", because of multi-byte typed arrays.)
const { done, value } = byobReader.read(view, { min: 8 });
// value is minimum 8 elements long
Yes, min is good too. It's a bit more elegant. I don't know whether it's easier to remember or not. minLength is another option if we want to be clear what it's the min of, but that's probably not necessary.
Ready for review. I've added a couple of tests where min < view.length, and code coverage is looking good. π
Do we need another implementer's interest in this? Both Deno and Node.js have already expressed interest, but I believe our current process requires two browser implementers to be onboard.
Yeah. Any thoughts from @saschanaz @youennf?
Gently pinging those involved in this PR.
WebKit supports this addition.
TBH rebasing doesn't help as one cannot use "Show changes since your last review" π₯²
Click on force-pushed. Unfortunately sometimes you need to rebase.
Click on force-pushed.
~~I'm not sure I follow, you mean there's something I can do?~~
Oh okay, force-pushed is a button. Very not noticeable. Anyway I'd prefer merge rather than rebase if it's being reviewed.
What are the blockers for this PR?
@crowlKats I still want to do some refactoring to make byobReader.read(view) desugar to byobReader.read(view, { min: 1 }), see comment. But I didn't find the time for that yet...
I've got some time now, so I'll try to get it done. π
@domenic I think this is ready. Let me know when I can start merging things and filing implementation bugs.
@domenic Apologies for the delay. I've fixed the mod vs. remainder issue, PTAL. π
Looks great! I merged https://github.com/web-platform-tests/wpt/pull/29723, so do the test roll thing, merge, and file implementation bugs when you have a chance!
Oh, and while merging, please change the title to avoid my pet peeve of using Static.method() notation for instance methods :)
Woo! super happy to see this land. Thanks for the work all.