PARQUET-1776: NIO wrapper for Output/Input File
Make sure you have checked all steps below.
Jira
- [x] My PR addresses the following Parquet Jira issues
Tests
- [ ] My PR adds the following unit tests OR does not need testing for this extremely good reason:
Commits
- [ ] My commits all reference Jira issues in their subject lines. In addition, my commits follow the guidelines from "How to write a good git commit message":
- Subject is separated from body by a blank line
- Subject is limited to 50 characters (not including Jira issue reference)
- Subject does not end with a period
- Subject uses the imperative mood ("add", not "adding")
- Body wraps at 72 characters
- Body explains "what" and "why", not "how"
Documentation
- [ ] In case of new functionality, my PR adds documentation that describes how to use it.
- All the public functions and the classes in the PR contain Javadoc that explain what it does
@HunterL
OK, so I've been recently diving into Java NIO a bit and let me add some thoughts that relate to this task:
Please change the naming convention and drop the "Local" prefix. I think Nio is probably best. As an example of why this matters:
// This exists in the current PR
return new LocalSeekableInputStream(Files.newInputStream(path));
Well that statement Files.newInputStream(path) will provide a new InputStream, but the InputStream is totally driven by whatever the path points at. Things within the JVM could be setup to allow the provided path to point at an FTP location for example and that would be a remote read and not a local read.
I also think you will have better luck storing, internal to the Input/Output streams a SeekableByteChannel instead of an OutputStream or an InputStream. This will allow you to easily track the position aspect of the underlying stream without having to do it yourself. For example, the current OutputStream implementation does not currently implement this feature:
@Override
public long getPos() throws IOException {
return 0;
}
Thanks.
Also requires unit tests.
Here are a couple of starters:
private final ByteBuffer oneByteBuffer = ByteBuffer.allocate(1);
@Override
public int read() throws IOException {
int read = this.channel.read(this.oneByteBuffer.rewind());
if (read < 0) {
return read;
}
return (0xFF & this.oneByteBuffer.flip().get());
}
@Override
public void readFully(byte[] bytes) throws IOException {
ByteBuffer bb = ByteBuffer.allocate(bytes.length);
while (bb.hasRemaining()) {
int read = this.channel.read(bb);
if (read == -1) {
throw new EOFException();
}
}
bb.flip().get(bytes);
}