glusterfs-hadoop
glusterfs-hadoop copied to clipboard
Set the default buffer size (if not specified) to 128k
If no configuration value is specified for io.file.buffer.size, set it to 128k.
preapproved +1 , pending addition of a Log.info("Setting buffer size to ...........") statement. otherwise this is really hard to debug at scale in case any issues.
I'd like to see unit tests attached to each pull request. Perhaps, take Jay's pull 55 and merge it with this?
good point steve :) Retracting my +1 until you merge my unit test :) :) https://github.com/gluster/glusterfs-hadoop/pull/65 (or just copy the code - i dont care about the pull request, just copy the code into this commit )
@jayunit100 @wattsteve updated this pull with your requests.
+1
nice one brad +1. real quick before you merge: can you confirm that the unit test passes now ? I've confirmed that it fails on the old shim which uses a 4 or 8K default.
@jayunit100 : that unit test isn't passing. can you fix + submit a change?
hmmmmmmm okay will patch it up . you want me to commit to your childsb/branch ?
hola. The test is failing for all the right reasons: The default buffering is still 4096 bytes. :)... You need to update your "if ..." statement to override the default of 4096, not just to override in case of null/empty. Hadoop will provide a very low 4096 byte default, and that will (Rightly) cause this test to fail.
Well, nevermind. I guess i will patch the test so that it confirms that the io.file.buffer.size parameter is honored and let you folks work out the details of wether or not we should even ALLOW the 4K byte write buffer, or just override it entirely.
//this test will pass, edit (bufferSize) to be larger, once we decide what gluster should allow as a minium.
@Test
public void testBufferSpill() throws Exception {
Path out = new Path("a");
final Integer buffersize = fs.getConf().getInt("io.file.buffer.size", -1);
FSDataOutputStream os = fs.create(out);
int written=0;
/**
* Assert that writes smaller than 10KB are NOT spilled to disk
*/
while(written<buffersize){
os.write("ASDF".getBytes());
written+="ASDF".getBytes().length;
//now, we expect
Assert.assertTrue("asserting that file not written yet...",fs.getLength(out)==0);
}
os.flush();
Assert.assertTrue("asserting that is now written... ",fs.getLength(out)>=buffersize);
os.close();
fs.delete(out);
}