rubix
rubix copied to clipboard
Inconsistent state in cache - Not reading data from cache
Imagine file1 in the local disk already partially(or completely cached)
ls -lrt File1* -rw-rw-rw- 1 yarn yarn 13631488 Feb 8 20:03 File1 -rw-r--r-- 1 yarn yarn 12 Feb 8 20:03 File1_mdfile
2018-02-08T20:04:30.046Z DEBUG 20180208_200429_00310_rztrq.11.2-36-93 com.qubole.rubix.core.CachingInputStream FileSize of remotePath : File1 is : 94607314
2018-02-08T20:04:30.087Z DEBUG 20180208_200429_00310_rztrq.11.2-36-93 com.qubole.rubix.core.CachingInputStream FileSize of remotePath : s3a://blah-bhal-path/File1.snappy.parquet is : 94607314
2018-02-08T20:04:30.091Z DEBUG 20180208_200429_00310_rztrq.11.2-36-93 com.qubole.rubix.core.CachingInputStream Seek request, currentPos: 0 currentBlock: 0
2018-02-08T20:04:30.091Z DEBUG 20180208_200429_00310_rztrq.11.2-36-93 com.qubole.rubix.core.CachingInputStream Seek to 94607306, setting block location 90
2018-02-08T20:04:30.091Z DEBUG 20180208_200429_00310_rztrq.11.2-36-93 com.qubole.rubix.core.CachingInputStream Got Read, currentPos: 94607306 currentBlock: 90 bufferOffset: 0 length: 1048576
2018-02-08T20:04:30.091Z DEBUG 20180208_200429_00310_rztrq.11.2-36-93 com.qubole.rubix.core.CachingInputStream Iterating from startBlock : 90 to End block : 92
2018-02-08T20:04:30.091Z DEBUG 20180208_200429_00310_rztrq.11.2-36-93 com.qubole.rubix.core.CachingInputStream Sending cached block 90 to cachedReadRequestChain
2018-02-08T20:04:30.091Z DEBUG 20180208_200429_00310_rztrq.11.2-36-93 com.qubole.rubix.core.ReadRequestChain Request to add ReadRequest: [94371840, 94607314, 94607306, 94607314, 0]
2018-02-08T20:04:30.091Z DEBUG 20180208_200429_00310_rztrq.11.2-36-93 com.qubole.rubix.core.ReadRequestChain Added ReadRequest: [94371840, 94607314, 94607306, 94607314, 0]
2018-02-08T20:04:30.092Z DEBUG 20180208_200429_00310_rztrq.11.2-36-93 com.qubole.rubix.core.CachingInputStream Reached EOF, returning
2018-02-08T20:04:30.092Z DEBUG 20180208_200429_00310_rztrq.11.2-36-93 com.qubole.rubix.core.CachingInputStream Executing Chains
2018-02-08T20:04:30.092Z DEBUG 20180208_200429_00310_rztrq.11.2-36-93 com.qubole.rubix.core.CachedReadRequestChain Processing readrequest 94607306-94607314, length 8
2018-02-08T20:04:30.092Z DEBUG 20180208_200429_00310_rztrq.11.2-36-93 com.qubole.rubix.core.CachedReadRequestChain CachedFileRead copied data [94607306 - 94607306] at buffer offset 0
2018-02-08T20:04:30.092Z INFO 20180208_200429_00310_rztrq.11.2-36-93 com.qubole.rubix.core.CachedReadRequestChain Read 0 bytes from cached file
2018-02-08T20:04:30.092Z DEBUG 20180208_200429_00310_rztrq.11.2-36-93 com.qubole.rubix.core.CachingInputStream Read 0 bytes
2018-02-08T20:04:30.092Z DEBUG 20180208_200429_00310_rztrq.11.2-36-93 com.qubole.rubix.core.CachingInputStream Got Read, currentPos: 94607306 currentBlock: 90 bufferOffset: 0 length: 1048576
2018-02-08T20:04:30.092Z DEBUG 20180208_200429_00310_rztrq.11.2-36-93 com.qubole.rubix.core.CachingInputStream Iterating from startBlock : 90 to End block : 92
2018-02-08T20:04:30.093Z DEBUG 20180208_200429_00310_rztrq.11.2-36-93 com.qubole.rubix.core.CachingInputStream Sending cached block 90 to cachedReadRequestChain
2018-02-08T20:04:30.093Z DEBUG 20180208_200429_00310_rztrq.11.2-36-93 com.qubole.rubix.core.ReadRequestChain Request to add ReadRequest: [94371840, 94607314, 94607306, 94607314, 0]
2018-02-08T20:04:30.093Z DEBUG 20180208_200429_00310_rztrq.11.2-36-93 com.qubole.rubix.core.ReadRequestChain Added ReadRequest: [94371840, 94607314, 94607306, 94607314, 0]
2018-02-08T20:04:30.093Z DEBUG 20180208_200429_00310_rztrq.11.2-36-93 com.qubole.rubix.core.CachingInputStream Reached EOF, returning
2018-02-08T20:04:30.093Z DEBUG 20180208_200429_00310_rztrq.11.2-36-93 com.qubole.rubix.core.CachingInputStream Executing Chains
2018-02-08T20:04:30.093Z DEBUG 20180208_200429_00310_rztrq.11.2-36-93 com.qubole.rubix.core.CachedReadRequestChain Processing readrequest 94607306-94607314, length 8
2018-02-08T20:04:30.093Z DEBUG 20180208_200429_00310_rztrq.11.2-36-93 com.qubole.rubix.core.CachedReadRequestChain CachedFileRead copied data [94607306 - 94607306] at buffer offset 0
2018-02-08T20:04:30.093Z INFO 20180208_200429_00310_rztrq.11.2-36-93 com.qubole.rubix.core.CachedReadRequestChain Read 0 bytes from cached file
2018-02-08T20:04:30.094Z DEBUG 20180208_200429_00310_rztrq.11.2-36-93 com.qubole.rubix.core.CachingInputStream Read 0 bytes
2018-02-08T20:04:30.094Z DEBUG 20180208_200429_00310_rztrq.11.2-36-93 com.qubole.rubix.core.CachingInputStream Got Read, currentPos: 94607306 currentBlock: 90 bufferOffset: 0 length: 1048576
2018-02-08T20:04:30.094Z DEBUG 20180208_200429_00310_rztrq.11.2-36-93 com.qubole.rubix.core.CachingInputStream Iterating from startBlock : 90 to End block : 92
2018-02-08T20:04:30.094Z DEBUG 20180208_200429_00310_rztrq.11.2-36-93 com.qubole.rubix.core.CachingInputStream Sending cached block 90 to cachedReadRequestChain
2018-02-08T20:04:30.094Z DEBUG 20180208_200429_00310_rztrq.11.2-36-93 com.qubole.rubix.core.ReadRequestChain Request to add ReadRequest: [94371840, 94607314, 94607306, 94607314, 0]
2018-02-08T20:04:30.094Z DEBUG 20180208_200429_00310_rztrq.11.2-36-93 com.qubole.rubix.core.ReadRequestChain Added ReadRequest: [94371840, 94607314, 94607306, 94607314, 0]
2018-02-08T20:04:30.094Z DEBUG 20180208_200429_00310_rztrq.11.2-36-93 com.qubole.rubix.core.CachingInputStream Reached EOF, returning
2018-02-08T20:04:30.094Z DEBUG 20180208_200429_00310_rztrq.11.2-36-93 com.qubole.rubix.core.CachingInputStream Executing Chains
2018-02-08T20:04:30.095Z DEBUG 20180208_200429_00310_rztrq.11.2-36-93 com.qubole.rubix.core.CachedReadRequestChain Processing readrequest 94607306-94607314, length 8
2018-02-08T20:04:30.095Z DEBUG 20180208_200429_00310_rztrq.11.2-36-93 com.qubole.rubix.core.CachedReadRequestChain CachedFileRead copied data [94607306 - 94607306] at buffer offset 0
2018-02-08T20:04:30.095Z INFO 20180208_200429_00310_rztrq.11.2-36-93 com.qubole.rubix.core.CachedReadRequestChain Read 0 bytes from cached file
2018-02-08T20:04:30.095Z DEBUG 20180208_200429_00310_rztrq.11.2-36-93 com.qubole.rubix.core.CachingInputStream Read 0 bytes
2018-02-08T20:04:30.095Z DEBUG 20180208_200429_00310_rztrq.11.2-36-93 com.qubole.rubix.core.CachingInputStream Got Read, currentPos: 94607306 currentBlock: 90 bufferOffset: 0 length: 1048576
2018-02-08T20:04:30.095Z DEBUG 20180208_200429_00310_rztrq.11.2-36-93 com.qubole.rubix.core.CachingInputStream Iterating from startBlock : 90 to End block : 92
2018-02-08T20:04:30.095Z DEBUG 20180208_200429_00310_rztrq.11.2-36-93 com.qubole.rubix.core.CachingInputStream Sending cached block 90 to cachedReadRequestChain
2018-02-08T20:04:30.095Z DEBUG 20180208_200429_00310_rztrq.11.2-36-93 com.qubole.rubix.core.ReadRequestChain Request to add ReadRequest: [94371840, 94607314, 94607306, 94607314, 0]
2018-02-08T20:04:30.096Z DEBUG 20180208_200429_00310_rztrq.11.2-36-93 com.qubole.rubix.core.ReadRequestChain Added ReadRequest: [94371840, 94607314, 94607306, 94607314, 0]
2018-02-08T20:04:30.096Z DEBUG 20180208_200429_00310_rztrq.11.2-36-93 com.qubole.rubix.core.CachingInputStream Reached EOF, returning
2018-02-08T20:04:30.096Z DEBUG 20180208_200429_00310_rztrq.11.2-36-93 com.qubole.rubix.core.CachingInputStream Executing Chains
2018-02-08T20:04:30.096Z DEBUG 20180208_200429_00310_rztrq.11.2-36-93 com.qubole.rubix.core.CachedReadRequestChain Processing readrequest 94607306-94607314, length 8
2018-02-08T20:04:30.096Z DEBUG 20180208_200429_00310_rztrq.11.2-36-93 com.qubole.rubix.core.CachedReadRequestChain CachedFileRead copied data [94607306 - 94607306] at buffer offset 0
2018-02-08T20:04:30.096Z INFO 20180208_200429_00310_rztrq.11.2-36-93 com.qubole.rubix.core.CachedReadRequestChain Read 0 bytes from cached file
2018-02-08T20:04:30.096Z DEBUG 20180208_200429_00310_rztrq.11.2-36-93 com.qubole.rubix.core.CachingInputStream Read 0 bytes
As we can bookkeeper assumes the data is cached (from 94607306 to 94607314) tries to read from the cache but there is nothing in the file. So it will fill nothing in the input buffer.
In the bookeeper log we are getting this:
18/02/08 22:23:44,854 WARN pool-3-thread-10537 bookkeeper.BookKeeper: Could not cache data: org.apache.thrift.shaded.TException: java.io.IOException: No such file or directory
at com.qubole.rubix.bookkeeper.BookKeeper.getCacheStatus(BookKeeper.java:154)
at com.qubole.rubix.bookkeeper.BookKeeper.readData(BookKeeper.java:275)
at com.qubole.rubix.spi.BookKeeperService$Processor$readData.getResult(BookKeeperService.java:456)
at com.qubole.rubix.spi.BookKeeperService$Processor$readData.getResult(BookKeeperService.java:441)
at org.apache.thrift.shaded.ProcessFunction.process(ProcessFunction.java:39)
at org.apache.thrift.shaded.TBaseProcessor.process(TBaseProcessor.java:39)
at org.apache.thrift.shaded.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: No such file or directory
at java.io.UnixFileSystem.createFileExclusively(Native Method)
at java.io.File.createNewFile(File.java:1006)
at com.qubole.rubix.bookkeeper.FileMetadata.refreshBitmap(FileMetadata.java:93)
at com.qubole.rubix.bookkeeper.FileMetadata.isBlockCached(FileMetadata.java:112)
at com.qubole.rubix.bookkeeper.BookKeeper.getCacheStatus(BookKeeper.java:138)
... 9 more
Isnt the problem more about finding why we wrote nothing but still set the cached status to true for these blocks? Do we have logs from RemoteReadRequestChain caching this data and updating cache status?