kafka-connect-hdfs
HDFS writer can't be restarted correctly after a datanode is stopped
Sometimes the HDFS writer can't be restarted correctly after trouble in the data lake (for example, after a datanode is stopped). The writer's WAL log files in HDFS may be corrupted or left open-for-write, and the exception below may appear in the writer pool threads, which then get stuck in a blocked state.
2019/02/05 10:19:53 ERROR 758286 [pool-6-thread-1] hdfs.wal.FSWAL.apply(131): Error applying WAL file: [hdfs://namenodeHA/apps/hdfs-writer/.../logs/.../6/log,] java.io.EOFException
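If the WAL file was left open-for-write, one way to clear it manually is HDFS lease recovery (the same mechanism the `hdfs debug recoverLease` command uses). Below is a minimal, untested sketch against the standard Hadoop client API; the class name and the WAL path are placeholders, so substitute the path from your error message:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class RecoverWalLease {
    public static void main(String[] args) throws Exception {
        // Placeholder path: use the WAL file reported in the FSWAL error.
        Path wal = new Path("hdfs://namenodeHA/apps/hdfs-writer/logs/topic/6/log");
        FileSystem fs = FileSystem.get(wal.toUri(), new Configuration());
        if (fs instanceof DistributedFileSystem) {
            // Ask the namenode to revoke the stale lease and close the file
            // that was left open-for-write when the datanode went down.
            boolean closed = ((DistributedFileSystem) fs).recoverLease(wal);
            System.out.println("lease recovered immediately: " + closed);
        }
    }
}
```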
What is your replication factor? If you have a replication factor of just 1 and you stop a datanode, I would expect data loss and strangeness to occur. HDFS breaks each file into a sequence of blocks and then replicates the blocks to different datanodes for fault tolerance. The "data replication" section here has a really good explanation of how this works.
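To check what you actually have, here's a short sketch using the Hadoop client API (the file path is a placeholder; `hdfs dfs -stat %r <path>` gives the same answer from the shell):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CheckReplication {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        // Placeholder path: point this at one of the writer's files.
        FileStatus status = fs.getFileStatus(new Path("/apps/hdfs-writer/logs/topic/6/log"));
        System.out.println("file replication = " + status.getReplication());
        // Cluster default, taken from dfs.replication in hdfs-site.xml (3 if unset).
        System.out.println("dfs.replication = " + conf.get("dfs.replication", "3"));
    }
}
```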
A stopped datanode doesn't by itself prevent writes.
All writes first communicate with the namenode. If you had more than one replica, then the namenode would pick another datanode in the cluster to place the block on.
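For example, here's a hedged sketch (the test path and sizes are made up) of requesting three replicas explicitly at create time; the namenode then builds a three-datanode write pipeline, so losing one datanode still leaves two live copies of each block:

```java
import java.nio.charset.StandardCharsets;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class WriteWithReplication {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        Path p = new Path("/tmp/replication-demo"); // made-up test path
        // create(path, overwrite, bufferSize, replication, blockSize):
        // ask for 3 replicas so the namenode picks a 3-datanode pipeline.
        try (FSDataOutputStream out =
                fs.create(p, true, 4096, (short) 3, 128L * 1024 * 1024)) {
            out.write("replicated write".getBytes(StandardCharsets.UTF_8));
        }
        System.out.println("replication = " + fs.getFileStatus(p).getReplication());
    }
}
```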