kafka-connect-hdfs
HDFS writer can't be restarted correctly after a datanode is stopped
Sometimes the HDFS writer can't be restarted correctly after trouble in the data lake (for example, after a datanode is stopped). The writer's WAL log files in HDFS may be corrupted or left open-for-write, and the exception below may appear in the writer pool threads, which then get stuck in a blocked state.
2019/02/05 10:19:53 ERROR 758286 [pool-6-thread-1] hdfs.wal.FSWAL.apply(131): Error applying WAL file: [hdfs://namenodeHA/apps/hdfs-writer/.../logs/.../6/log,] java.io.EOFException
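If the WAL file was left open-for-write, one way to clear it manually is HDFS lease recovery (the same mechanism the `hdfs debug recoverLease` command uses). Below is a minimal, untested sketch against the standard Hadoop client API; the class name and the WAL path are placeholders, so substitute the path from your error message:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class RecoverWalLease {
    public static void main(String[] args) throws Exception {
        // Placeholder path: use the WAL file reported in the FSWAL error.
        Path wal = new Path("hdfs://namenodeHA/apps/hdfs-writer/logs/topic/6/log");
        FileSystem fs = FileSystem.get(wal.toUri(), new Configuration());
        if (fs instanceof DistributedFileSystem) {
            // Ask the namenode to revoke the stale lease and close the file
            // that was left open-for-write when the datanode went down.
            boolean closed = ((DistributedFileSystem) fs).recoverLease(wal);
            System.out.println("lease recovered immediately: " + closed);
        }
    }
}
```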
What is your replication factor? If you have a replication factor of just 1 and you stop a datanode, I would expect data loss and strangeness to occur. HDFS breaks each file into a sequence of blocks and then replicates the blocks to different datanodes for fault tolerance. The "data replication" section here has a really good explanation of how this works.
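To check what you actually have, here's a short sketch using the Hadoop client API (the file path is a placeholder; `hdfs dfs -stat %r <path>` gives the same answer from the shell):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CheckReplication {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        // Placeholder path: point this at one of the writer's files.
        FileStatus status = fs.getFileStatus(new Path("/apps/hdfs-writer/logs/topic/6/log"));
        System.out.println("file replication = " + status.getReplication());
        // Cluster default, taken from dfs.replication in hdfs-site.xml (3 if unset).
        System.out.println("dfs.replication = " + conf.get("dfs.replication", "3"));
    }
}
```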
A stopped datanode doesn't by itself prevent writes.
All writes first communicate with the namenode. If you had more than one replica, then the namenode would pick another datanode in the cluster to place the block on.
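For example, here's a hedged sketch (the test path and sizes are made up) of requesting three replicas explicitly at create time; the namenode then builds a three-datanode write pipeline, so losing one datanode still leaves two live copies of each block:

```java
import java.nio.charset.StandardCharsets;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class WriteWithReplication {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        Path p = new Path("/tmp/replication-demo"); // made-up test path
        // create(path, overwrite, bufferSize, replication, blockSize):
        // ask for 3 replicas so the namenode picks a 3-datanode pipeline.
        try (FSDataOutputStream out =
                fs.create(p, true, 4096, (short) 3, 128L * 1024 * 1024)) {
            out.write("replicated write".getBytes(StandardCharsets.UTF_8));
        }
        System.out.println("replication = " + fs.getFileStatus(p).getReplication());
    }
}
```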