kafka-connect-hdfs
RAM_DISK / LAZY_PERSIST used in Kafka Connect for committing files to HDFS is not officially supported by HDFS
Hi team, we raised an issue related to RAM_DISK / LAZY_PERSIST: even after a successful commit, the file was not present in the HDFS cluster after the rename. We checked the NameNode logs for that file. They showed no error and the file was committed, yet it was missing from the destination folder.
In the response to that issue from HDFS, we learned that neither Cloudera nor Hortonworks officially supports this feature. Can you comment on this?
Below are the complete logs, from creating the temp file, through closing it and persisting it to the filesystem, to finally renaming it.
19/10/31 16:54:09 DEBUG hdfs.StateChange: DIR NameNode.create: file /topics/+tmp/datapipelinefinaltest14.5da59e664cedfd00090d3757.dataPipeLineEvent_17.topic/tenant=5da59e664cedfd00090d3757/groupid=5da59e664cedfd00090d3758/project=5da59e664cedfd00090d3759/name=dataPipeLineEvent_17/year=2019/month=10/day=16/hour=17/351bffa9-15e3-427b-9e02-c9e8823d68d6_tmp.parquet for DFSClient_NONMAPREDUCE_2059097589_230 at ...
19/10/31 16:54:09 DEBUG hdfs.StateChange: DIR* NameSystem.startFile: src=/topics/+tmp/datapipelinefinaltest14.5da59e664cedfd00090d3757.dataPipeLineEvent_17.topic/tenant=5da59e664cedfd00090d3757/groupid=5da59e664cedfd00090d3758/project=5da59e664cedfd00090d3759/name=dataPipeLineEvent_17/year=2019/month=10/day=16/hour=17/351bffa9-15e3-427b-9e02-c9e8823d68d6_tmp.parquet, holder=DFSClient_NONMAPREDUCE_2059097589_230, clientMachine=..., createParent=true, replication=1, createFlag=[CREATE, OVERWRITE], blockSize=268435456, supportedVersions=[CryptoProtocolVersion{description='Encryption zones', version=2, unknownValue=null}]
19/10/31 16:54:09 DEBUG hdfs.StateChange: DIR* addFile: 351bffa9-15e3-427b-9e02-c9e8823d68d6_tmp.parquet is added
19/10/31 16:54:09 DEBUG hdfs.StateChange: DIR* NameSystem.startFile: added /topics/+tmp/datapipelinefinaltest14.5da59e664cedfd00090d3757.dataPipeLineEvent_17.topic/tenant=5da59e664cedfd00090d3757/groupid=5da59e664cedfd00090d3758/project=5da59e664cedfd00090d3759/name=dataPipeLineEvent_17/year=2019/month=10/day=16/hour=17/351bffa9-15e3-427b-9e02-c9e8823d68d6_tmp.parquet inode 159332 holder DFSClient_NONMAPREDUCE_2059097589_230
19/10/31 16:54:09 DEBUG top.TopAuditLogger: ------------------- logged event for top service: allowed=true ugi=root (auth:SIMPLE) ip=/... cmd=create src=/topics/+tmp/datapipelinefinaltest14.5da59e664cedfd00090d3757.dataPipeLineEvent_17.topic/tenant=5da59e664cedfd00090d3757/groupid=5da59e664cedfd00090d3758/project=5da59e664cedfd00090d3759/name=dataPipeLineEvent_17/year=2019/month=10/day=16/hour=17/351bffa9-15e3-427b-9e02-c9e8823d68d6_tmp.parquet dst=null perm=root:supergroup:rw-r-r-
19/10/31 16:54:09 DEBUG hdfs.StateChange: BLOCK NameNode.addBlock: file /topics/+tmp/datapipelinefinaltest14.5da59e664cedfd00090d3757.dataPipeLineEvent_17.topic/tenant=5da59e664cedfd00090d3757/groupid=5da59e664cedfd00090d3758/project=5da59e664cedfd00090d3759/name=dataPipeLineEvent_17/year=2019/month=10/day=16/hour=17/351bffa9-15e3-427b-9e02-c9e8823d68d6_tmp.parquet fileId=159332 for DFSClient_NONMAPREDUCE_2059097589_230
19/10/31 16:54:09 DEBUG hdfs.StateChange: BLOCK* getAdditionalBlock: /topics/+tmp/datapipelinefinaltest14.5da59e664cedfd00090d3757.dataPipeLineEvent_17.topic/tenant=5da59e664cedfd00090d3757/groupid=5da59e664cedfd00090d3758/project=5da59e664cedfd00090d3759/name=dataPipeLineEvent_17/year=2019/month=10/day=16/hour=17/351bffa9-15e3-427b-9e02-c9e8823d68d6_tmp.parquet inodeId 159332 for DFSClient_NONMAPREDUCE_2059097589_230
19/10/31 16:54:09 DEBUG hdfs.StateChange: DIR* FSDirectory.addBlock: /topics/+tmp/datapipelinefinaltest14.5da59e664cedfd00090d3757.dataPipeLineEvent_17.topic/tenant=5da59e664cedfd00090d3757/groupid=5da59e664cedfd00090d3758/project=5da59e664cedfd00090d3759/name=dataPipeLineEvent_17/year=2019/month=10/day=16/hour=17/351bffa9-15e3-427b-9e02-c9e8823d68d6_tmp.parquet with blk_1073840307_125096 block is added to the in-memory file system
19/10/31 16:54:09 INFO hdfs.StateChange: BLOCK* allocate blk_1073840307_125096{UCState=UNDER_CONSTRUCTION, truncateBlock=null, primaryNodeIndex=-1, replicas=[ReplicaUC[[DISK]DS-680efedf-fb3a-4f64-bd50-de89a55b377f:NORMAL:...:50010|RBW]]} for /topics/+tmp/datapipelinefinaltest14.5da59e664cedfd00090d3757.dataPipeLineEvent_17.topic/tenant=5da59e664cedfd00090d3757/groupid=5da59e664cedfd00090d3758/project=5da59e664cedfd00090d3759/name=dataPipeLineEvent_17/year=2019/month=10/day=16/hour=17/351bffa9-15e3-427b-9e02-c9e8823d68d6_tmp.parquet
19/10/31 16:54:09 DEBUG hdfs.StateChange: persistNewBlock: /topics/+tmp/datapipelinefinaltest14.5da59e664cedfd00090d3757.dataPipeLineEvent_17.topic/tenant=5da59e664cedfd00090d3757/groupid=5da59e664cedfd00090d3758/project=5da59e664cedfd00090d3759/name=dataPipeLineEvent_17/year=2019/month=10/day=16/hour=17/351bffa9-15e3-427b-9e02-c9e8823d68d6_tmp.parquet with new block blk_1073840307_125096{UCState=UNDER_CONSTRUCTION, truncateBlock=null, primaryNodeIndex=-1, replicas=[ReplicaUC[[DISK]DS-680efedf-fb3a-4f64-bd50-de89a55b377f:NORMAL:...:50010|RBW]]}, current total block count is 1
19/10/31 16:54:09 DEBUG hdfs.StateChange: DIR NameNode.complete: /topics/+tmp/datapipelinefinaltest14.5da59e664cedfd00090d3757.dataPipeLineEvent_17.topic/tenant=5da59e664cedfd00090d3757/groupid=5da59e664cedfd00090d3758/project=5da59e664cedfd00090d3759/name=dataPipeLineEvent_17/year=2019/month=10/day=16/hour=17/351bffa9-15e3-427b-9e02-c9e8823d68d6_tmp.parquet fileId=159332 for DFSClient_NONMAPREDUCE_2059097589_230
19/10/31 16:54:09 DEBUG hdfs.StateChange: DIR* NameSystem.completeFile: /topics/+tmp/datapipelinefinaltest14.5da59e664cedfd00090d3757.dataPipeLineEvent_17.topic/tenant=5da59e664cedfd00090d3757/groupid=5da59e664cedfd00090d3758/project=5da59e664cedfd00090d3759/name=dataPipeLineEvent_17/year=2019/month=10/day=16/hour=17/351bffa9-15e3-427b-9e02-c9e8823d68d6_tmp.parquet for DFSClient_NONMAPREDUCE_2059097589_230
DEBUG hdfs.StateChange: closeFile: /topics/+tmp/datapipelinefinaltest14.5da59e664cedfd00090d3757.dataPipeLineEvent_17.topic/tenant=5da59e664cedfd00090d3757/groupid=5da59e664cedfd00090d3758/project=5da59e664cedfd00090d3759/name=dataPipeLineEvent_17/year=2019/month=10/day=16/hour=17/351bffa9-15e3-427b-9e02-c9e8823d68d6_tmp.parquet with 1 blocks is persisted to the file system
19/10/31 16:54:09 INFO hdfs.StateChange: DIR* completeFile: /topics/+tmp/datapipelinefinaltest14.5da59e664cedfd00090d3757.dataPipeLineEvent_17.topic/tenant=5da59e664cedfd00090d3757/groupid=5da59e664cedfd00090d3758/project=5da59e664cedfd00090d3759/name=dataPipeLineEvent_17/year=2019/month=10/day=16/hour=17/351bffa9-15e3-427b-9e02-c9e8823d68d6_tmp.parquet is closed by DFSClient_NONMAPREDUCE_2059097589_230
19/10/31 16:54:09 DEBUG top.TopAuditLogger: ------------------- logged event for top service: allowed=true ugi=root (auth:SIMPLE) ip=/... cmd=getfileinfo src=/topics/+tmp/datapipelinefinaltest14.5da59e664cedfd00090d3757.dataPipeLineEvent_17.topic/tenant=5da59e664cedfd00090d3757/groupid=5da59e664cedfd00090d3758/project=5da59e664cedfd00090d3759/name=dataPipeLineEvent_17/year=2019/month=10/day=16/hour=17/351bffa9-15e3-427b-9e02-c9e8823d68d6_tmp.parquet dst=null perm=null
19/10/31 16:54:09 DEBUG hdfs.StateChange: DIR NameNode.rename: /topics/+tmp/datapipelinefinaltest14.5da59e664cedfd00090d3757.dataPipeLineEvent_17.topic/tenant=5da59e664cedfd00090d3757/groupid=5da59e664cedfd00090d3758/project=5da59e664cedfd00090d3759/name=dataPipeLineEvent_17/year=2019/month=10/day=16/hour=17/351bffa9-15e3-427b-9e02-c9e8823d68d6_tmp.parquet to /topics/datapipelinefinaltest14.5da59e664cedfd00090d3757.dataPipeLineEvent_17.topic/tenant=5da59e664cedfd00090d3757/groupid=5da59e664cedfd00090d3758/project=5da59e664cedfd00090d3759/name=dataPipeLineEvent_17/year=2019/month=10/day=16/hour=17/datapipelinefinaltest14.5da59e664cedfd00090d3757.dataPipeLineEvent_17.topic+9+0000000100+0000000199.parquet
19/10/31 16:54:09 DEBUG hdfs.StateChange: DIR* NameSystem.renameTo: /topics/+tmp/datapipelinefinaltest14.5da59e664cedfd00090d3757.dataPipeLineEvent_17.topic/tenant=5da59e664cedfd00090d3757/groupid=5da59e664cedfd00090d3758/project=5da59e664cedfd00090d3759/name=dataPipeLineEvent_17/year=2019/month=10/day=16/hour=17/351bffa9-15e3-427b-9e02-c9e8823d68d6_tmp.parquet to /topics/datapipelinefinaltest14.5da59e664cedfd00090d3757.dataPipeLineEvent_17.topic/tenant=5da59e664cedfd00090d3757/groupid=5da59e664cedfd00090d3758/project=5da59e664cedfd00090d3759/name=dataPipeLineEvent_17/year=2019/month=10/day=16/hour=17/datapipelinefinaltest14.5da59e664cedfd00090d3757.dataPipeLineEvent_17.topic+9+0000000100+0000000199.parquet
19/10/31 16:54:09 DEBUG hdfs.StateChange: DIR* FSDirectory.renameTo: /topics/+tmp/datapipelinefinaltest14.5da59e664cedfd00090d3757.dataPipeLineEvent_17.topic/tenant=5da59e664cedfd00090d3757/groupid=5da59e664cedfd00090d3758/project=5da59e664cedfd00090d3759/name=dataPipeLineEvent_17/year=2019/month=10/day=16/hour=17/351bffa9-15e3-427b-9e02-c9e8823d68d6_tmp.parquet to /topics/datapipelinefinaltest14.5da59e664cedfd00090d3757.dataPipeLineEvent_17.topic/tenant=5da59e664cedfd00090d3757/groupid=5da59e664cedfd00090d3758/project=5da59e664cedfd00090d3759/name=dataPipeLineEvent_17/year=2019/month=10/day=16/hour=17/datapipelinefinaltest14.5da59e664cedfd00090d3757.dataPipeLineEvent_17.topic+9+0000000100+0000000199.parquet /351bffa9-15e3-427b-9e02-c9e8823d68d6_tmp.parquet
19/10/31 16:54:09 DEBUG hdfs.StateChange: DIR* FSDirectory.unprotectedRenameTo: /topics/+tmp/datapipelinefinaltest14.5da59e664cedfd00090d3757.dataPipeLineEvent_17.topic/tenant=5da59e664cedfd00090d3757/groupid=5da59e664cedfd00090d3758/project=5da59e664cedfd00090d3759/name=dataPipeLineEvent_17/year=2019/month=10/day=16/hour=17/351bffa9-15e3-427b-9e02-c9e8823d68d6_tmp.parquet is renamed to /topics/datapipelinefinaltest14.5da59e664cedfd00090d3757.dataPipeLineEvent_17.topic/tenant=5da59e664cedfd00090d3757/groupid=5da59e664cedfd00090d3758/project=5da59e664cedfd00090d3759/name=dataPipeLineEvent_17/year=2019/month=10/day=16/hour=17/datapipelinefinaltest14.5da59e664cedfd00090d3757.dataPipeLineEvent_17.topic+9+0000000100+0000000199.parquet
19/10/31 16:54:09 DEBUG namenode.LeaseManager: LeaseManager.changelease: src=/topics/+tmp/datapipelinefinaltest14.5da59e664cedfd00090d3757.dataPipeLineEvent_17.topic/tenant=5da59e664cedfd00090d3757/groupid=5da59e664cedfd00090d3758/project=5da59e664cedfd00090d3759/name=dataPipeLineEvent_17/year=2019/month=10/day=16/hour=17/351bffa9-15e3-427b-9e02-c9e8823d68d6_tmp.parquet, dest=/topics/datapipelinefinaltest14.5da59e664cedfd00090d3757.dataPipeLineEvent_17.topic/tenant=5da59e664cedfd00090d3757/groupid=5da59e664cedfd00090d3758/project=5da59e664cedfd00090d3759/name=dataPipeLineEvent_17/year=2019/month=10/day=16/hour=17/datapipelinefinaltest14.5da59e664cedfd00090d3757.dataPipeLineEvent_17.topic+9+0000000100+0000000199.parquet
19/10/31 16:54:09 DEBUG namenode.LeaseManager: LeaseManager.findLease: prefix=/topics/+tmp/datapipelinefinaltest14.5da59e664cedfd00090d3757.dataPipeLineEvent_17.topic/tenant=5da59e664cedfd00090d3757/groupid=5da59e664cedfd00090d3758/project=5da59e664cedfd00090d3759/name=dataPipeLineEvent_17/year=2019/month=10/day=16/hour=17/351bffa9-15e3-427b-9e02-c9e8823d68d6_tmp.parquet
19/10/31 16:54:09 DEBUG top.TopAuditLogger: ------------------- logged event for top service: allowed=true ugi=root (auth:SIMPLE) ip=/... cmd=rename src=/topics/+tmp/datapipelinefinaltest14.5da59e664cedfd00090d3757.dataPipeLineEvent_17.topic/tenant=5da59e664cedfd00090d3757/groupid=5da59e664cedfd00090d3758/project=5da59e664cedfd00090d3759/name=dataPipeLineEvent_17/year=2019/month=10/day=16/hour=17/351bffa9-15e3-427b-9e02-c9e8823d68d6_tmp.parquet dst=/topics/datapipelinefinaltest14.5da59e664cedfd00090d3757.dataPipeLineEvent_17.topic/tenant=5da59e664cedfd00090d3757/groupid=5da59e664cedfd00090d3758/project=5da59e664cedfd00090d3759/name=dataPipeLineEvent_17/year=2019/month=10/day=16/hour=17/datapipelinefinaltest14.5da59e664cedfd00090d3757.dataPipeLineEvent_17.topic+9+0000000100+0000000199.parquet perm=root:supergroup:rw-r-r-
19/10/31 16:54:09 DEBUG security.UserGroupInformation: PrivilegedAction as:root (auth:SIMPLE) from:org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
Do you have anything special in your hdfs-site.xml that would cause the tmp file location to use lazy persist writes? The connector supports hadoop.conf.dir for setting the configuration location, and it uses whatever it finds in the specified config directory.
I would not suggest using this setup for the tmp file, because lazy persist writes offer only best-effort persistence guarantees.
Here is some more info on the feature from Hadoop.
@ncliang sorry, but I don't understand what you mean by "anything special in your hdfs-site.xml".
Our hdfs-site.xml has this configuration:
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///data1/dataNode</value>
</property>
<property>
In the comment above I tried to explain the flow: the connector first creates a tmp file in HDFS memory and then, once the rotate time or flush size is reached, commits/persists the file to HDFS disk. (Is this called lazy persist?) Our question is: when the file is successfully committed, how can it simply not be present in the destination directory?
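For anyone following along: whether a DataNode exposes RAM-backed storage at all shows up in dfs.datanode.data.dir, where a RAM disk volume carries a [RAM_DISK] tag and an untagged directory counts as DISK. Below is a minimal sketch of that check against a config like the one quoted above. The function name and the <configuration> wrapper around the quoted property are assumptions added for illustration, and only the one complete property from the thread is used.

```python
import xml.etree.ElementTree as ET


def data_dir_storage_types(hdfs_site_xml):
    """Return (storage_type, location) pairs from dfs.datanode.data.dir."""
    root = ET.fromstring(hdfs_site_xml)
    pairs = []
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() != "dfs.datanode.data.dir":
            continue
        # The value is a comma-separated list of directories.
        for entry in prop.findtext("value", "").split(","):
            entry = entry.strip()
            if not entry:
                continue
            if entry.startswith("[") and "]" in entry:
                # Tagged entry, e.g. "[RAM_DISK]/mnt/dn-tmpfs".
                tag, location = entry[1:].split("]", 1)
                pairs.append((tag.upper(), location))
            else:
                # An untagged directory defaults to the DISK storage type.
                pairs.append(("DISK", entry))
    return pairs


# The property quoted in the thread, wrapped in <configuration> (assumed).
SAMPLE = """<configuration>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///data1/dataNode</value>
  </property>
</configuration>"""

pairs = data_dir_storage_types(SAMPLE)
print(pairs)
print("RAM_DISK configured:", any(t == "RAM_DISK" for t, _ in pairs))
```

With only the property shown in the thread, no RAM_DISK volume is reported. That covers just this one setting, not storage policies set on directories.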
@abhisheksahani ah, ok. When I read the title, I was under the impression that you had explicitly configured the Hadoop deployment to use in-memory storage and lazy persist for the tmp directory. This can be done through hdfs-site.xml, by setting a storage policy, or by the client when it creates the file. Since I'm sure we don't pass CreateFlag.LAZY_PERSIST when creating the file, I was asking how you had ended up with this config.
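The NameNode log posted earlier in the thread carries two fields that bear on this: the createFlag list on the startFile line and the storage type tag inside the ReplicaUC entry on the allocate line. The following is a rough sketch for pulling both out. The function name and regular expressions are illustrative assumptions, and the sample lines are abbreviated copies of the thread's log with the long path replaced by "...". A clean result is suggestive rather than proof, since a lazy persist write can fall back to disk.

```python
import re

# createFlag=[CREATE, OVERWRITE] on the NameSystem.startFile line
CREATE_FLAG_RE = re.compile(r"createFlag=\[([^\]]*)\]")
# replicas=[ReplicaUC[[DISK]DS-...]] on the allocate / persistNewBlock lines
REPLICA_STORAGE_RE = re.compile(r"ReplicaUC\[\[([A-Z_]+)\]")


def lazy_persist_evidence(log_lines):
    """Collect create flags and replica storage types seen in the log."""
    flags, storage = set(), set()
    for line in log_lines:
        match = CREATE_FLAG_RE.search(line)
        if match:
            flags.update(f.strip() for f in match.group(1).split(",") if f.strip())
        storage.update(REPLICA_STORAGE_RE.findall(line))
    return {
        "create_flags": sorted(flags),
        "storage_types": sorted(storage),
        "lazy_persist": "LAZY_PERSIST" in flags or "RAM_DISK" in storage,
    }


# Abbreviated from the log above; "..." stands for the elided path.
SAMPLE_LINES = [
    "19/10/31 16:54:09 DEBUG hdfs.StateChange: DIR* NameSystem.startFile: src=..., "
    "createParent=true, replication=1, createFlag=[CREATE, OVERWRITE], blockSize=268435456",
    "19/10/31 16:54:09 INFO hdfs.StateChange: BLOCK* allocate blk_1073840307_125096"
    "{UCState=UNDER_CONSTRUCTION, truncateBlock=null, primaryNodeIndex=-1, "
    "replicas=[ReplicaUC[[DISK]DS-680efedf-fb3a-4f64-bd50-de89a55b377f:NORMAL:...:50010|RBW]]} for ...",
]

print(lazy_persist_evidence(SAMPLE_LINES))
```

On these lines the flags are CREATE and OVERWRITE and the replica sits on DISK, which is consistent with the connector not requesting lazy persist.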
It seems like your question is not about LAZY_PERSIST, but about a committed file not being observed at the destination directory. So, a few questions here: How are you observing the destination directory? Does the change eventually show up?
HDFS is an eventually consistent system that provides weak isolation guarantees. Could it just be that you are viewing from another client and that the changes have not been propagated yet? If this is true, then it should not disrupt Connector operation and is just an artifact of how HDFS is designed.
@ncliang We checked for the files from a different HDFS client the next day as well, but they were still missing.
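To make "how are you observing the destination directory" concrete, here is a sketch of three checks that could be run from any node with the hdfs CLI: list the destination directory, run fsck on the expected file, and read the storage policy in effect on the directory. The helper names and the example path are hypothetical; substitute the real destination path from the rename log line.

```python
import shutil
import subprocess


def hdfs_check_commands(path):
    """Build argv lists for inspecting a committed file and its directory."""
    parent = path.rsplit("/", 1)[0] or "/"
    return [
        # Is the file listed in the destination directory?
        ["hdfs", "dfs", "-ls", parent],
        # Does the NameNode know the file, and where are its blocks?
        ["hdfs", "fsck", path, "-files", "-blocks", "-locations"],
        # Which storage policy applies to the directory?
        ["hdfs", "storagepolicies", "-getStoragePolicy", "-path", parent],
    ]


def run_checks(path):
    """Run the checks and print their output; requires the hdfs CLI on PATH."""
    if shutil.which("hdfs") is None:
        raise RuntimeError("hdfs CLI not found on PATH")
    for argv in hdfs_check_commands(path):
        result = subprocess.run(argv, capture_output=True, text=True)
        print("$", " ".join(argv))
        print(result.stdout or result.stderr)


# Hypothetical path, for illustration only.
for argv in hdfs_check_commands("/topics/example.topic/part.parquet"):
    print(" ".join(argv))
```

Posting the output of these three commands alongside the NameNode log would show whether the file is missing from the namespace or only its blocks are gone.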
I'll echo the statement that the connector is not the problem. It's your hdfs-site.xml config that would be causing said issues.
If you are using Apache Ambari or Cloudera Manager to manage your Hadoop cluster, you would use it to provision a complete XML client config bundle, which you can then extract onto a Kafka Connect node.
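Once the bundle is extracted, the directory that the connector's hadoop.conf.dir setting (mentioned earlier in the thread) points at should hold at least core-site.xml and hdfs-site.xml. Below is a small sanity check, sketched under the assumption that those two files are the minimum needed. The helper name is hypothetical and the demo uses a temporary directory rather than a real bundle.

```python
import os
import tempfile

# Assumed minimum set of client config files for an HDFS client.
EXPECTED_FILES = ("core-site.xml", "hdfs-site.xml")


def missing_client_configs(conf_dir):
    """Return the expected config files that are absent from conf_dir."""
    return [
        name
        for name in EXPECTED_FILES
        if not os.path.isfile(os.path.join(conf_dir, name))
    ]


# Demo on a throwaway directory that only contains core-site.xml.
with tempfile.TemporaryDirectory() as conf_dir:
    with open(os.path.join(conf_dir, "core-site.xml"), "w") as handle:
        handle.write("<configuration/>\n")
    print("missing:", missing_client_configs(conf_dir))
```

If hdfs-site.xml is missing, the connector would presumably run with client defaults instead of the cluster's settings, which is worth ruling out before digging further.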