Empty file after CopyToRemote

Open NickYadance opened this issue 1 year ago • 1 comments

from #221

I found the file in hdfs is empty, but there is no error log. This problem has troubled me for a long time.

I see the same issue: the HDFS file appears "empty" after calling the CopyToRemote API, which reports no error.

The content of the file is complete and matches the local file when read back with hdfs get or hdfs cat. The file is not actually empty; it is still under construction.

$ hdfs dfs -checksum /path/to/empty/file
checksum: Fail to get checksum, since file /path/to/empty/file is under construction.

The workaround is to recover the file lease manually.

$ hdfs debug recoverLease -path /path/to/empty/file
recoverLease SUCCEEDED on /path/to/empty/file
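
If the manual step needs to be scripted, the sketch below shells out to the same CLI command and checks for the success line shown above. It assumes the hdfs binary is on PATH and that a successful run prints "recoverLease SUCCEEDED on <path>", as in the output quoted here. The helper names (recoverLeaseArgs, leaseRecovered, recoverLease) are hypothetical and not part of any library.

```go
package main

import (
	"context"
	"fmt"
	"os/exec"
	"strings"
	"time"
)

// recoverLeaseArgs builds the arguments for the CLI command quoted above:
// hdfs debug recoverLease -path <path>
func recoverLeaseArgs(path string) []string {
	return []string{"debug", "recoverLease", "-path", path}
}

// leaseRecovered reports whether the CLI output contains the success line
// seen in this issue. Matching on this text is an assumption based on that
// single sample of output.
func leaseRecovered(output, path string) bool {
	return strings.Contains(output, "recoverLease SUCCEEDED on "+path)
}

// recoverLease runs the hdfs CLI (assumed to be on PATH) and returns an
// error if the command fails or the success line is missing.
func recoverLease(ctx context.Context, path string) error {
	out, err := exec.CommandContext(ctx, "hdfs", recoverLeaseArgs(path)...).CombinedOutput()
	if err != nil {
		return fmt.Errorf("hdfs debug recoverLease %s: %w: %s", path, err, out)
	}
	if !leaseRecovered(string(out), path) {
		return fmt.Errorf("lease not recovered for %s: %s", path, out)
	}
	return nil
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), time.Minute)
	defer cancel()

	if err := recoverLease(ctx, "/path/to/empty/file"); err != nil {
		fmt.Println("recovery failed:", err)
	}
}
```

This only automates the workaround; it does not address why the lease was left open in the first place.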

Unfortunately, I have not found a reliable way to reproduce this yet. My guess is that errors returned by remote.Close are being missed. https://github.com/colinmarc/hdfs/blob/d5784c387f661b5c0044383f0095235fd53c06bd/client.go#L289-L307

NickYadance avatar Jul 28 '23 10:07 NickYadance

I have encountered the same problem. In my case, a call to CopyToRemote returns the error: read tcp 172.16.202.36:49256->172.18.122.161:50010: read: connection reset by peer. This error seems to happen only with HDFS clusters that have Kerberos enabled.

After checking the HDFS namenode logs, I found this log snippet: [screenshot: WeCom screenshot 9b8e6c32-f2b7-4036-a1da-4b01551cac21]

4LL3N51147 avatar May 29 '24 08:05 4LL3N51147