hdfs
Empty file after CopyToRemote
from #221
I found that the file in HDFS is empty, but no error is logged. This problem has troubled me for a long time.
Same issue here: the HDFS file appears "empty" after calling the CopyToRemote API, which reports no error.
The content of the file is complete and identical to the local file if we hdfs get or hdfs cat it. The file is actually not empty; it is still under construction.
$ hdfs dfs -checksum /path/to/empty/file
checksum: Fail to get checksum, since file /path/to/empty/file is under construction.
The workaround is to recover the file lease manually.
$ hdfs debug recoverLease -path /path/to/empty/file
recoverLease SUCCEEDED on /path/to/empty/file
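For anyone who wants to script the check above, here is a minimal Go sketch. It only inspects text that the hdfs CLI has already printed and builds the argument list for the recovery command; the helper names isUnderConstruction and recoverLeaseArgs are made up for illustration, and the string match relies solely on the message quoted above, so other Hadoop versions may word it differently.

```go
package main

import (
	"fmt"
	"strings"
)

// isUnderConstruction reports whether the output of `hdfs dfs -checksum`
// contains the "under construction" message quoted in this thread.
// Assumption: the wording matches the message shown above.
func isUnderConstruction(checksumOutput string) bool {
	return strings.Contains(checksumOutput, "is under construction")
}

// recoverLeaseArgs returns the command line for the manual workaround,
// `hdfs debug recoverLease -path <path>`, as an argument slice.
func recoverLeaseArgs(path string) []string {
	return []string{"hdfs", "debug", "recoverLease", "-path", path}
}

func main() {
	// Sample output copied from the checksum command shown above.
	out := "checksum: Fail to get checksum, since file /path/to/empty/file is under construction."
	if isUnderConstruction(out) {
		fmt.Println(strings.Join(recoverLeaseArgs("/path/to/empty/file"), " "))
	}
}
```

The argument slice could be handed to os/exec to run the command; that part is left out here because it depends on the hdfs binary being available on the machine.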
Unfortunately, no reliable way to reproduce this has been found yet. The guess is that some errors from remote.Close are being missed.
https://github.com/colinmarc/hdfs/blob/d5784c387f661b5c0044383f0095235fd53c06bd/client.go#L289-L307
I have encountered the same problem. In my case, a call to CopyToRemote returns the error: read tcp 172.16.202.36:49256->172.18.122.161:50010: read: connection reset by peer. This error seems to happen only on HDFS clusters with Kerberos turned on.
After checking the logs for hdfs namenode, I found this log snippet