kafka-connect-hdfs
Question about temp files
When the HDFS Sink connector starts buffering records, it writes a temp file at +/tmp/
Does the HDFS Sink connector write to this temp file every time it receives a message (that is, does it append to the file)? Or does it keep the records in JVM memory and write the temp file only once flush size is reached, with all the records accumulated up to that point?
Thank you in advance
Hi @dinegri, records are appended to temp files as they arrive, not kept in memory.
The temp files are then moved to the final path if one or more of these are true:
- flush.size records have been written to the temp file
- rotate.interval.ms was reached
- rotate.schedule.interval.ms was reached
- the record schema changed
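
For illustration only, the conditions listed above can be modelled as a small decision function. This is a rough Python sketch, not the connector's actual code: `TempFileState`, `should_commit`, and all of their fields and parameters are hypothetical names. It also assumes that rotate.interval.ms is measured against record timestamps and that a value of -1 disables it, so check the Confluent documentation for the exact semantics.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TempFileState:
    """Hypothetical bookkeeping for one open temp file (not a connector class)."""
    record_count: int          # records appended to the temp file so far
    first_record_ts_ms: int    # timestamp of the first record in the file
    schema_id: Optional[int]   # schema of the records already in the file


def should_commit(state, incoming_ts_ms, incoming_schema_id, now_ms,
                  flush_size, rotate_interval_ms=-1,
                  next_scheduled_rotation_ms=None):
    """Return which condition triggers the move to the final path, or None."""
    # flush.size: enough records have accumulated in the temp file.
    if state.record_count >= flush_size:
        return "flush.size"
    # rotate.interval.ms: the file spans too much time (assumed: record time).
    if (rotate_interval_ms > 0
            and incoming_ts_ms - state.first_record_ts_ms >= rotate_interval_ms):
        return "rotate.interval.ms"
    # rotate.schedule.interval.ms: a scheduled rotation time has passed.
    if (next_scheduled_rotation_ms is not None
            and now_ms >= next_scheduled_rotation_ms):
        return "rotate.schedule.interval.ms"
    # Schema change: the next record does not match the file's schema.
    if state.schema_id is not None and incoming_schema_id != state.schema_id:
        return "schema change"
    return None


state = TempFileState(record_count=3, first_record_ts_ms=1_000, schema_id=1)
print(should_commit(state, incoming_ts_ms=1_500, incoming_schema_id=1,
                    now_ms=2_000, flush_size=3))  # prints "flush.size"
```

The point of the sketch is that any single condition is enough: a temp file holding fewer than flush.size records can still be committed by a time-based rotation or a schema change.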
You can find more information on these in the Confluent documentation.