logstash-input-file

Meta Issue to track outstanding changes to the file input

Open guyboertje opened this issue 6 years ago • 7 comments

Items on this list may be covered by one or more existing issues. I will circle back and update the list when I spot the relevant issue.

  • [ ] Add "feeder" support to Discoverer, for when the discovered file set is so large that file reading is delayed significantly.
  • [ ] Allow the file path to be used as the sincedb key. Used as an alternative to inodes when the file path is known to be a unique identifier (more strongly than the inode combo anyway) e.g. /dir/<content_type>-1536495948-769372.log (NFS).
  • [ ] Handle errors other than ENOENT that effectively mean the same thing - the file has gone (NFS).
  • [ ] Allow read mode to detect files still growing, e.g. being "filled" over a slow remote link (NFS).
  • [ ] Allow read mode to detect file rotation.
  • [ ] Align clean_sincedb_after, close_older and ignore_older with Filebeat's settings, adding new ones as needed.
  • [ ] Add "move" file completed action and a file_completed_move_path setting.
  • [ ] Remove sincedb entry on successful deletion of the file, if using inodes as identifier only.
  • [ ] Add delete_after to read mode to delay the delete (might be covered by the alignment with Filebeat's settings above).
  • [ ] Add support to bypass the delimiter boundary detection - send chunks instead of lines, needs label on the codec to say it does boundary detection.
  • [ ] Add support to send the byte offset of the chunk or line and file size (read mode) - needs additional work at the codec level.
  • [ ] Log at WARN when number of files discovered is zero.
  • [ ] Add support to detect when a filesystem has been unmounted (error: Errno::EAGAIN) and spin until files are seen again.
  • [ ] In register, change ArgumentError to ConfigurationError.
  • [ ] Check file permissions in Discoverer for each file.
  • [ ] Rotated files should always start from the beginning (i.e. ignore start_position)
  • [ ] Option to exit pipeline when read mode has reached the end of the file (https://github.com/logstash-plugins/logstash-input-file/issues/212).
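
To illustrate the "errors other than ENOENT" item above, here is a minimal Python sketch (the plugin itself is written in Ruby). The set of errno values and the helper name `file_is_gone` are assumptions made for illustration, not the plugin's behaviour; ESTALE is included because it is the error NFS clients commonly report for a stale file handle.

```python
import errno
import os

# Assumption: errno values that effectively mean "the file has gone".
# ENOENT is the usual case; ESTALE is typical on NFS when the handle is stale.
FILE_GONE_ERRNOS = {
    errno.ENOENT,
    getattr(errno, "ESTALE", errno.ENOENT),  # fall back if the platform lacks ESTALE
}


def file_is_gone(path):
    """Return True if stat-ing `path` fails with a "file has gone" error.

    Any other OSError (for example EACCES) is re-raised so that the caller
    can handle it separately.
    """
    try:
        os.stat(path)
    except OSError as exc:
        if exc.errno in FILE_GONE_ERRNOS:
            return True
        raise
    return False
```

A watcher loop could call such a helper before each read and drop the file from its watched set when it returns True, instead of treating only ENOENT as a deletion.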

guyboertje avatar Sep 10 '18 18:09 guyboertje

It's been more than a few months since the items above were listed. We are looking at using a few of them in our Logstash pipelines. Can you tell us whether any of these are being worked on?

ganeshsi avatar Jun 26 '19 16:06 ganeshsi

@ganeshsi Sorry, I'm not working on the file input at present. The LS team at Elastic is small, 6 (7 soon) people and we are spread across the main LS repos and about 80-100 plugins.

guyboertje avatar Jun 27 '19 09:06 guyboertje

@guyboertje thanks for the response!

ganeshsi avatar Jun 27 '19 10:06 ganeshsi

Out of interest, which changes do you (or any other reader) consider most needed? It would help if you could rank them.

guyboertje avatar Jun 27 '19 10:06 guyboertje

Two of these, in fact:

  • Add "feeder" support to Discoverer, for when the discovered file set is so large that file reading is delayed significantly.
  • And #219.
    We've worked around the latter with an internal change that adds a "no" option for the sort settings. We use Logstash to manage filesystem sources that have a significant number of files.

ganeshsi avatar Jun 27 '19 10:06 ganeshsi

@guyboertje Most needed by far is: Option to exit pipeline when read mode has reached the end of the file (#212).

jbwl avatar Jul 19 '19 14:07 jbwl

I am interested in support to send the byte offset of the chunk or line and file size (read mode) so I don't have to use Filebeat in the equation.
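
As a rough illustration of what this request amounts to, the following Python sketch reads a file line by line and attaches the starting byte offset of each line and the file size to every event. The function name `lines_with_offsets` and the field names `offset` and `file_size` are hypothetical; this is not how the file input or its codecs are implemented.

```python
import os


def lines_with_offsets(path):
    """Yield one dict per line with its starting byte offset and the file size.

    The file is opened in binary mode so that offsets count bytes, not
    characters. Field names here are hypothetical.
    """
    file_size = os.path.getsize(path)  # size at the time reading starts
    offset = 0
    with open(path, "rb") as handle:
        for raw in handle:  # binary iteration splits on b"\n"
            yield {
                "message": raw.rstrip(b"\n").decode("utf-8", errors="replace"),
                "offset": offset,
                "file_size": file_size,
            }
            offset += len(raw)  # advance by the raw length, delimiter included
```

Because the offset advances by the raw byte length, a consumer could seek back to any reported `offset` and re-read the same line.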

MikeSaveItiviti avatar May 13 '20 14:05 MikeSaveItiviti