logstash-filter-csv
Blank line at start of file messes up autodetect_column_names. skip_empty_rows does not fix
- Version: 7.1.1
- Operating System: AWS Linux
- Config File (if you have sensitive info, please remove it):
```
input {
  generator { count => 1 lines => [ '' ] }
  file { path => "/user/foo.csv" sincedb_path => "/dev/null" start_position => beginning }
}
filter { csv { autodetect_column_names => true skip_empty_rows => true } }
output { stdout { codec => rubydebug { metadata => false } } }
```
- Sample Data:

Input foo.csv containing this, or any other valid csv:
```
a,b
1,2
```
- Steps to Reproduce:

Just run the above configuration with the above data. It results in:
```
[2019-06-11T01:14:06,397][WARN ][logstash.filters.csv ] Error parsing csv {:field=>"message", :source=>"a,b", :exception=>#<NoMethodError: undefined method `empty?' for nil:NilClass>}
[2019-06-11T01:14:06,405][WARN ][logstash.filters.csv ] Error parsing csv {:field=>"message", :source=>"1,2", :exception=>#<NoMethodError: undefined method `empty?' for nil:NilClass>}
```
Because the generator's empty line is consumed during column autodetection, the detected set of columns is empty, and the csv header row is then emitted as an ordinary event in the rubydebug output, which does not provide a strong hint as to the problem.

Moving the skip_empty_rows check above the autodetect_column_names logic would improve things, although that is still not a very good UX, since it requires the user to understand the problem exactly.
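To illustrate the suggested ordering, here is a minimal standalone sketch (not the actual plugin source; `parse_rows` and its parameters are hypothetical) showing how skipping empty rows before header autodetection prevents a leading blank line from becoming an empty header:

```ruby
require "csv"

# Hypothetical simplified model of the filter logic, not the real
# logstash-filter-csv implementation. The key point is that the
# skip_empty_rows check runs BEFORE autodetect_column_names, so a
# leading blank line can never be captured as an empty header row.
def parse_rows(lines, autodetect_column_names: true, skip_empty_rows: true)
  columns = nil
  events = []
  lines.each do |line|
    # Check for empty rows first, mirroring the proposed reordering.
    next if skip_empty_rows && line.strip.empty?

    values = CSV.parse_line(line)
    if autodetect_column_names && columns.nil?
      # The first non-empty row becomes the header.
      columns = values
      next
    end
    events << columns.zip(values).to_h
  end
  events
end
```

With this ordering, an input of `["", "a,b", "1,2"]` yields a single event `{"a" => "1", "b" => "2"}` instead of a nil-columns failure.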