CSV line recovery forces Windows line endings

Open • sinus-x opened this issue 5 years ago • 1 comment

The recover_line_csv() function joins data with \r\n, regardless of the newlines used in the source file: https://github.com/certtools/intelmq/blob/a2d20df6fd4fa0386fe79e66156537336faf92b0/intelmq/lib/bot.py#L1042

This means that bot tests fail even though parsing succeeded, because the raw values do not match: the recovered line contains an extra CR byte. As a workaround, bot tests replace the newlines in their test files to account for that, which is not an ideal solution. https://github.com/certtools/intelmq/blob/f7a5c7adb9e6a3b8a65c6e453195fd914fe80c20/intelmq/tests/bots/parsers/generic/test_parser_csv_data_type.py#L20
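
A minimal sketch of the mismatch, using only the standard library. `recover_line_crlf` is a hypothetical stand-in for the recovery step, not the actual intelmq code; it relies on the fact that Python's `csv.writer` terminates rows with `\r\n` unless told otherwise.

```python
import csv
import io


def recover_line_crlf(fields):
    """Hypothetical stand-in: re-serialize a parsed row with csv defaults."""
    out = io.StringIO()
    # The default "excel" dialect uses "\r\n" as its line terminator.
    csv.writer(out).writerow(fields)
    return out.getvalue()


# A source line that uses a Unix newline (LF only).
source_line = "1.2.3.4,example.com\n"
fields = next(csv.reader(io.StringIO(source_line)))
recovered = recover_line_crlf(fields)

# The fields round-trip, but the recovered raw line gained a CR byte.
print(repr(source_line))
print(repr(recovered))
print(recovered == source_line)
```

The last comparison is false for an LF-terminated input, which is why a test comparing raw values has to rewrite the newlines of its fixture first.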

sinus-x, Aug 14 '20 11:08

Yes, ideally the line ending in use is detected in ParserBot.parse_csv and ParserBot.parse_csv_dict and then saved in the class or somewhere else.
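
One way this could look, as a rough sketch rather than a patch: `detect_line_ending` and `recover_line` are hypothetical helpers, and where the detected value would be stored on the bot is left open. The detection simply takes the first terminator it finds, so a quoted field with an embedded newline at the start of the file could mislead it.

```python
import csv
import io


def detect_line_ending(raw, default="\n"):
    """Return the first line terminator found in raw, or default if none."""
    for index, char in enumerate(raw):
        if char == "\r":
            # CR followed by LF is a Windows ending, a lone CR is old Mac style.
            return "\r\n" if raw[index + 1:index + 2] == "\n" else "\r"
        if char == "\n":
            return "\n"
    return default


def recover_line(fields, line_ending):
    """Re-serialize a row using the line ending detected in the source."""
    out = io.StringIO()
    csv.writer(out, lineterminator=line_ending).writerow(fields)
    return out.getvalue()


raw = "1.2.3.4,example.com\n5.6.7.8,example.org\n"
line_ending = detect_line_ending(raw)  # would be saved once per report
first_row = next(csv.reader(io.StringIO(raw)))
print(repr(recover_line(first_row, line_ending)))
```

With the detected ending passed through, the recovered line matches the source byte for byte, and the tests would no longer need to rewrite their fixtures.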

ghost, Aug 19 '20 16:08