htslib

Provide more informative error message when encountering unexpected newline character in SAM header.

Open cinquin opened this issue 8 years ago • 2 comments

cinquin, Apr 22 '18 15:04

This could print lots of lines if the problem happens at the start of a long header. There's also no way of limiting the output if the header doesn't include a terminating NUL character, which could be a security problem.

It would be best to keep track of the start of the last line so that the dump always begins at a sensible place. Then use memchr() to find where the current line ends, and use an output format of the form "%.*s", len, last_start to restrict the amount of data that gets printed. Better still might be an extra function that prints a few lines of header, with some extra checks to ensure that the output characters are actually printable, as outputting binary nonsense is generally not a good idea.
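
A minimal sketch of the suggestion above, assuming the parser already records the offset where the current line starts. The function name format_header_line and all of its parameters are hypothetical and not part of htslib. It handles a single line rather than "a few lines", bounds every read by the header length so no terminating NUL is needed, and replaces non-printable bytes with '?'.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not an htslib function): copy the header line that
 * starts at offset line_start into out, reading at most hdr_len bytes of
 * hdr and writing at most out_size - 1 characters plus a NUL.
 * Bytes that are not printable are replaced by '?'.
 * Returns the number of characters written, excluding the NUL. */
static size_t format_header_line(const char *hdr, size_t hdr_len,
                                 size_t line_start,
                                 char *out, size_t out_size)
{
    if (out_size == 0)
        return 0;
    if (line_start > hdr_len)
        line_start = hdr_len;

    /* memchr() is bounded by the remaining length, so the search cannot
     * run off the end of a header that lacks a terminating NUL. */
    const char *line = hdr + line_start;
    size_t remaining = hdr_len - line_start;
    const char *nl = memchr(line, '\n', remaining);
    size_t len = nl ? (size_t)(nl - line) : remaining;

    /* Cap the amount of data that can appear in the message. */
    if (len > out_size - 1)
        len = out_size - 1;

    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char) line[i];
        out[i] = isprint(c) ? (char) c : '?';
    }
    out[len] = '\0';
    return len;
}
```

The caller could then report the sanitised line with something like fprintf(stderr, "Bad SAM header line: %.*s\n", (int) n, buf), where n is the return value. Copying into a small fixed buffer first is one way to keep both the length limit and the printable-character check in a single place.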

daviesrob, Apr 23 '18 09:04

It's also incorrect to say "unexpected newline" because this error will be triggered on a header that has zero newlines (and also doesn't start with an @).

Not that we can (legally) create such things, but the whole point of this is to spot illegal / corrupted data.

jkbonfield, Apr 23 '18 10:04