htslib icon indicating copy to clipboard operation
htslib copied to clipboard

vcf/bcf reading has no concept of errors!

Open jkbonfield opened this issue 10 years ago • 9 comments

zlib can spot truncated files, as can bgzf, but the code for bcf_read glosses over all such things and treats any error as EOF.

Note this isn't the same issues as https://github.com/samtools/samtools/issues/101 which is describing the samtools application as ignoring errors returned by htslib. This is htslib not being able to return the error in the first place.

It can be demonstrated in bcftools or htsfile, but the error comes from bcf_read returning 0 in many cases.

$ samtools mpileup -g ../samtools/test/mpileup/mpileup.1.bam > /tmp/_.bcf
$ head -50000c /tmp/_.bcf | ./htsfile -c - > /dev/null;echo $?
ret=0
0
$ head -50000c /tmp/_.bcf | zcat > /dev/null;echo $?

gzip: stdin: unexpected end of file
1

The ret=0 is debugging I put into htsfile to show the return value from the last bcf_read() call.

jkbonfield avatar Jul 21 '15 15:07 jkbonfield

Need to check whether we can have an error return value and whether that involves breaking the ABI.

jmarshall avatar Jan 25 '16 16:01 jmarshall

Perhaps we can set fp->fp.bgzf->errcode (or maybe it's already set?) as a way of distinguishing EOF from errors. This is a side-channel that would leave the ABI the same.

Edit: Indeed as far as htsfile goes, it can just check in->fp.bgzf->errcode != 0. In the above case it is 5 (perhaps due to additional checks subsequently applied as I now get a warning where I previously had none). That's ugly as hell though - I'd like to see vcf_errcode or similar wrapping up, even if it's just a macro.

Comments?

jkbonfield avatar Apr 26 '16 13:04 jkbonfield

That sounds like a good idea.

Re ugliness, IMHO anything that involves non-htslib code directly looking inside a htsFile * is a bug. See comments on the htsFile declaration in hts.h. So yes, we'd want to add some kind of accessor function/macro.

jmarshall avatar Apr 26 '16 14:04 jmarshall

Do these comments in hts.h have any relation to wanting ABI changes for 1.4? I suspect not as deprecating the old fields used <= 1.1 won't really help us unless we can remove every field which I don't believe we can.

Regarding this specific issue, while we could add checks to in->fp.bgzf->errcode with an appropriate accessor function at the htsFile level, I wonder if it is sufficient or whether we should take the opportunity to add an errno field to the htsFile structure now too (even if it's not used in the initial implementation).

Obviously bgzf has errcode, cram_fd has an err field, hFILE has an has_errno field - is that sufficient to avoid needing a generic hts errno field?

jkbonfield avatar Jun 06 '16 15:06 jkbonfield

I'm having a look at this now, as we want to get as many ABI breaking things done before next release.

At first glance the bug as reported no longer happens due to improvements in bgzf checking (not least the addition of CRC checks, but even without that it errors), but there are still many void functions in vcf.c so the issue still stands in some form or other. (Edit: the initial bug was fixed by cc6ca5212157eb43232d680f8ee8dd5c1802ba33)

jkbonfield avatar May 29 '18 08:05 jkbonfield

Well, I started down this road and I think it may well be a rather large exercise in futility! What I have so far (work in progress, be warned) is over at https://github.com/jkbonfield/htslib/tree/vcf_checks

However, it's basically dominated by checking for malloc, calloc, realloc, strdup, etc memory failures. Such things are fine of course, but this is just the tip of a very large iceberg. There's all the kstring code (kputs etc), hashing (kh_put), plus the nightmare of all the hts_expand macros for growing arrays. That latter is a lot of recoding infact.

It's easy to pour scorn on the lack of error checking, but actually the same also applies to all the SAM and BAM parsing code. We almost never check that kh_put works, nor kputs etc. This doesn't mean it's not worth doing, but we have to balance up the risk of introducing bugs with a huge amount of code changes vs the benefit. In this case the benefit is we turn a core dump into an error message as realistically no problem is going to handle running out of memory and continue. Am I missing something?

jkbonfield avatar Jun 01 '18 11:06 jkbonfield

@jkbonfield I looked at bcf_read() tonight as I ran into this same issue. This seems to have been partially, but not completely, corrected (although I haven't looked at the diff compared to 2015 when this issue was first opened)

The docstring for bcf_read indicates:

     *  Returns -1 on critical errors, 0 otherwise. On errors which are not
     *  critical for reading, such as missing header definitions, v->errcode is
     *  set to one of BCF_ERR* code and must be checked before calling
     *  vcf_write().

However, review of the function shows that it will return -2 for truly critical errors: https://github.com/samtools/htslib/blob/a80f5fd602eb375dd4a9068ad53601a3fa51a4bc/vcf.c#L1161-L1184

-1 is returned for EOF as expected.

However, the -1 value is also used for error conditions setting v->errcode as suggested in docstring; this is generated in bcf_record_check: https://github.com/samtools/htslib/blob/a80f5fd602eb375dd4a9068ad53601a3fa51a4bc/vcf.c#L1242

Changing bcf_record_check to return -2 if (err) would bring bcf_read in line with other htslib functions, for example, sam_read1, which yield -1 on EOF and < -1 on error.

*  @return >= 0 on successfully reading a new record, -1 on end of stream, < -1 on error

jblachly avatar Jan 24 '19 04:01 jblachly

@jblachly Good spot. Yes, it should return -2 when bcf_record_check() fails. I will adjust it so that it behaves as intended.

daviesrob avatar Jan 24 '19 09:01 daviesrob

Commit 9845bc9 makes bcf_record_check() return -2 on error, so problems should now be more obvious.

daviesrob avatar Jan 25 '19 11:01 daviesrob