Bedtools /dev/stdin behaves differently than just stdin

Open nh13 opened this issue 5 years ago • 3 comments

This

$ cat a.bed \
  | bedtools subtract -a /dev/stdin -b b.bed \
  | bedtools subtract -a /dev/stdin -b c.bed \
  > out.bed

yields

***** WARNING: File /dev/stdin has inconsistent naming convention for record:
hr1     <redacted> <redacted>

***** WARNING: File /dev/stdin has inconsistent naming convention for record:
r1      <redacted> <redacted>

and the resulting BED file (out.bed) is missing many records at the start and has a record that starts with hr1 (should be chr1).

No such error occurs with the command below and the output BED file is as expected:

$ cat a.bed \
  | bedtools subtract -a stdin -b b.bed \
  | bedtools subtract -a stdin -b c.bed \
  > out.bed
$ bedtools --version
bedtools v2.29.2

nh13, Oct 30 '20 18:10

I am unable to reproduce this. Could you share your files and let me know what OS and shell?

arq5x, Oct 30 '20 18:10

If I make a.bed have an oddly named chrom I can get the expected naming error with both approaches.

arq5x, Oct 30 '20 18:10

I suspect @nh13's input records are all normally-named chroms like chr1 but due to already-read buffers being accidentally dropped they are being misread as the partial suffixes hr1 and r1.

With stdin working (as it is recognised as bedtools's _isStdin special case) but /dev/stdin not working (in 2.29.2) and losing a large buffer's-worth of data at the start (as it is not recognised as _isStdin), I suspect this is another variant of #841 etc. This too has probably been fixed on the current branch by PR #843 but IMHO the robust fix for this sort of thing would be #875's suggestion.
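
To illustrate the kind of data loss suspected here, the following is a minimal shell sketch and not bedtools's actual code path. It assumes a Linux-like system where /dev/stdin refers to the process's standard input. The function name drop_first_byte_then_reopen is hypothetical; dd stands in for a reader that has already consumed part of the pipe, and cat /dev/stdin stands in for a second reader that reopens the input by path.

```shell
#!/bin/sh
# Hypothetical illustration only: this mimics "already-read data is dropped",
# it does not reproduce how bedtools itself reads its input.

drop_first_byte_then_reopen() {
  # Consume exactly one byte from the shared pipe and throw it away,
  # standing in for a reader whose buffered data is later discarded.
  dd bs=1 count=1 >/dev/null 2>&1
  # Reopen standard input by path. A pipe cannot be rewound, so the
  # byte consumed above is not replayed to this second reader.
  cat /dev/stdin
}

# A single BED-like record with a normally named chromosome.
out=$(printf 'chr1\t10\t20\n' | drop_first_byte_then_reopen)

# The leading "c" should be missing, so the chromosome reads as "hr1".
printf '%s\n' "$out"
```

Under those assumptions the record comes out with its first byte gone, which matches the shape of the hr1 and r1 names in the warnings above. In the real case the dropped amount would be a whole buffer rather than one byte.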

jmarshall, Oct 31 '20 22:10