James Bonfield

Results 409 comments of James Bonfield

For reference, I found the picard query name comparison here: http://grepcode.com/file/repo1.maven.org/maven2/org.utgenome.thirdparty/picard/1.102.0/net/sf/samtools/SAMRecordQueryNameComparator.java?av=f In short it's first by name, then by fwd/rev status, then if they still match by complemented flag, then...

Something like this is an **approximation** of the picard method. It tries to reorder the flag bits in the order of READ1, READ2, COMPLEMENTED, SECONDARY, SUPPLEMENTARY, everything-else. Needs more checking......
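To illustrate the bit-reordering idea described above, here is a minimal Python sketch. It is not the original code from the comment and not picard's comparator. The names `flag_rank`, `sort_key` and `PRIORITY` are hypothetical, and treating "COMPLEMENTED" as the reverse-strand bit (0x10) is an assumption. The flag values themselves come from the SAM specification.

```python
# SAM flag bits, as defined in the SAM specification.
READ1 = 0x40
READ2 = 0x80
REVERSE = 0x10          # assumption: this is what "COMPLEMENTED" refers to
SECONDARY = 0x100
SUPPLEMENTARY = 0x800

# Bits listed from most to least significant in the repacked value.
PRIORITY = (READ1, READ2, REVERSE, SECONDARY, SUPPLEMENTARY)
ALL_BITS = 0xFFF        # the 12 flag bits defined by the specification


def flag_rank(flag):
    """Repack the flag so PRIORITY bits come first, everything else after."""
    rank = 0
    for bit in PRIORITY:
        rank = (rank << 1) | (1 if flag & bit else 0)
    # Whatever is left keeps its original relative bit order.
    rest = flag & ALL_BITS
    for bit in PRIORITY:
        rest &= ~bit
    return (rank << 12) | rest


def sort_key(record):
    """Hypothetical (qname, flag) record: order by name, then repacked flag."""
    qname, flag = record
    return (qname, flag_rank(flag))
```

With an ascending sort, a primary alignment lands before a supplementary alignment of the same read. Note that the same ascending sort puts READ2 before READ1, so the direction of each bit would need checking against picard before relying on this.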

The SAM specification states that the first column is the query name. Anything in there that is NOT the query name is a bug in whatever produced the data, so...

Maybe I'm being dense, but I don't see how your sort code above achieves what you want it to achieve. Given 1 primary alignment and 1 supplementary alignment, you are...

On Tue, Mar 15, 2016 at 02:08:02AM -0700, Bo Li wrote: > Hi @jkbonfield , the reality for not only RNA-Seq but all aligner produced outputs are: the two mates...

A name-only sort does at least mean you can perform any secondary sort term manually and then re-sort by name to bring reads together via a second field. I assume...

Maybe not what you'd like, but something like this perhaps is a workaround: ``` (samtools view -H in.bam; samtools view -@4 in.bam | sort -S 1G --parallel=4 -k 1,1 -k...

I'm not at all familiar with GCS, but it occurs to me that this may be an issue of keeping a file handle open for too long without interim usage. There is a...

There's also `samtools bedcov` which is explicitly designed for that scenario, but the output format is a little different. Does that suffice?

How large is the SAM header? Or maybe easier to answer, how many contigs are there in your contigs.fa file? I'm wondering if it's gone beyond the 2GB size limit....