blobtools
Tip: some jq code to get list of "good" contigs
Hello,
This is mostly a PSA, as the following took me way too long to work out myself. Perhaps the authors could add it to the docs somewhere appropriate.
To filter a set of contigs based on the GC content and coverage (a la the blobplot), one can use the following jq command:
jq -r '.dict_of_blobs[] | select((.covs.bam0 > 10) and (.gc > 0.4)) | .name' \
< path/to/something.blobDB.json \
> goodcontigs.txt
Here, I use a coverage threshold of 10 in the first BAM and a minimum GC of 0.4. Adjust these thresholds to match your own blobplot. To filter on additional BAMs, add a clause such as (.covs.bam1 > 23) and inside the select() call. The resulting goodcontigs.txt is a plain-text list of contig names, one per line, compatible with blobtools seqfilter.
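For anyone without jq to hand, here is a rough Python sketch of the same filter. It assumes the blobDB JSON has the layout the jq command above implies: a top-level dict_of_blobs object whose values each carry name, gc and covs. The function name good_contigs and the example records are made up for illustration and are not part of blobtools.

```python
import json


def good_contigs(blobdb, min_covs, min_gc):
    """Return names of contigs above every coverage cutoff and the GC cutoff.

    blobdb   -- parsed blobDB JSON (assumed layout: blobdb["dict_of_blobs"]
                maps to records with "name", "gc" and "covs" keys)
    min_covs -- dict mapping a BAM key (e.g. "bam0") to its coverage cutoff
    min_gc   -- minimum GC fraction
    """
    names = []
    for blob in blobdb["dict_of_blobs"].values():
        # Every listed BAM must exceed its own cutoff, like chaining
        # (.covs.bam0 > 10) and (.covs.bam1 > 23) in the jq select().
        covs_ok = all(blob["covs"][bam] > cutoff for bam, cutoff in min_covs.items())
        if covs_ok and blob["gc"] > min_gc:
            names.append(blob["name"])
    return names


# Made-up records, only to show the behaviour of the filter.
example = {
    "dict_of_blobs": {
        "contig_1": {"name": "contig_1", "gc": 0.45, "covs": {"bam0": 50, "bam1": 30}},
        "contig_2": {"name": "contig_2", "gc": 0.35, "covs": {"bam0": 80, "bam1": 40}},
        "contig_3": {"name": "contig_3", "gc": 0.50, "covs": {"bam0": 5, "bam1": 30}},
        "contig_4": {"name": "contig_4", "gc": 0.50, "covs": {"bam0": 20, "bam1": 10}},
    }
}

print("\n".join(good_contigs(example, {"bam0": 10, "bam1": 23}, 0.4)))
```

For a real run, load the file with json.load(open("path/to/something.blobDB.json")) in place of the example dict and write the returned names to goodcontigs.txt, one per line.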
Thanks for a great tool, K