kmermaid
Add option to make per-cell bam files
Currently, if one wants to count reads with differential hashes in genes, one needs to grep/search the ENTIRE 22-gigabyte channel bam file for one single cell (out of ~700,000), which is extremely inefficient. So let's do this work up front. After filtering for the good barcodes, add the option to create per-cell bam files, which are useful for nf-predictorthologs.
script:
barcode_pattern = "CB:Z:${cell_barcode}-1|XC:Z:${cell_barcode}"
"""
samtools view ${channel_bam} \\
| rg --threads ${task.cpus} '${barcode_pattern}' - \\
| cat ${header_sam} - \\
| samtools view -Sb - > ${cell_barcode_bam}
"""
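To illustrate the "do this work up front" idea, here is a minimal Python sketch that groups SAM records by cell barcode in a single pass instead of re-scanning the channel file once per cell. It is not the pipeline's implementation: the function names `cell_barcode` and `group_by_barcode` are hypothetical, the input is assumed to be SAM text lines (e.g. the output of `samtools view`), and the barcode is assumed to sit in a `CB:Z:` tag with a `-1` suffix or in an `XC:Z:` tag, as in the pattern above.

```python
from collections import defaultdict


def cell_barcode(sam_line):
    """Return the cell barcode from a CB:Z: or XC:Z: tag, or None if absent."""
    fields = sam_line.rstrip("\n").split("\t")
    # A SAM record has 11 mandatory columns; optional tags follow them.
    for tag in fields[11:]:
        if tag.startswith("CB:Z:"):
            value = tag[len("CB:Z:"):]
            # Assumption: CB values look like "<barcode>-1"; drop the suffix.
            return value[:-2] if value.endswith("-1") else value
        if tag.startswith("XC:Z:"):
            return tag[len("XC:Z:"):]
    return None


def group_by_barcode(sam_lines, good_barcodes):
    """Group alignment lines by barcode in one pass, keeping only good barcodes."""
    per_cell = defaultdict(list)
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header lines are handled separately (cf. header_sam above)
        barcode = cell_barcode(line)
        if barcode in good_barcodes:
            per_cell[barcode].append(line)
    return dict(per_cell)
```

This sketch keeps every matching record in memory, which would not be practical for a 22-gigabyte channel file. A real version would more likely stream each record to a per-cell output as it is read.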
@lekhakaranam this may be a good feature to add after the template merge (#93)
Looks like this PR was opened but closed a while ago: https://github.com/nf-core/kmermaid/pull/97
Oh yeah, I think there were some merge/rebase issues.