CoverM
Usage of --sharded
Hi, thanks again for developing this tool. I have a quick question about the use of the --sharded option. I believe my reference fits the use case, where a single index would not fit into memory. My expectation was that if I pass multiple references to the -r option, the reads would be mapped to each reference, and the best match across the two would then be selected. My current command is:
REF_FILE1=02_map_binned/side_test/bwa_idx/part1
REF_FILE2=02_map_binned/side_test/bwa_idx/part2
BAM_DIR=02_bam_files
TMPDIR=tmp_dir coverm contig -m mean -r $REF_FILE1 $REF_FILE2 \
--output-file test_shard.tsv -p bwa-mem2 --sharded \
--min-read-percent-identity 0.95 --min-read-aligned-percent 0.95 -t 24 \
--bam-file-cache-directory $BAM_DIR --no-zeros --single unbinned_nr_genes_00[345].ffn.gz
This produces the following output:
[2023-06-18T04:20:00Z INFO bird_tool_utils::clap_utils] CoverM version 0.6.1
[2023-06-18T04:20:00Z INFO coverm] Using min-read-percent-identity 95%
[2023-06-18T04:20:00Z INFO coverm] Using min-read-aligned-percent 95%
[2023-06-18T04:20:00Z INFO coverm] Writing output to file: test_shard.tsv
[2023-06-18T04:20:00Z INFO coverm] Using min-covered-fraction 0%
[2023-06-18T04:20:01Z INFO bird_tool_utils::external_command_checker] Found bwa-mem2 version 2.2.1
[2023-06-18T04:20:01Z INFO bird_tool_utils::external_command_checker] Found samtools version 1.16.1
[2023-06-18T04:20:01Z INFO coverm] Writing BAM files to already existing directory 02_bam_files
[2023-06-18T04:20:01Z INFO coverm::mapping_index_maintenance] BWA index appears to be complete, so going ahead and using it.
[2023-06-18T04:20:01Z INFO coverm] Caching BAM file to 02_bam_files/part1.unbinned_nr_genes_003.ffn.gz.bam
[2023-06-18T04:20:01Z INFO coverm] Caching BAM file to 02_bam_files/part1.unbinned_nr_genes_004.ffn.gz.bam
[2023-06-18T04:20:01Z INFO coverm] Caching BAM file to 02_bam_files/part1.unbinned_nr_genes_005.ffn.gz.bam
[2023-06-18T04:20:01Z INFO coverm::mapping_index_maintenance] BWA index appears to be complete, so going ahead and using it.
[2023-06-18T04:20:01Z INFO coverm] Caching BAM file to 02_bam_files/part2.unbinned_nr_genes_003.ffn.gz.bam
[2023-06-18T04:20:01Z INFO coverm] Caching BAM file to 02_bam_files/part2.unbinned_nr_genes_004.ffn.gz.bam
[2023-06-18T04:20:01Z INFO coverm] Caching BAM file to 02_bam_files/part2.unbinned_nr_genes_005.ffn.gz.bam
[2023-06-18T04:20:30Z INFO coverm::contig] In sample 'part1/unbinned_nr_genes_003.ffn.gz', found 4666 reads mapped out of 567290 total (0.82%)
[2023-06-18T04:20:57Z INFO coverm::contig] In sample 'part1/unbinned_nr_genes_004.ffn.gz', found 4967 reads mapped out of 567387 total (0.88%)
[2023-06-18T04:21:26Z INFO coverm::contig] In sample 'part1/unbinned_nr_genes_005.ffn.gz', found 4801 reads mapped out of 567706 total (0.85%)
[2023-06-18T04:21:54Z ERROR coverm::coverage_takers] Found a difference amongst the reference sets used for mapping. For this (non-streaming) usage of CoverM, all BAM files must have the same set of reference sequences. Previous entry was contig-196890-spa-t-S25_9i6072_2, new is contig-27673-spa-t-S57_2b3234_5
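For reference, the error complains that the BAM files do not share the same set of reference sequences. A rough way to check that this only reflects the two index parts holding different contigs (which I would expect for a split reference) is to compare the @SQ names in the cached BAM headers. This is a minimal sketch, not anything CoverM provides: sq_names is a hypothetical helper of my own, the BAM paths are the ones from the log above, and it assumes samtools is on the PATH.

```shell
# Hypothetical helper (not part of CoverM or samtools):
# read SAM header text on stdin and print the sorted reference
# sequence names taken from the SN: field of each @SQ line.
sq_names() {
    awk -F'\t' '$1 == "@SQ" {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^SN:/) print substr($i, 4)
    }' | sort
}

# Intended usage, assuming the cached BAM files from the log exist:
#   samtools view -H 02_bam_files/part1.unbinned_nr_genes_003.ffn.gz.bam | sq_names > part1.names
#   samtools view -H 02_bam_files/part2.unbinned_nr_genes_003.ffn.gz.bam | sq_names > part2.names
#   comm -3 part1.names part2.names | head   # names found in only one of the two parts
```

If comm -3 prints names from both sides, the two parts really do carry different reference sets, which matches what the error message reports.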
What is the proper usage of CoverM in this case, with a two-part index?