
RE: reads_to_ctg_map.py

Open Malabady opened this issue 8 years ago • 10 comments

Hi, the script "reads_to_ctg_map.py" fails at the sort stage and gives the following message:

Convert SAM to BAM... Done.
Time elapsed for SAM to BAM conversion: 0:03:52.958605

Sort BAM...Traceback (most recent call last):
  File "/home/malabady/Programs/local/scaffolding-tools/BESST/scripts/reads_to_ctg_map.py", line 319, in <module>
    tmp_path, args.bwa_path, args.clear)
  File "/home/malabady/Programs/local/scaffolding-tools/BESST/scripts/reads_to_ctg_map.py", line 186, in bwa_mem
    pysam.sort(bwa_output + ".bam", output_path)
  File "/usr/local/lib/python2.7/dist-packages/pysam/utils.py", line 65, in call
    "\n".join(stderr)))
pysam.utils.SamtoolsError: 'samtools returned with error 1: stdout=, stderr=[bam_sort] Use -T PREFIX / -o FILE to specify temporary and final output files\n\nUsage: samtools sort [options...] [in.bam]\n\nOptions:\n\n -l INT Set compression level, from 0 (uncompressed) to 9 (best)\n\n -m INT Set maximum memory per thread; suffix K/M/G recognized [768M]\n\n -n Sort by read name\n\n -o FILE Write final output to FILE rather than standard output\n\n -T PREFIX Write temporary files to PREFIX.nnnn.bam\n\n -@, --threads INT\n\n Set number of sorting and compression threads [1]\n\n --input-fmt-option OPT[=VAL]\n\n Specify a single input file format option in the form\n\n of OPTION or OPTION=VALUE\n\n -O, --output-fmt FORMAT[,OPT[=VAL]]...\n\n Specify output format (SAM, BAM, CRAM)\n\n --output-fmt-option OPT[=VAL]\n\n Specify a single output file format option in the form\n\n of OPTION or OPTION=VALUE\n\n --reference FILE\n\n Reference sequence FASTA FILE [null]\n'

Any suggestions as to what is wrong? Thanks, Magdy

Malabady avatar Jun 13 '16 17:06 Malabady

Hi,

Seems to be something wrong with the command passed to samtools. Please provide me with the command line you were using when calling reads_to_ctg_map.py.

ksahlin avatar Jun 13 '16 18:06 ksahlin

Thanks! Here it is:

reads_to_ctg_map.py --threads 2
/home/malabady/Projects/Ab10/Ab10Only_reads/reads/passQC_corr/mp6k75_good_1.cor.fastq.gz
/home/malabady/Projects/Ab10/Ab10Only_reads/reads/passQC_corr/mp6k75_good_2.cor.fastq.gz
/home/malabady/Projects/Ab10/Ab10Only_reads/assembly/spades/scaffolds.min0.5k.fasta
mp6k75.bwa.bam \

Malabady avatar Jun 13 '16 18:06 Malabady

There is no backslash at the end of the last line.

Malabady avatar Jun 13 '16 19:06 Malabady

It seems that you're using an old version of that script, because the line numbers in the traceback provided by Python don't match the current script. I would suggest updating the script, or preferably the whole repository if you intend to scaffold with BESST. Please let me know if mapping with the newest version of the script also returns an error.

ksahlin avatar Jun 13 '16 19:06 ksahlin

I just got a fresh clone of BESST using git clone and reran the script, but it produced the same error (see below). Earlier, I looked up this error online and found a suggested solution at this link (https://github.com/pysam-developers/pysam/issues/291). They say the error is due to the new output parameters of samtools. I modified the pysam.sort line in the script accordingly (just added "-o" in the pysam.sort statement), but it didn't solve the problem. The type of error was different, though (see the second passage below).


Sort BAM...Traceback (most recent call last):
  File "/home/malabady/Programs/local/scaffolding-tools/BESST/scripts/reads_to_ctg_map.py", line 317, in <module>
    tmp_path, args.bwa_path, args.clear)
  File "/home/malabady/Programs/local/scaffolding-tools/BESST/scripts/reads_to_ctg_map.py", line 184, in bwa_mem
    pysam.sort(bwa_output + ".bam", output_path)
  File "/usr/local/lib/python2.7/dist-packages/pysam/utils.py", line 65, in call
    "\n".join(stderr)))

pysam.utils.SamtoolsError: 'samtools returned with error 1: stdout=, stderr=[bam_sort] Use -T PREFIX / -o FILE to specify temporary and final output files\n\nUsage: samtools sort [options...] [in.bam]\n\nOptions:\n\n -l INT Set compression level, from 0 (uncompressed) to 9 (best)\n\n -m INT Set maximum memory per thread; suffix K/M/G recognized [768M]\n\n -n Sort by read name\n\n -o FILE Write final output to FILE rather than standard output\n\n -T PREFIX Write temporary files to PREFIX.nnnn.bam\n\n -@, --threads INT\n\n Set number of sorting and compression threads [1]\n\n --input-fmt-option OPT[=VAL]\n\n Specify a single input file format option in the form\n\n of OPTION or OPTION=VALUE\n\n -O, --output-fmt FORMAT[,OPT[=VAL]]...\n\n Specify output format (SAM, BAM, CRAM)\n\n --output-fmt-option OPT[=VAL]\n\n Specify a single output file format option in the form\n\n of OPTION or OPTION=VALUE\n\n --reference FILE\n\n Reference sequence FASTA FILE [null]\n'

---error after modifying the script----

Sort BAM...[E::hts_hopen] fail to open file 'mp6k75.bwa.bam'
[E::hts_open_format] fail to open file 'mp6k75.bwa.bam'
Traceback (most recent call last):
  File "/home/malabady/Programs/local/scaffolding-tools/BESST/scripts/reads_to_ctg_map.py", line 319, in <module>
    tmp_path, args.bwa_path, args.clear)
  File "/home/malabady/Programs/local/scaffolding-tools/BESST/scripts/reads_to_ctg_map.py", line 184, in bwa_mem
    pysam.sort("-o", bwa_output + ".bam", output_path)
  File "/usr/local/lib/python2.7/dist-packages/pysam/utils.py", line 65, in call
    "\n".join(stderr)))
pysam.utils.SamtoolsError: "samtools returned with error 1: stdout=, stderr=[bam_sort_core] fail to open 'mp6k75.bwa.bam': Is a directory\n"

Malabady avatar Jun 13 '16 19:06 Malabady

Great that you solved it; I will implement this fix. The new bug looks easy to solve: mp6k75.bwa.bam seems to be a directory (see the last line of the error message). Remove this folder or, alternatively, specify a new output name.

ksahlin avatar Jun 13 '16 19:06 ksahlin

But the mp6k75.bwa.bam directory is generated by the script during the run; I didn't create it. I think the script needs to be modified so that it does not produce this folder!

Malabady avatar Jun 13 '16 20:06 Malabady

this part: pysam.sort("-o", bwa_output + ".bam", output_path)

Malabady avatar Jun 13 '16 20:06 Malabady
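
For reference, `samtools sort` (1.3 and later) takes the output file after `-o` and the input BAM as the trailing positional argument, so the call quoted above passes `bwa_output + ".bam"` as the output and `output_path` as the input. The following is only a minimal sketch of a pre-flight check that would flag a directory in either position before samtools is invoked; the helper name `check_sort_paths` is hypothetical and not part of BESST or pysam.

```python
import os


def check_sort_paths(input_bam, output_bam):
    """Return a list of human-readable problems; the list is empty if none were found.

    Hypothetical helper for illustration only (not part of BESST or pysam).
    """
    problems = []
    if os.path.isdir(input_bam):
        problems.append("input is a directory: " + input_bam)
    elif not os.path.isfile(input_bam):
        problems.append("input does not exist: " + input_bam)
    if os.path.isdir(output_bam):
        problems.append("output path is a directory: " + output_bam)
    return problems
```

Calling it with the two paths just before the sort step and printing any returned messages would make a swapped argument order or a stray directory visible without reading the samtools error output.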

I have the same issue with the latest version of BESST. The problem is, as @Malabady wrote, that the samtools API was changed (in v.0.9.0 probably). The following line fixed the issue, and sorting the BAM file works: pysam.sort("-o", output_path + ".bam", bwa_output + ".bam") but it fails downstream:

Sort BAM...Done.
Time elapsed for BAM sorting: 0:00:00.089828

Index BAM...
Traceback (most recent call last):
  File "/home/kaiimalab/gatb-minia-pipeline/BESST/scripts/reads_to_ctg_map.py", line 317, in <module>
    tmp_path, args.bwa_path, args.clear)
  File "/home/kaiimalab/gatb-minia-pipeline/BESST/scripts/reads_to_ctg_map.py", line 193, in bwa_mem
    stdout.flush()
ValueError: I/O operation on closed file

Probably pysam v.0.9.0 has more changes. I am not a Python programmer, so I stopped debugging here. Please add support for pysam v.0.9.0. Thanks, Reut

reutha avatar Aug 11 '16 09:08 reutha
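
The `ValueError: I/O operation on closed file` in the traceback above is what Python raises when `flush()` is called on a stream that has already been closed. The thread does not confirm what closed the stream, so the following is only a defensive sketch under that assumption; `safe_flush` is a hypothetical helper, not something BESST or pysam provides.

```python
def safe_flush(stream):
    """Flush `stream` unless it is already closed.

    Returns True if the stream was flushed and False if it was skipped.
    Hypothetical helper for illustration only.
    """
    # File-like objects expose a `closed` attribute; treat a missing one as open.
    if getattr(stream, "closed", False):
        return False
    stream.flush()
    return True


# Possible use in place of a bare stdout.flush():
#   safe_flush(sys.stdout)
```

This only hides the symptom. If an earlier library call really does close the stream, later prints to it would still fail.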

I also got a similar problem using the latest BESST script. My samtools version was: samtools 1.3.1-41-gce4a601, using htslib 1.3.2-135-g50db54b

Then I changed the problematic line in reads_to_ctg_map.py to: pysam.sort("-O", "BAM", "-o", output_path + ".bam", bwa_output + ".bam")

I tested on a small dataset and it ran fine.

habibr avatar Nov 16 '16 04:11 habibr
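
The two call styles discussed in this thread can be put side by side. Older samtools releases accepted `samtools sort in.bam out_prefix` and wrote `out_prefix.bam`, while samtools 1.3 and later expect the output file to be named with `-o`. Below is a minimal sketch that builds the argument list for either style; the function name `samtools_sort_args` and the `new_style` flag are hypothetical, and choosing the right value for the flag is left to the caller.

```python
def samtools_sort_args(input_bam, output_prefix, new_style=True):
    """Build the positional arguments for a samtools sort call.

    new_style=True  -> samtools >= 1.3 syntax, as in the fix quoted above
    new_style=False -> legacy "in.bam out_prefix" syntax
    Hypothetical helper for illustration only.
    """
    if new_style:
        return ["-O", "BAM", "-o", output_prefix + ".bam", input_bam]
    return [input_bam, output_prefix]


# Possible use with pysam, assuming it is installed:
#   pysam.sort(*samtools_sort_args(bwa_output + ".bam", output_path))
```

Keeping the argument construction in one place means that only this function needs to change if the sort syntax differs between installations.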