RagTag
Error when running ragtag scaffold command
Hi! I have attached the initial ref.fa and contigs fasta files I am using. Please let me know if there's any other files you need!
When running ragtag correct, I received no error, however when I then ran ragtag scaffold I received the following error:
[denovo_files.zip](https://github.com/malonge/RagTag/files/5294254/denovo_files.zip)
[jvelez@acf-login5 gwas_analyses]$ ragtag.py scaffold -u ref2.fa ragtag_output/denovo_contigs.corrected.fasta
Mon Sep 28 14:14:36 2020 --- RagTag v1.0.1
Mon Sep 28 14:14:36 2020 --- CMD: /nics/b/home/jvelez/miniconda2/bin/ragtag_scaffold.py -u ref2.fa ragtag_output/denovo_contigs.corrected.fasta
Mon Sep 28 14:14:36 2020 --- Mapping the query genome to the reference genome
Mon Sep 28 14:14:36 2020 --- Retaining pre-existing file: /lustre/haven/user/jvelez/gwas_analyses/ragtag_output/query_against_ref.paf
Mon Sep 28 14:14:36 2020 --- Reading whole genome alignments
Mon Sep 28 14:14:36 2020 --- Filtering and merging alignments
Mon Sep 28 14:14:36 2020 --- Ordering and orienting query sequences
Mon Sep 28 14:14:36 2020 --- Writing scaffolds
Mon Sep 28 14:14:36 2020 --- Writing: /lustre/haven/user/jvelez/gwas_analyses/ragtag_output/ragtag.scaffolds.agp
Mon Sep 28 14:14:36 2020 --- Retaining pre-existing file: /lustre/haven/user/jvelez/gwas_analyses/ragtag_output/ragtag.scaffolds.agp
Mon Sep 28 14:14:36 2020 --- Running: ragtag_agp2fasta.py /lustre/haven/user/jvelez/gwas_analyses/ragtag_output/ragtag.scaffolds.agp /lustre/haven/user/jvelez/gwas_analyses/ragtag_output/denovo_contigs.corrected.fasta > /lustre/haven/user/jvelez/gwas_analyses/ragtag_output/ragtag.scaffolds.fasta
Traceback (most recent call last):
File "/nics/b/home/jvelez/miniconda2/bin/ragtag_agp2fasta.py", line 74, in <module>
main()
File "/nics/b/home/jvelez/miniconda2/bin/ragtag_agp2fasta.py", line 67, in main
sys.stdout.write(fai.fetch(agp_line.comp))
TypeError: write() argument must be str, not bytes
Traceback (most recent call last):
File "/nics/b/home/jvelez/miniconda2/bin/ragtag_scaffold.py", line 528, in <module>
main()
File "/nics/b/home/jvelez/miniconda2/bin/ragtag_scaffold.py", line 514, in main
run_o(cmd, output_path + "ragtag.scaffolds.fasta")
File "/nics/b/home/jvelez/miniconda2/lib/python3.6/site-packages/ragtag_utilities/utilities.py", line 91, in run_o
raise RuntimeError('Failed : %s > %s' % (" ".join(cmd), out))
RuntimeError: Failed : ragtag_agp2fasta.py /lustre/haven/user/jvelez/gwas_analyses/ragtag_output/ragtag.scaffolds.agp /lustre/haven/user/jvelez/gwas_analyses/ragtag_output/denovo_contigs.corrected.fasta > /lustre/haven/user/jvelez/gwas_analyses/ragtag_output/ragtag.scaffolds.fasta
Hi there,
At a glance, it looks like the AGP file is incorrectly formatted (perhaps compressed?). Can you delete ragtag.scaffolds.agp and try again?
Thanks
Thank you for the reply! I tried that and unfortunately had the same result:
[jvelez@acf-login6 gwas_analyses]$ cd ragtag_output
[jvelez@acf-login6 ragtag_output]$ ls
c_query_against_ref.paf query_against_ref.paf.log
c_query_against_ref.paf.log ragtag.confidence.txt
denovo_contigs.corrected.fasta ragtag.correction.agp
denovo_contigs.corrected.fasta.fai ragtag.scaffolds.agp
query_against_ref.paf ragtag.scaffolds.fasta
[jvelez@acf-login6 ragtag_output]$ rmv ragtag.scaffolds.agp
-bash: rmv: command not found
[jvelez@acf-login6 ragtag_output]$ rm ragtag.scaffolds.agp
[jvelez@acf-login6 ragtag_output]$ ls
c_query_against_ref.paf query_against_ref.paf.log
c_query_against_ref.paf.log ragtag.confidence.txt
denovo_contigs.corrected.fasta ragtag.correction.agp
denovo_contigs.corrected.fasta.fai ragtag.scaffolds.fasta
query_against_ref.paf
[jvelez@acf-login6 ragtag_output]$ cd ..
[jvelez@acf-login6 gwas_analyses]$ ragtag.py scaffold ref2.fa ragtag_output/denovo_contigs.corrected.fasta
Mon Sep 28 19:43:12 2020 --- RagTag v1.0.1
Mon Sep 28 19:43:12 2020 --- CMD: /nics/b/home/jvelez/miniconda2/bin/ragtag_scaffold.py ref2.fa ragtag_output/denovo_contigs.corrected.fasta
Mon Sep 28 19:43:12 2020 --- WARNING: Without '-u' invoked, some component/object AGP pairs might share the same ID. Some external programs/databases don't like this. To ensure valid AGP format, use '-u'.
Mon Sep 28 19:43:12 2020 --- Mapping the query genome to the reference genome
Mon Sep 28 19:43:12 2020 --- Retaining pre-existing file: /lustre/haven/user/jvelez/gwas_analyses/ragtag_output/query_against_ref.paf
Mon Sep 28 19:43:12 2020 --- Reading whole genome alignments
Mon Sep 28 19:43:12 2020 --- Filtering and merging alignments
Mon Sep 28 19:43:13 2020 --- Ordering and orienting query sequences
Mon Sep 28 19:43:13 2020 --- Writing scaffolds
Mon Sep 28 19:43:13 2020 --- Writing: /lustre/haven/user/jvelez/gwas_analyses/ragtag_output/ragtag.scaffolds.agp
Mon Sep 28 19:43:19 2020 --- Running: ragtag_agp2fasta.py /lustre/haven/user/jvelez/gwas_analyses/ragtag_output/ragtag.scaffolds.agp /lustre/haven/user/jvelez/gwas_analyses/ragtag_output/denovo_contigs.corrected.fasta > /lustre/haven/user/jvelez/gwas_analyses/ragtag_output/ragtag.scaffolds.fasta
Traceback (most recent call last):
File "/nics/b/home/jvelez/miniconda2/bin/ragtag_agp2fasta.py", line 74, in <module>
main()
File "/nics/b/home/jvelez/miniconda2/bin/ragtag_agp2fasta.py", line 67, in main
sys.stdout.write(fai.fetch(agp_line.comp))
TypeError: write() argument must be str, not bytes
Traceback (most recent call last):
File "/nics/b/home/jvelez/miniconda2/bin/ragtag_scaffold.py", line 528, in <module>
main()
File "/nics/b/home/jvelez/miniconda2/bin/ragtag_scaffold.py", line 514, in main
run_o(cmd, output_path + "ragtag.scaffolds.fasta")
File "/nics/b/home/jvelez/miniconda2/lib/python3.6/site-packages/ragtag_utilities/utilities.py", line 91, in run_o
raise RuntimeError('Failed : %s > %s' % (" ".join(cmd), out))
RuntimeError: Failed : ragtag_agp2fasta.py /lustre/haven/user/jvelez/gwas_analyses/ragtag_output/ragtag.scaffolds.agp /lustre/haven/user/jvelez/gwas_analyses/ragtag_output/denovo_contigs.corrected.fasta > /lustre/haven/user/jvelez/gwas_analyses/ragtag_output/ragtag.scaffolds.fasta
Thanks for trying that. I can't quite think of what might be going on. Basically, pysam is just trying to read the query assembly sequence, but it is getting bytes instead of strings. Does denovo_contigs.corrected.fasta look like a properly formatted FASTA file?
Would you be willing to share ref2.fa and denovo_contigs.corrected.fasta? I can run things on my end and try to reproduce the error.
Thanks
Sure! Looking at the corrected file, it looks like a "b" has been added to each line. I'm not sure why that would be, though. I'm attaching both the preliminary file and the file produced by the correct command for you to take a look: troubleshooting_files.zip
Hi there,
Thanks for sharing the data. It is strange that the sequences are written as bytes. Unfortunately, I was unable to reproduce the error on my end. Perhaps you can try deleting everything and starting from scratch (including rerunning correction). You can also use -w to overwrite everything. Before doing that, please double-check the installation page to make sure that you're using python3 and the correct versions of the dependencies.
Thanks
Thank you for taking a look! I'll play around with it.
I had a similar issue with another tool. It might be a Python version issue. I could only fix mine by changing the way the script opens and parses the file. See if you can find something similar to this:
f = open(outfile, 'wb')
and change it to this:
f = open(outfile, 'w')
That fixed mine.
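To illustrate why the file mode matters here, the following minimal sketch tries each combination of mode and payload type on a scratch file. The helper name `try_write` is made up for this example and is not part of RagTag. A text-mode handle (`'w'`, like `sys.stdout`) accepts only `str`, and a binary-mode handle (`'wb'`) accepts only bytes-like objects; a mismatch raises `TypeError`, which is the kind of error shown in the traceback above.

```python
import os
import tempfile


def try_write(mode, payload):
    """Write payload to a scratch file opened in `mode`; return 'ok' or 'TypeError'."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, mode) as handle:
            handle.write(payload)
        return "ok"
    except TypeError:
        return "TypeError"
    finally:
        os.remove(path)


results = {
    ("w", "str"): try_write("w", "ACGT\n"),       # text mode + str
    ("w", "bytes"): try_write("w", b"ACGT\n"),    # text mode + bytes
    ("wb", "bytes"): try_write("wb", b"ACGT\n"),  # binary mode + bytes
    ("wb", "str"): try_write("wb", "ACGT\n"),     # binary mode + str
}

for (mode, kind), outcome in results.items():
    print(f"open(..., {mode!r}).write({kind}): {outcome}")
```

Switching the mode only helps when the script controls the file handle; when it writes to `sys.stdout`, the data itself has to be converted to `str`.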
ragtag_agp2fa.py, line 71:
sys.stdout.write(fai.fetch(agp_line.comp, agp_line.comp_beg-1, agp_line.comp_end))
change to:
sys.stdout.write(str(fai.fetch(agp_line.comp, agp_line.comp_beg-1, agp_line.comp_end)))
Why is sys.stdout.write(reverse_complement(fai.fetch(agp_line.comp, agp_line.comp_beg-1, agp_line.comp_end))) OK, though?
Hello, I'm also facing this issue with RagTag v2.1.0 installed through conda. I ended up adding a str() at lines 69 & 71 of ragtag_agp2fa.py as per @wq-ls's suggestion, and now it prints my fasta file with a 'b' at the front, so it looks like
>RefSeqXXX
b'GATTACA'
I then opened the fasta in VSC and used find & replace to remove all occurrences of b' and '. Hope this helps! And hope that the developers can solve this soon :)
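The manual find & replace described above could also be scripted. The sketch below is only an illustration under the assumption that every affected sequence line contains one or more `b'...'` wrappers and that header lines (starting with `>`) should be left untouched; the function names are hypothetical. It would also rewrite a genuine lowercase `b` that happens to sit directly before a quote character, so it is best applied only to files known to have this damage.

```python
import re

# Matches the printed form of a Python bytes object, e.g. b'GATTACA'
_BYTES_WRAPPER = re.compile(r"b'([^']*)'")


def strip_bytes_wrappers(line):
    """Remove b'...' wrappers from one sequence line; leave header lines unchanged."""
    text = line.rstrip("\n")
    if text.startswith(">"):
        return text
    return _BYTES_WRAPPER.sub(r"\1", text)


def clean_fasta_lines(lines):
    """Apply strip_bytes_wrappers to every line of a FASTA file."""
    return [strip_bytes_wrappers(line) for line in lines]


damaged = [">RefSeqXXX\n", "b'GATTACA'\n", "b'AAA'NNNb'CCC'\n"]
cleaned = clean_fasta_lines(damaged)
print("\n".join(cleaned))
```

Decoding the bytes before they are written avoids the need for this cleanup in the first place.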
Hey, bro, try this agp2fa.pl.
I also modified the script the way you described, as shown in the picture below, but unfortunately there was a new error. Could you please help me check it?
if agp_line.orientation == "-":
    sys.stdout.write(reverse_complement(fai.fetch(agp_line.comp, agp_line.comp_beg-1, agp_line.comp_end)))
else:
    sys.stdout.write(str(fai.fetch(agp_line.comp, agp_line.comp_beg-1, agp_line.comp_end)))
New error:
File "/share/home/ouc_tengmx/miniconda3/envs/lhy/lib/python3.6/site-packages/ragtag_utilities/utilities.py", line 126, in run_oae
raise RuntimeError('Failed : %s > %s 2> %s. Check stderr file for details.' % (" ".join(cmd), out, err))
RuntimeError: Failed : ragtag_agp2fa.py /share/home/ouc_tengmx/lhy/ragtag_output/ragtag_output/ragtag.scaffold.agp /share/home/ouc_tengmx/lhy/ragtag_output/ragtag.correct.fasta > /share/home/ouc_tengmx/lhy/ragtag_output/ragtag_output/ragtag.scaffold.fasta 2> /share/home/ouc_tengmx/lhy/ragtag_output/ragtag_output/ragtag.scaffold.err. Check stderr file for details.
@winanonanona @wq-ls
I have no idea. In my opinion, you don't really have to worry about it. The only problem is in the final step, which converts the AGP file into the final scaffold FASTA. You already have a correct AGP file, so you can simply convert your original contig.genome.fa with it. Try another script: https://github.com/fanagislab/EndHiC/blob/master/agp2fasta.pl (Usage: perl agp2fasta.pl gfa.cluster.agp contigs.fasta > gfa.cluster.agp.fasta)
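For readers who want to see what such a conversion involves, here is a minimal Python sketch of the AGP-to-FASTA step. It is not the RagTag implementation and not a port of the Perl script above; it assumes tab-separated AGP lines that are already ordered by part number, component type `N` or `U` for gaps, and contig sequences held in a plain dictionary. All names (`agp_to_fasta`, `ctgA`, `scaf1`, and so on) are made up for the example.

```python
# Complement table for plain nucleotides; other characters pass through unchanged.
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a nucleotide string."""
    return seq.translate(COMPLEMENT)[::-1]


def agp_to_fasta(agp_lines, contigs):
    """Build {object_name: sequence} from AGP lines and a {contig_name: sequence} dict."""
    parts = {}
    for line in agp_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        fields = line.rstrip("\n").split("\t")
        obj, comp_type = fields[0], fields[4]
        if comp_type in ("N", "U"):
            # Gap line: column 6 holds the gap length
            piece = "N" * int(fields[5])
        else:
            # Component line: ID, 1-based inclusive start/end, orientation
            comp_id, beg, end, orient = fields[5], int(fields[6]), int(fields[7]), fields[8]
            piece = contigs[comp_id][beg - 1:end]
            if orient == "-":
                piece = reverse_complement(piece)
        parts.setdefault(obj, []).append(piece)
    return {name: "".join(pieces) for name, pieces in parts.items()}


agp = [
    "scaf1\t1\t4\t1\tW\tctgA\t1\t4\t+",
    "scaf1\t5\t7\t2\tN\t3\tscaffold\tyes\talign_genus",
    "scaf1\t8\t10\t3\tW\tctgB\t1\t3\t-",
]
contigs = {"ctgA": "ACGT", "ctgB": "AAC"}
scaffolds = agp_to_fasta(agp, contigs)

for name, seq in scaffolds.items():
    print(f">{name}\n{seq}")
```

Because the contigs are ordinary `str` values here, the bytes/str mismatch discussed in this thread cannot occur in this sketch.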
@oucstar
if agp_line.orientation == "-":
    sys.stdout.write(reverse_complement(str(fai.fetch(agp_line.comp, agp_line.comp_beg-1, agp_line.comp_end))))
else:
    sys.stdout.write(str(fai.fetch(agp_line.comp, agp_line.comp_beg-1, agp_line.comp_end)))
Try this? That way the str() is on both line 69 and line 71. @oucstar
@winanonanona, first of all, thank you very much for your reply. I followed these steps, but I still get the following error:
Traceback (most recent call last):
File "/share/home/ouc_tengmx/miniconda3/envs/lhy/bin/ragtag_agp2fa.py", line 79, in
@wq-ls, thank you for your reply, but the fasta file I generated is small and doesn't seem quite correct:
Traceback (most recent call last):
File "/share/home/ouc_tengmx/miniconda3/envs/lhy/bin/ragtag_agp2fa.py", line 79, in
I was able to fix the byte/string issue by installing RagTag v2.1.0 in a conda environment with an older version of python (3.5).
@velezjm @malonge Have you solved the problem? I also encountered this issue, and I found that the query fasta index was old. So I regenerated the fasta index (faidx) and re-ran the ragtag.py scaffold command, and the issue was solved.
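One way to act on the stale-index observation above is to delete an index that is older than its FASTA before re-running the scaffold step. The sketch below assumes the index sits next to the FASTA with a `.fai` suffix (as in the directory listing earlier in this thread) and that the reader rebuilds a missing index, which pysam's `FastaFile` does when it can write to the directory. The function name `remove_stale_fai` is hypothetical, and the demonstration uses placeholder files rather than real sequence data.

```python
import os
import tempfile


def remove_stale_fai(fasta_path):
    """Delete fasta_path + '.fai' if it is older than the FASTA; return True if removed."""
    fai_path = fasta_path + ".fai"
    if not os.path.exists(fai_path):
        return False
    if os.path.getmtime(fai_path) < os.path.getmtime(fasta_path):
        os.remove(fai_path)
        return True
    return False


# Demonstration on scratch files; the contents are placeholders, only timestamps matter.
with tempfile.TemporaryDirectory() as tmp:
    fasta = os.path.join(tmp, "contigs.fasta")
    fai = fasta + ".fai"
    for path in (fasta, fai):
        with open(path, "w") as handle:
            handle.write("placeholder\n")

    os.utime(fai, (1000, 1000))    # index last modified before the FASTA
    os.utime(fasta, (2000, 2000))
    removed_when_stale = remove_stale_fai(fasta)
    index_gone = not os.path.exists(fai)

    with open(fai, "w") as handle:
        handle.write("placeholder\n")
    os.utime(fai, (3000, 3000))    # index newer than the FASTA
    removed_when_fresh = remove_stale_fai(fasta)

print(removed_when_stale, index_gone, removed_when_fresh)
```

Comparing modification times is only a heuristic; deleting the `.fai` unconditionally and letting it be rebuilt is the simpler option when in doubt.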
I was able to fix the byte/string issue in a conda environment with python version 3.10.13.
fixed it by:
ragtag_agp2fa.py line71
sys.stdout.write(fai.fetch(agp_line.comp, agp_line.comp_beg-1, agp_line.comp_end))
change to:
sys.stdout.write(fai.fetch(agp_line.comp, agp_line.comp_beg-1, agp_line.comp_end).decode())
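The difference between the `str()` and `.decode()` edits suggested in this thread can be shown without pysam. In the sketch below, `to_text` is a hypothetical helper, not part of RagTag: it decodes `bytes` and passes `str` through unchanged, so the same line would work whether the fetch call returns bytes or text in a given environment. Calling `.decode()` directly on a `str` would raise `AttributeError`, which is why the helper checks the type first.

```python
def to_text(seq):
    """Return seq as str, decoding it first if it is a bytes object."""
    if isinstance(seq, bytes):
        return seq.decode("ascii")
    return seq


raw = b"GATTACA"

wrapped = str(raw)       # the printed form of the bytes object, quotes and prefix included
decoded = to_text(raw)   # the actual sequence text
unchanged = to_text("GATTACA")

print(wrapped)
print(decoded)
```

With such a helper, the write call in the script would take the form `sys.stdout.write(to_text(...))`, with the original fetch expression as the argument.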