Troubleshooting ruby script for using --min-identity when plotting
Hi there, I'm hoping you don't mind helping me troubleshoot how to use asgart-plot to only plot the identified duplications filtered for a minimum % identity.
I successfully used asgart-extract to add the sequences from the FASTA to the JSON output, but I am running into issues with the Ruby script asgart-align.rb that you attached in issue #4.
The output I am getting is:
Aligning combo-x_test.json
0/59
asgart-align.rb:45:in `block (2 levels) in <main>': undefined method `size' for nil (NoMethodError)
raise "Error: out[0].size != out[1].size" if out[0].size != out[1].size
^^^^^
from asgart-align.rb:26:in `each'
from asgart-align.rb:26:in `block in <main>'
from asgart-align.rb:24:in `each'
from asgart-align.rb:24:in `each_with_index'
from asgart-align.rb:24:in `<main>'
From what I can tell, the issue is that the two arms (left and right) of each duplicon in the JSON output from asgart have different lengths. I am only interested in the highest-identity duplications, and I would expect the two arms of those to be very close in length, if not identical. Do you have any suggestions for what may be causing this, or a workaround? Happy to send the JSON file if it would help.
Thanks, I am really excited that it seems to be working well, and does plot the data correctly if I don't restrict based on identity.
Can you please attach your FASTA and JSON files so that I can take a deeper look?
combo-x_test.json.gz Here is the JSON file, on which asgart-extract was run to insert the sequences. The FASTA it was generated from is bigger than 25 MB even when gzipped, so GitHub won't let me upload it. If you need it (it is only a two-record FASTA of two concatenated X chromosomes), let me know an alternate way to send it to you.
Thanks!
The good news is that I ran your file on my machine, and it works fine.
The bad news is that mafft seems to be returning garbage on yours. Would you mind removing the 2>/dev/null on line 32, so that we can see whether mafft outputs any errors?
Ah, I see! I have MAFFT loaded as a module, and it is version 7.310, in case that is contributing. Here is the output when I remove "2>/dev/null" from line 32:
asgart-align.rb: --> asgart-align.rb
expected a closing delimiter for the %x or backtick string
24 result["families"].each_with_index do |family, i|
26 family.each do |sd|
32 mafft_out = %x(#{MAFFT} --auto #{fasta.path}
33 out, frag = [], ""
42 out << frag unless frag.empty?
67 end
68 end
asgart-align.rb:74: unterminated string meets end of file (SyntaxError)
asgart-align.rb:74: syntax error, unexpected end-of-input, expecting `end' or dummy end
You are missing a closing parenthesis at the end of line 32.
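For anyone debugging a similar failure, here is a minimal sketch of a more defensive way to call the aligner from Ruby. It is not the code in asgart-align.rb; the helper names `run_aligner`, `parse_aligned_fasta` and `check_pair!` are hypothetical. The idea is to capture stderr instead of discarding it, and to check that two aligned sequences actually came back before comparing their sizes, so that an empty mafft output produces a readable message rather than a NoMethodError on nil.

```ruby
require "open3"

# Hypothetical helper: run an external command and return its stdout.
# On a non-zero exit status, raise with the captured stderr so the
# aligner's own error message is not lost.
def run_aligner(*cmd)
  stdout, stderr, status = Open3.capture3(*cmd)
  unless status.success?
    raise "#{cmd.first} exited with status #{status.exitstatus}: #{stderr.strip}"
  end
  stdout
end

# Split aligned FASTA text into an array of sequences (headers dropped,
# wrapped sequence lines joined).
def parse_aligned_fasta(text)
  seqs = []
  text.each_line do |line|
    line = line.strip
    next if line.empty?
    if line.start_with?(">")
      seqs << +""              # start a new, mutable sequence string
    elsif !seqs.empty?
      seqs.last << line
    end
  end
  seqs
end

# Fail with an explicit message if the alignment is not a pair of
# equal-length sequences.
def check_pair!(seqs)
  raise "expected 2 aligned sequences, got #{seqs.size}" unless seqs.size == 2
  if seqs[0].size != seqs[1].size
    raise "aligned lengths differ: #{seqs[0].size} vs #{seqs[1].size}"
  end
  seqs
end

# Possible usage, assuming mafft is on the PATH and `fasta` is a Tempfile:
#   out = check_pair!(parse_aligned_fasta(run_aligner("mafft", "--auto", fasta.path)))
```

Passing the command as separate arguments avoids the shell entirely, so there is no `%x(...)` delimiter or redirect to get wrong.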
Hi @delehef, sorry for the really late reply! Even after fixing the missing parenthesis I was still having issues, but they turned out to be caused by the mafft binaries on the HPC I use.
In case others have similar problems with mafft: the fix was for our IT to install MAFFT as a module. When I load both MAFFT and Ruby as modules with Lmod, it now works beautifully.
Thanks for the great program! I will close this issue as it's now resolved, but I was also wondering if it would be possible to implement a --max-identity flag when plotting. I would love to be able to plot a range rather than only everything above a minimum % identity, e.g. all duplicons with 70-80% identity, but none above 80% or below 70%. Thanks again for the help!
if it would be possible to implement a --max-identity flag when plotting
That's a nice idea, I just added it!
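For readers on a build without that flag, a similar range filter can be sketched directly on the JSON before plotting. This is only an illustration and not how asgart-plot implements it. The top-level "families" key matches what asgart-align.rb reads, but the per-duplication "identity" key and its scale (percent versus fraction) are assumptions to check against your own file, and `filter_by_identity` is a hypothetical helper.

```ruby
require "json"

# Keep only duplications whose identity lies in [min_identity, max_identity]
# (both bounds inclusive), and drop families left empty by the filter.
# ASSUMPTION: each duplication is a Hash with a numeric value under `key`;
# entries without one are dropped.
def filter_by_identity(families, min_identity, max_identity, key: "identity")
  families
    .map do |family|
      family.select do |sd|
        value = sd[key]
        value.is_a?(Numeric) && value.between?(min_identity, max_identity)
      end
    end
    .reject(&:empty?)
end

# Possible usage (file names are placeholders):
#   result = JSON.parse(File.read("input.json"))
#   result["families"] = filter_by_identity(result["families"], 70, 80)
#   File.write("filtered.json", JSON.generate(result))
```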