
jsa.np.rtResistGenes - tmp folder not automatically generated, and _ao.fasta not produced when tmp folder made in advance

Open dn-ra opened this issue 3 years ago • 1 comments

Hi, I'm having an issue running the resistance gene identifier on pre-basecalled reads. If I don't make the tmp directory in advance, I get a FileNotFoundException for tmp/resFind_JSA_4_2_ai.fasta. But when I do make the tmp directory in advance, the tmp/resFind_JSA_4_2_ao.fasta file doesn't seem to be created, and I get another FileNotFoundException when it tries to read in the MSA output.

Command looks like this:

cat $fq | bwa mem -t 8 -k11 -W20 -r10 -A1 -B1 -O1 -E1 -L0 -Y -K 10000 -a $rdb/DB.fasta $fq 2> /dev/null \
  | jsa.np.rtResistGenes -bam - -score=0.0001 -time 300 -read 0 --resDB $rdb -tmp tmp/resFind -o resistance.dat -thread 7 2> resistance.log

logs are here:

without making tmp dir:
[main] INFO japsa.bio.np.RealtimeResistanceGene - geneList = 280
[main] INFO japsa.bio.np.RealtimeResistanceGene - geneMap = 280
[main] INFO japsa.bio.np.RealtimeResistanceGene - gene2Group = 609
[main] INFO japsa.bio.np.RealtimeResistanceGene - gene2GeneName = 609
[main] INFO japsa.bio.np.RealtimeResistanceGene - Resistance identification ready at Mon Mar 01 02:09:00 UTC 2021
[SSS] INFO japsa.bio.np.RealtimeAnalysis - Start analysing data at Mon Mar 01 02:09:00 UTC 2021
[SSS] INFO japsa.bio.np.RealtimeResistanceGene - ===Found 0 vs 280 0 with 0
[SSS] INFO japsa.bio.np.RealtimeAnalysis - RUNTIME Mon Mar 01 02:09:00 UTC 2021 0.001 0 0.085
[SSS] INFO japsa.bio.np.RealtimeAnalysis - Not due time, sleep for 299.885 seconds
[main] INFO japsa.bio.np.RealtimeAnalysis - All reads received at Mon Mar 01 02:12:06 UTC 2021
[main] INFO japsa.bio.np.RealtimeResistanceGene - END : Mon Mar 01 02:12:06 UTC 2021
java.io.FileNotFoundException: tmp/resFind_JSA_4_2_ai.fasta (No such file or directory)
    at java.base/java.io.FileOutputStream.open0(Native Method)
    at java.base/java.io.FileOutputStream.open(FileOutputStream.java:298)
    at java.base/java.io.FileOutputStream.<init>(FileOutputStream.java:237)
    at java.base/java.io.FileOutputStream.<init>(FileOutputStream.java:126)
    at japsa.seq.SequenceOutputStream.makeOutputStream(SequenceOutputStream.java:117)
    at japsa.bio.np.ErrorCorrection.writeAlignmentToFaiFile(ErrorCorrection.java:111)
    at japsa.bio.np.ErrorCorrection.consensusSequence(ErrorCorrection.java:297)
    at japsa.bio.np.ErrorCorrection.consensusSequence(ErrorCorrection.java:103)
    at japsa.bio.np.RealtimeResistanceGene$ResistanceGeneFinder.antiBioticsProfile(RealtimeResistanceGene.java:361)
    at japsa.bio.np.RealtimeResistanceGene$ResistanceGeneFinder.antiBioticAnalysis(RealtimeResistanceGene.java:335)
    at japsa.bio.np.RealtimeResistanceGene$ResistanceGeneFinder.analysis(RealtimeResistanceGene.java:463)
    at japsa.bio.np.RealtimeAnalysis.run(RealtimeAnalysis.java:116)
    at java.base/java.lang.Thread.run(Thread.java:834)
[SSS] INFO japsa.bio.np.RealtimeAnalysis - RUNTIME Mon Mar 01 02:14:00 UTC 2021 300.009 307288 0.007
[SSS] INFO japsa.bio.np.RealtimeAnalysis - Real time analysis done
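
The trace above fails inside FileOutputStream, which does not create missing parent directories, so a -tmp prefix such as tmp/resFind only works if tmp/ already exists. A minimal sketch of how the parent directory could be created before the file is opened is given here. The class and method names (TmpPrefix, openWithParents) are hypothetical and are not part of japsa; this is an illustration of the idea, not the project's code.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of japsa.
class TmpPrefix {
    /**
     * Opens prefix + suffix for writing, creating any missing parent
     * directories first (e.g. "tmp/" for the prefix "tmp/resFind").
     */
    static OutputStream openWithParents(String prefix, String suffix) throws IOException {
        Path target = Paths.get(prefix + suffix);
        Path parent = target.getParent();
        if (parent != null) {
            // Does nothing if the directory already exists.
            Files.createDirectories(parent);
        }
        return Files.newOutputStream(target);
    }
}
```

With something like this in place, a call such as TmpPrefix.openWithParents("tmp/resFind", "_JSA_4_2_ai.fasta") would succeed whether or not tmp/ was made in advance.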

with making tmp dir:
[main] INFO japsa.bio.np.RealtimeResistanceGene - geneList = 280
[main] INFO japsa.bio.np.RealtimeResistanceGene - geneMap = 280
[main] INFO japsa.bio.np.RealtimeResistanceGene - gene2Group = 609
[main] INFO japsa.bio.np.RealtimeResistanceGene - gene2GeneName = 609
[main] INFO japsa.bio.np.RealtimeResistanceGene - Resistance identification ready at Mon Mar 01 01:02:55 UTC 2021
[SSS] INFO japsa.bio.np.RealtimeAnalysis - Start analysing data at Mon Mar 01 01:02:55 UTC 2021
[SSS] INFO japsa.bio.np.RealtimeResistanceGene - ===Found 0 vs 280 0 with 0
[SSS] INFO japsa.bio.np.RealtimeAnalysis - RUNTIME Mon Mar 01 01:02:55 UTC 2021 0.001 0 0.101
[SSS] INFO japsa.bio.np.RealtimeAnalysis - Not due time, sleep for 299.872 seconds
[main] INFO japsa.bio.np.RealtimeAnalysis - All reads received at Mon Mar 01 01:06:02 UTC 2021
[main] INFO japsa.bio.np.RealtimeResistanceGene - END : Mon Mar 01 01:06:02 UTC 2021
[SSS] INFO japsa.bio.np.ErrorCorrection - 8ddcabfa-8fba-4831-8327-2b940bd5f813_r_890_1639 750
[SSS] INFO japsa.bio.np.ErrorCorrection - 36f3366a-35ac-4428-9561-f2ab368976df_803_1619 817
[SSS] INFO japsa.bio.np.ErrorCorrection - Running [kalign, -gpo, 60, -gpe, 10, -tgpe, 0, -bonus, 0, -q, -i, tmp/resFind_JSA_4_2_ai.fasta, -o, tmp/resFind_JSA_4_2_ao.fasta]
[SSS] INFO japsa.bio.np.ErrorCorrection - Done [kalign, -gpo, 60, -gpe, 10, -tgpe, 0, -bonus, 0, -q, -i, tmp/resFind_JSA_4_2_ai.fasta, -o, tmp/resFind_JSA_4_2_ao.fasta]
java.io.FileNotFoundException: tmp/resFind_JSA_4_2_ao.fasta (No such file or directory)
    at java.base/java.io.FileInputStream.open0(Native Method)
    at java.base/java.io.FileInputStream.open(FileInputStream.java:219)
    at java.base/java.io.FileInputStream.<init>(FileInputStream.java:157)
    at java.base/java.io.FileInputStream.<init>(FileInputStream.java:112)
    at japsa.seq.SequenceReader.getReader(SequenceReader.java:299)
    at japsa.bio.np.ErrorCorrection.readMultipleAlignment(ErrorCorrection.java:235)
    at japsa.bio.np.ErrorCorrection.consensusSequence(ErrorCorrection.java:314)
    at japsa.bio.np.ErrorCorrection.consensusSequence(ErrorCorrection.java:103)
    at japsa.bio.np.RealtimeResistanceGene$ResistanceGeneFinder.antiBioticsProfile(RealtimeResistanceGene.java:361)
    at japsa.bio.np.RealtimeResistanceGene$ResistanceGeneFinder.antiBioticAnalysis(RealtimeResistanceGene.java:335)
    at japsa.bio.np.RealtimeResistanceGene$ResistanceGeneFinder.analysis(RealtimeResistanceGene.java:463)
    at japsa.bio.np.RealtimeAnalysis.run(RealtimeAnalysis.java:116)
    at java.base/java.lang.Thread.run(Thread.java:834)
[SSS] INFO japsa.bio.np.RealtimeAnalysis - RUNTIME Mon Mar 01 01:07:55 UTC 2021 300.009 307288 0.436
[SSS] INFO japsa.bio.np.RealtimeAnalysis - Real time analysis done

dn-ra, Mar 01 '21 02:03

Identified the cause of the problem with ao.fasta not being generated: kalign is hardcoded as the default MSA software at line 72 of ErrorCorrection.java.

kalign and kalign3 accept different parameters, so when kalign arguments are passed to a kalign3 system call, the MSA fails and the pipeline falls over. Suggest improved error handling of the MSA process.
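
As a rough illustration of the kind of error handling suggested here, the sketch below runs an external MSA command, checks its exit status, and confirms that the expected output file exists and is non-empty before anything tries to read it. MsaRunner and runMsa are hypothetical names, not japsa's API, and the sketch assumes the caller supplies the full command line. Whether kalign3 returns a non-zero status for unrecognised arguments is not confirmed in this thread, which is why the output file is checked as well.

```java
import java.io.File;
import java.io.IOException;
import java.util.List;

// Hypothetical helper, not part of japsa.
class MsaRunner {
    /**
     * Runs the given MSA command and returns the output file, or throws an
     * IOException that names the command when the run did not produce it.
     */
    static File runMsa(List<String> command, File expectedOutput)
            throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        // Pass the tool's stdout/stderr through so its own messages are visible.
        builder.inheritIO();
        int exit = builder.start().waitFor();

        String shown = String.join(" ", command);
        if (exit != 0) {
            throw new IOException("MSA command exited with status " + exit + ": " + shown);
        }
        // Some tools may exit 0 without writing anything, so check the file too.
        if (!expectedOutput.isFile() || expectedOutput.length() == 0) {
            throw new IOException("MSA command produced no output at "
                    + expectedOutput.getPath() + ": " + shown);
        }
        return expectedOutput;
    }
}
```

Failing at this point would report the offending command directly, instead of surfacing later as a FileNotFoundException when the alignment is read.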

dn-ra, Mar 22 '21 07:03