progressiveCactus

NameError: global name 'refFileName' is not defined

Open dbrowneup opened this issue 7 years ago • 4 comments

Hello, I'm trying to run a multiple genome alignment with the following parameters:

runProgressiveCactus.sh SEQUENCE_FILE_v2.txt ./CACTUS_v2 ./Chlorophyta_v2 \
    --maxThreads 20 --root Chlorophyta

And the following sequence file:

((((((((((((((Physcomitrella_patens)Physcomitrella)Funariaceae)Funariales)Funariidae)Bryopsida)Bryophytina)Bryophyta,((((((((((((((Brassica_rapa)Brassica)Brassiceae,((Arabidopsis_thaliana)Arabidopsis)Camelineae)Brassicaceae)Brassicales)malvids)rosids)Pentapetalae)Gunneridae)eudicotyledons,((((((((((Oryza_sativa)Oryza)Oryzinae)Oryzeae)Oryzoideae)BOP_clade)Poaceae)Poales)commelinids)Petrosaviidae)Liliopsida)Mesangiospermae)Magnoliophyta)Spermatophyta)Euphyllophyta)Tracheophyta)Embryophyta)Streptophytina)Streptophyta,(((((Coccomyxa_subellipsoidea)Coccomyxa)Coccomyxaceae,((Botryococcus_braunii)Botryococcus)Botryococcaceae)Trebouxiophyceae_incertae_sedis)Trebouxiophyceae,(((((Micromonas_pusilla)Micromonas)Mamiellaceae,((Ostreococcus_lucimarinus)Ostreococcus)Bathycoccaceae)Mamiellales)Mamiellophyceae)prasinophytes,((((Volvox_carteri)Volvox)Volvocaceae,((Dunaliella_salina)Dunaliella)Dunaliellaceae,((Chlamydomonas_reinhardtii)Chlamydomonas)Chlamydomonadaceae)Chlamydomonadales)Chlorophyceae)Chlorophyta)Viridiplantae)Eukaryota)cellular_organisms);
Volvox_carteri /scratch/user/dbrowne/2017.09_SEP/2017.09.26_Algal_Genome_Alignment_v2/Genomes/Vcarteri_317_v2.fa
*Chlamydomonas_reinhardtii /scratch/user/dbrowne/2017.09_SEP/2017.09.26_Algal_Genome_Alignment_v2/Genomes/Creinhardtii_281_v5.0.fa
Dunaliella_salina /scratch/user/dbrowne/2017.09_SEP/2017.09.26_Algal_Genome_Alignment_v2/Genomes/Dsalina_325_v1.fa
*Ostreococcus_lucimarinus /scratch/user/dbrowne/2017.09_SEP/2017.09.26_Algal_Genome_Alignment_v2/Genomes/Olucimarinus_231_v2.0.fa
*Micromonas_pusilla /scratch/user/dbrowne/2017.09_SEP/2017.09.26_Algal_Genome_Alignment_v2/Genomes/MpusillaCCMP1545_228_v3.0.fa
*Coccomyxa_subellipsoidea /scratch/user/dbrowne/2017.09_SEP/2017.09.26_Algal_Genome_Alignment_v2/Genomes/CsubellipsoideaC169_227_v2.0.fa
Botryococcus_braunii /scratch/user/dbrowne/2017.09_SEP/2017.09.26_Algal_Genome_Alignment_v2/Genomes/Scaffolds-pass4.broken.0x.fa
Physcomitrella_patens /scratch/user/dbrowne/2017.09_SEP/2017.09.26_Algal_Genome_Alignment_v2/Genomes/Ppatens_318_v3.fa
Brassica_rapa /scratch/user/dbrowne/2017.09_SEP/2017.09.26_Algal_Genome_Alignment_v2/Genomes/BrapaFPsc_277_v1.fa
Arabidopsis_thaliana /scratch/user/dbrowne/2017.09_SEP/2017.09.26_Algal_Genome_Alignment_v2/Genomes/Athaliana_167_TAIR9.fa
*Oryza_sativa /scratch/user/dbrowne/2017.09_SEP/2017.09.26_Algal_Genome_Alignment_v2/Genomes/Osativa_323_v7.0.fa

The program fails almost immediately with the following error:

Traceback (most recent call last):
  File "/software/hprc/Bio/progressiveCactus/submodules/cactus/bin/cactus_createMultiCactusProject.py", line 385, in <module>
    main()
  File "/software/hprc/Bio/progressiveCactus/submodules/cactus/bin/cactus_createMultiCactusProject.py", line 379, in main
    createFileStructure(mcProj, expTemplate, confTemplate, options)
  File "/software/hprc/Bio/progressiveCactus/submodules/cactus/bin/cactus_createMultiCactusProject.py", line 268, in createFileStructure
    ogPath = os.path.join(ogPath, refFileName(og))
NameError: global name 'refFileName' is not defined
Error: Command: cactus_createMultiCactusProject.py "/scratch/user/dbrowne/2017.09_SEP/2017.09.26_Algal_Genome_Alignment_v2/CACTUS_v2/expTemplate.xml" "./CACTUS_v2/progressiveAlignment" --fixNames=0 --outgroupNames Chlamydomonas_reinhardtii,Ostreococcus_lucimarinus,Micromonas_pusilla,Coccomyxa_subellipsoidea,Oryza_sativa --root Chlorophyta exited with non-zero status 1

I've been able to run the program successfully on these genomes when I don't include the Newick tree. Is there a problem with my Newick tree or is this just a bug in the program? Thanks in advance for your help.

dbrowneup avatar Sep 26 '17 17:09 dbrowneup

A little digging around led me to find this:

https://github.com/ComparativeGenomicsToolkit/cactus/commit/bbd44e547b6e4042503e85cfe5457a66a9bc2115

Looks like progressiveCactus needs to pull a more recent commit from cactus

dbrowneup avatar Sep 26 '17 17:09 dbrowneup

Good point. You may be able to work around it by removing internal node names from your tree (so only names with associated fastas appear).
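A minimal sketch of that workaround, assuming the tree has no quoted labels (the function name `strip_internal_names` is hypothetical and not part of progressiveCactus). In Newick, an internal node's label is the text that directly follows a closing parenthesis, so a single regular expression is enough to drop those labels:

```python
import re


def strip_internal_names(newick):
    """Remove every label that directly follows a closing parenthesis.

    Leaf labels follow "(" or "," and are left untouched. Branch lengths
    (":0.1") are kept because ":" ends the label match.
    Assumes unquoted labels.
    """
    return re.sub(r"\)[^(),:;]+", ")", newick)


if __name__ == "__main__":
    # Toy tree, not the one from this issue.
    print(strip_internal_names("((A,B)AB,(C)C_parent)Root;"))
```

This only removes names: a non-branching node such as `(C)` is still present afterwards. A named node passed to `--root` would presumably also no longer be found in the stripped tree.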

I'll try to get the latest Cactus in soon. Unfortunately it's not as easy as just updating the submodule commit, as it looks like the Makefile will need to be changed around to deal with cactus now containing its own submodules.


glennhickey avatar Sep 26 '17 18:09 glennhickey

Hi Dan,

Thanks for the detailed bug report. I'm not sure the changes from that commit will help you much--it would still crash, but in a different way and with a more helpful message.

It looks like this is caused by a bug in the way we handle trees with non-branching internal nodes. It looks like your tree may have been auto-generated (maybe from the NCBI taxonomy?), which has left a bunch of non-branching nodes. Cactus doesn't really work well with these trees with "extra" nodes. Every internal node is treated as an alignment subproblem no matter what, so having a guide tree with non-branching nodes just introduces a bunch of alignment work for no benefit (and possibly a detriment).
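To check whether a guide tree contains such non-branching nodes before running an alignment, something like the following sketch could be used. It is not Cactus code; the function name `unary_internal_nodes` is hypothetical, and it assumes a well-formed Newick string without quoted labels. It counts the children of each parenthesised group and reports the label of every group that has exactly one child:

```python
def unary_internal_nodes(newick):
    """Return the labels of internal nodes that have exactly one child."""
    unary = []
    child_counts = []  # one counter per currently open "(" group
    i = 0
    while i < len(newick):
        ch = newick[i]
        if ch == "(":
            child_counts.append(1)  # a group has at least one child
        elif ch == ",":
            child_counts[-1] += 1   # each comma adds a sibling
        elif ch == ")":
            n_children = child_counts.pop()
            # The label of the group that just closed follows the ")".
            j = i + 1
            while j < len(newick) and newick[j] not in "(),:;":
                j += 1
            if n_children == 1:
                unary.append(newick[i + 1:j].strip() or "<unnamed>")
            i = j
            continue
        i += 1
    return unary


if __name__ == "__main__":
    # A small excerpt in the style of the tree from the original post.
    excerpt = ("(((Volvox_carteri)Volvox)Volvocaceae,"
               "((Dunaliella_salina)Dunaliella)Dunaliellaceae)"
               "Chlamydomonadales;")
    for name in unary_internal_nodes(excerpt):
        print("warning: non-branching internal node:", name)
```

An empty result means every internal node in the tree has at least two children.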

Anyway, it looks like there's a bug with the way we assign outgroups in these kinds of trees, which was never caught since we try to avoid them. I'll try to fix that soon.

Sorry that it's not mentioned anywhere that including these single-degree nodes is usually a bad idea. I don't think we ever thought about auto-generated trees. I'll make future versions output a warning when an input tree looks like that.

Here is the equivalent (hopefully) tree without the non-branching nodes, keeping the existing polytomies at Chlorophyta and Chlamydomonadales:

((Physcomitrella_patens,((Brassica_rapa,Arabidopsis_thaliana)Brassicaceae,Oryza_sativa)Mesangiospermae)Embryophyta,((Coccomyxa_subellipsoidea,Botryococcus_braunii)Trebouxiophyceae_incertae_sedis,(Micromonas_pusilla,Ostreococcus_lucimarinus)Mamiellales,(Volvox_carteri,Dunaliella_salina,Chlamydomonas_reinhardtii)Chlamydomonadales)Chlorophyta)Viridiplantae;

Using that (or a similar tree without the "extra" nodes) should work.
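For other auto-generated trees, the same clean-up can be sketched in a few lines of Python. This is an illustration under stated assumptions, not part of Cactus: the helper names (`parse`, `collapse`, `to_newick`, `collapse_unary`) are hypothetical, and the input is assumed to be well-formed Newick with no branch lengths and no quoted labels, like the tree in the original post. Each node with a single child is replaced by that child, so the name of the lowest node in a chain is the one that survives, which appears to be the convention used in the tree above:

```python
def parse(newick):
    """Parse a Newick string into nested (name, children) tuples."""
    pos = 0

    def node():
        nonlocal pos
        children = []
        if newick[pos] == "(":
            pos += 1  # consume "("
            children.append(node())
            while newick[pos] == ",":
                pos += 1  # consume ","
                children.append(node())
            pos += 1  # consume ")"
        start = pos
        while pos < len(newick) and newick[pos] not in "(),;":
            pos += 1
        return newick[start:pos].strip(), children

    return node()


def collapse(tree):
    """Replace every single-child node by its (collapsed) child."""
    name, children = tree
    if len(children) == 1:
        return collapse(children[0])
    return name, [collapse(child) for child in children]


def to_newick(tree):
    """Serialise a (name, children) tuple back to Newick, without ';'."""
    name, children = tree
    if not children:
        return name
    return "(" + ",".join(to_newick(c) for c in children) + ")" + name


def collapse_unary(newick):
    return to_newick(collapse(parse(newick.strip()))) + ";"


if __name__ == "__main__":
    # Toy tree with chains of non-branching nodes.
    print(collapse_unary("(((A)GA)FA,((B)GB,(C)GC)FB)Root;"))
```

If the tree carried branch lengths, the lengths along a collapsed chain would need to be summed, which this sketch does not attempt.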

PS: Glenn, don't bother updating cactus to the latest master branch. Recently I merged the toil branch, which includes the "progressive" parts of progressiveCactus, though it's not 100% ready for release yet. I think it would be a lot of effort to try to make it work within the existing progressiveCactus structure. I've just updated the cactus submodule to point to the last progressiveCactus-compatible commit of cactus (bdc04d09a98c8f9874f44f0730a99bf3a74356a8), though, in case there were some important bugfixes.

joelarmstrong avatar Sep 26 '17 22:09 joelarmstrong

Hi guys, thanks for your feedback! The tree was indeed autogenerated using phyloT, as I just wanted a quick and dirty tree for the progressive alignment. I didn't realize the internal nodes would mess it up. Thanks, Joel, for the tree without the single nodes. I'm running an alignment right now using that tree. Previously I ran progressiveCactus without a phylogenetic tree and without any outgroup genomes, but that didn't seem to work too well with Ragout, so hopefully this time I will get some better results.

dbrowneup avatar Sep 28 '17 19:09 dbrowneup