Error in topiary-seed-to-alignment
I attempted to create an alignment from a seed of six sequences from four species (this is my input csv file):
species,name,aliases,sequence,accession
Homo sapiens,TTHY_HUMAN,hTTR,GPTGTGESKCPLMVKVLDAVRGSPAINVAVHVFRKAADDTWEPFASGKTSESGELHGLTTEEEFVEGIYKVEIDTKSYWKALGISPFHEHAEVVFTANDSGPRRYTIAALLSPYSYSTTAVVTNPKE,P02766
Saccoglossus kowalevskii,D1LXG7,Acorn worm HIUase,MSGYRIDILTNHLRASQAHSNLIEAVNMAGQQSPLTTHVLDTALGRPAAELPITLYSRSPEMAWLKIAAGKTNQDGRCPGLLTQETFHNGVYKIHFDTGTYHKALDTPGFYPYVEVVFEIHDPNQHYHVPLLLSPFSYSTYRGS,D1LXG7
Danio rerio,HIUH_DANRE,Danio Rerio HIUase,MNRLQHIRGHIVSADKHINMSATLLSPLSTHVLNIAQGVPGANMTIVLHRLDPVSSAWNILTTGITNDDGRCPGLITKENFIAGVYKMRFETGKYWDALGETCFYPYVEIVFTITNTSQHYHVPLLLSRFSYSTYRGS,Q06S87
Mus musculus,HIUH_MOUSE,Mouse HIUase,MATESSPLTTHVLDTASGLPAQGLCLRLSRLEAPCQQWMELRTSYTNLDGRCPGLLTPSQIKPGTYKLFFDTERYWKERGQESFYPYVEVVFTITKETQKFHVPLLLSPWSYTTYRGS,Q9CRB3
Mus musculus,TTHY_MOUSE,Mouse Transthyretin,GPAGAGESKCPLMVKVLDAVRGSPAVDVAVKVFKKTSEGSWEPFASGKTAESGELHGLTTDEKFVEGVYRVELDTKSYWKTLGISPFHEFADVVFTANDSGHRHYTIAALLSPYSYSTTAVVSNPQN,P07309
It seems it worked until the reciprocal blast, then I got the following error (it did create a blast results xml file and an initial dataframe file with 3414 lines):
==========
Building initial topiary dataframe.
BLASTing against NCBI database nr
Performing 5 BLAST queries against the NCBI nr database on 1 threads. Depending on the server load, this could take awhile. This is a good time to grab a cup of coffee.
BLAST query complete.
Could not parse line MET VARIANT TO 1.7 ANGSTROMS RESOLUTION [Homo sapiens]. Skipping.
Could not parse line MET VARIANT TO 1.7 ANGSTROMS RESOLUTION [Homo sapiens]. Skipping.
Could not parse line MET VARIANT TO 1.7 ANGSTROMS RESOLUTION [Homo sapiens]. Skipping.
Could not parse line MET VARIANT TO 1.7 ANGSTROMS RESOLUTION [Homo sapiens]. Skipping.
Could not parse line MET VARIANT TO 1.7 ANGSTROMS RESOLUTION [Homo sapiens]. Skipping.
Could not parse line MET VARIANT TO 1.7 ANGSTROMS RESOLUTION [Homo sapiens]. Skipping.
Could not parse line MET VARIANT TO 1.7 ANGSTROMS RESOLUTION [Homo sapiens]. Skipping.
Could not parse line MET VARIANT TO 1.7 ANGSTROMS RESOLUTION [Homo sapiens]. Skipping.
Could not parse line MET VARIANT TO 1.7 ANGSTROMS RESOLUTION [Homo sapiens]. Skipping.
Could not parse line MET VARIANT TO 1.7 ANGSTROMS RESOLUTION [Homo sapiens]. Skipping.
Downloading 69 blocks of ~50 sequences...
100%|███████████████████████████████████████████| 69/69 [00:48<00:00, 1.42it/s]
Getting OTT species ids for all species.
Unknown/unrecognized query ids (skipped): ott4992270 ott615879 ott7659998 ott773491 ott838061 ott898631
Doing reciprocal blast.
Downloading Danio rerio proteome
Downloading proteome for taxid '7955'
Process Process-11:
Traceback (most recent call last):
File "/home/lucas/miniconda3/envs/topiary/lib/python3.11/multiprocessing/process.py", line 314, in _bootstrap
self.run()
File "/home/lucas/miniconda3/envs/topiary/lib/python3.11/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/home/lucas/miniconda3/envs/topiary/lib/python3.11/site-packages/topiary/_private/ftp.py", line 36, in _ftp_thread
ftp.retrbinary(cmd="RETR " + file_name,
File "/home/lucas/miniconda3/envs/topiary/lib/python3.11/ftplib.py", line 445, in retrbinary
return self.voidresp()
^^^^^^^^^^^^^^^
File "/home/lucas/miniconda3/envs/topiary/lib/python3.11/ftplib.py", line 259, in voidresp
resp = self.getresp()
^^^^^^^^^^^^^^
File "/home/lucas/miniconda3/envs/topiary/lib/python3.11/ftplib.py", line 244, in getresp
resp = self.getmultiline()
^^^^^^^^^^^^^^^^^^^
File "/home/lucas/miniconda3/envs/topiary/lib/python3.11/ftplib.py", line 230, in getmultiline
line = self.getline()
^^^^^^^^^^^^^^
File "/home/lucas/miniconda3/envs/topiary/lib/python3.11/ftplib.py", line 218, in getline
raise EOFError
EOFError
Traceback (most recent call last):
File "/home/lucas/miniconda3/envs/topiary/lib/python3.11/site-packages/topiary/_private/interface.py", line 32, in wrapper
value = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/lucas/miniconda3/envs/topiary/lib/python3.11/site-packages/topiary/pipeline/seed_to_alignment.py", line 406, in seed_to_alignment
proteome_list.append(topiary.ncbi.get_proteome(taxid=this_taxid))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lucas/miniconda3/envs/topiary/lib/python3.11/site-packages/topiary/ncbi/entrez/proteome.py", line 217, in get_proteome
ncbi_ftp_download(genome_url,file_base="_protein.faa.gz")
File "/home/lucas/miniconda3/envs/topiary/lib/python3.11/site-packages/topiary/ncbi/entrez/download.py", line 80, in ncbi_ftp_download
md5_dict = _read_md5_file(md5_file)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lucas/miniconda3/envs/topiary/lib/python3.11/site-packages/topiary/ncbi/entrez/download.py", line 33, in _read_md5_file
file = col[1][2:].strip()
~~~^^^
IndexError: list index out of range
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/lucas/miniconda3/envs/topiary/lib/python3.11/site-packages/topiary/_private/wrap.py", line 185, in wrap_function
ret = fcn(**fcn_args.__dict__)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lucas/miniconda3/envs/topiary/lib/python3.11/site-packages/topiary/_private/interface.py", line 38, in wrapper
raise WrappedFunctionException(err) from e
topiary._private.interface.WrappedFunctionException:
Caught exception in function 'seed_to_alignment'. Returning to starting directory and cleaning up. Check error stack for cause of this error.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/lucas/miniconda3/envs/topiary/bin/topiary-seed-to-alignment", line 26, in
Function seed_to_alignment raised an error.
To see command line help, run topiary-seed-to-alignment --help
Thanks for the bug report! I've never seen this one before. It looks to me like it is choking when downloading and reading the checksum file to validate the downloaded proteome. Is there a file called md5checksums.txt in the working directory? If so, could you paste its contents here?
Thanks for your help; hopefully we can resolve this quickly.
There is, but it is too long to paste here. Let me know if you need the entire file and I'll post it somewhere; here are its first and last lines:
d0e8e6b5c981ff948c657166270a7c88 ./Annotation_comparison/GCF_000002035.6_GRCz11_compare_prev.gbp.gz
9c8cd6fefb81746909c5438c5d18b758 ./Annotation_comparison/GCF_000002035.6_GRCz11_compare_prev.txt.gz
ae405e37cdd4ebbd7d2032baf3e522fd ./annotation_hashes.txt
c249b22d4cf0941cf13f6d626140686c ./GCF_000002035.6_GRCz11_assembly_regions.txt
ce31297f9cb1eccf885afab7fac363ad ./GCF_000002035.6_GRCz11_assembly_report.txt
83c34be20a52645e7ec5e442e33d1ebf ./GCF_000002035.6_GRCz11_assembly_stats.txt
f1de7661b5de92ddf4f2de72a7f2695f ./GCF_000002035.6_GRCz11_assembly_structure/all_alt_scaffold_placement.txt
9585b8ac806c110688debcea17387efa ./GCF_000002035.6_GRCz11_assembly_structure/ALT_DRER_TU_1/alt_scaffolds/AGP/alt.scaf.agp.gz
20b5e35f4c1033ce0426af1c190fe43b ./GCF_000002035.6_GRCz11_assembly_structure/ALT_DRER_TU_1/alt_scaffolds/alignments/NW_018394460.1_NC_007112.7.asn
68ae916c656500b25ee32299657b080b ./GCF_000002035.6_GRCz11_assembly_structure/ALT_DRER_TU_1/alt_scaffolds/alignments/NW_018394460.1_NC_007112.7.gff
(...)
020d65335491f47fa29beadf092e8695 ./GCF_000002035.6_GRCz11_assembly_structure/ALT_DRER_TU_1/alt_scaffolds/alignments/NW_018395039.1_NC_007129.7.asn
711757eada6306b792ed3465a27cdd85 ./GCF_000002035.6_GRCz11_assembly_structure/ALT_DRER_TU_1/alt_scaffolds/alignments/NW_018395039.1_NC_007129.7.gff
527ec55abb1a90ae0cdaac1426704c7d ./GCF_000002035.6_GRCz11_assembly_structure/ALT_DRER_TU_1/alt_scaffolds/alignments/NW_018395040.1_NC_007129.7.asn
d52fb0e9b5f4ee8c2bdb967e539b857e ./GCF_000002035.6_GRCz11_assembly_structure/ALT_DRER_TU_1/alt_scaffolds/alignments/NW_018395040.1_NC_007129.7.gff
e2718bcb708553147de9096927dccc23 ./GCF_000002035.6_GRCz11_assembly_structure/ALT_DRER_TU_1/alt_scaffolds/alignments/NW_018395041.1_NC_007129.7.asn
f9159f5bc4f13df9e81370261d7954f8 ./GCF_000002035.6_GRCz11_assembly_structure/ALT_DRER_TU_1/alt_scaffolds/alignments/NW_018395041.1_NC_007129.7.gff
56d6c327d50909c7370f85a110678949 ./GCF_000002035.6_GRCz11_assembly_structure/ALT_DRER_TU_1/alt_scaffolds/alignments/NW_018395042.1_NC_007129.7.asn
1cbb61a0bc29d549bc47fec6f000c5a4 ./GCF_000002035.6_GRCz11_assembly_structure/ALT_DRER_TU_1/alt_scaffolds/alignments/NW_018395042.1_NC_007129.7.gff
43fe313f031e77461d6ace72c697f5b5 ./GCF_000002035.6_GRCz11_assembly_structure/ALT_DRER_TU_1/alt_scaffolds/alignments/NW_018395043.1_NC_007129.7.asn
1f179ec473e6f382e07cd5a4ed0f37d3 ./GCF_000002035.6_GRCz11_assembly_structure/ALT_DRER_TU_1/alt_scaffolds/alignments/NW_018395043.1_NC_007129.7.gff
99190d7cb0b3e1880e24f6dc51023e31 ./GCF_000002035.6_GRCz11_assembly_structure/ALT_DRER_TU_1/alt_scaffolds/alignments/NW_018395044.1_NC_007129.7.asn
482703be37901d26
I think we're getting somewhere. topiary assumes an md5 file has rows that have the format "hash file". It looks like this file is truncated (the last line looks like an incomplete hash). I suspect the md5 download terminated early for some reason.
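To make the failure mode concrete, here is a minimal sketch of a stricter parser for this kind of file. It is not topiary's actual `_read_md5_file`; the function name `read_md5_text` and the regular expression are assumptions made for illustration. It accepts only rows that are a full 32-character hex digest followed by a path, so a truncated final row raises a clear error instead of an `IndexError`.

```python
import re

# A complete row is a 32-character hex digest, whitespace, then a path.
_MD5_ROW = re.compile(r"^([0-9a-fA-F]{32})\s+(\S.*)$")


def read_md5_text(text):
    """Parse md5checksums-style text into {file_name: digest}.

    Hypothetical helper: raises ValueError on any non-blank row that is
    not a complete "hash file" pair, which is what a truncated download
    leaves behind.
    """
    md5_dict = {}
    for line_number, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue
        match = _MD5_ROW.match(line)
        if match is None:
            raise ValueError(f"malformed md5 row {line_number}: {line!r}")
        digest, path = match.groups()
        # Drop the leading "./" so keys are bare relative paths.
        if path.startswith("./"):
            path = path[2:]
        md5_dict[path] = digest.lower()
    return md5_dict
```

A download can also be cut off exactly at a row boundary, in which case every row parses but entries are missing, so a caller would still want to confirm that the file it needs is present in the returned dictionary.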
If this is true, you should be able to re-run and successfully complete the job.
I can patch topiary to prevent this in the future by adding a check to make sure the md5 file downloads successfully, rather than cryptically crashing.
Maybe try re-running the job?
Thanks!
A rerun produced a very similar output - I got a 01_initial-dataframe.csv with the same filesize, the blast result XML is almost the same size (a difference of three lines), and md5checksums.txt is again truncated:
3ce0f863975dd40c4ea48c96478d30ed ./GCF_000002035.6_GRCz11_assembly_structure/ALT_DRER_TU_1/alt_scaffolds/alignments/NW_018394748.1_NC_007120.7.gff
79e2f3921879aaeb9e1514403d22dec8 ./GCF_000002035.6_GRCz11_assembly_structure/ALT_DRER_TU_1/alt_scaffolds/alignments/NW_018394749.1_NC_007120.7.asn
9b2de24652cf7f00d6985fe118f743a4 ./GCF_000002035.6_GRCz11_assembly_structure/ALT_DRER_TU_1/alt_scaffolds/alignments/NW_018394749.1_NC_007120.7.gff
5bccb98a2853b1e9580ccfc54da20b71 ./GCF_000002035.6_GRCz11_assembly_structure/ALT_DRER_TU_1/alt_scaffolds/alignments/NW_018394750.1_NC_007120.7.asn
c81f8f223679341488c081011ba742c3 ./GCF_000002035.6_GRCz11_assembly_structure/ALT_DRER_TU_1/alt_scaffolds/alignments/NW_018394750.1_NC_007120.7.gff
680fd33bd2b0
That's strange. I just created a bug fix that downloads the md5sum file, checks if it is sane, then attempts to download it again if it fails. Would you be up for seeing if it fixes your problem? To download the change, you can follow the instructions below:
conda activate topiary
cd the_topiary_directory_wherever_you_downloaded_it
git checkout -b harmsm-main main
git pull [email protected]:harmsm/topiary.git main
python setup.py install
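For readers following along, the download, sanity-check, and retry behaviour described in this comment could be sketched roughly as below. This is not the code in the actual fix; `fetch_with_validation`, its parameters, and the choice of exceptions are hypothetical, and the real download and check are passed in as callables so the sketch stays self-contained.

```python
import time


def fetch_with_validation(fetch, validate, max_attempts=3, wait_seconds=0.0):
    """Call fetch() until validate(result) succeeds or attempts run out.

    fetch: zero-argument callable that returns the downloaded content.
    validate: callable that raises ValueError if the content is not sane.
    Both are placeholders for the real download and check.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            content = fetch()
            validate(content)
            return content
        except (EOFError, ValueError) as error:
            # EOFError is what ftplib raised in the traceback earlier in
            # this thread; ValueError is this sketch's signal for a
            # failed sanity check.
            last_error = error
            if attempt < max_attempts:
                time.sleep(wait_seconds)
    raise RuntimeError(
        f"download failed validation after {max_attempts} attempts"
    ) from last_error
```

Raising with `from last_error` keeps the final failure attached to the error message, so a repeated truncation is reported as such rather than as an unrelated crash later on.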
Best,
Mike
Hi, is the information correct? When I tried
git pull @.***:harmsm/topiary.git main
I get the message:
@.***: Permission denied (publickey). fatal: Could not read from remote repository.
Please make sure you have the correct access rights and the repository exists.
I am having the same error, actually. Below is my output.
Polishing alignment and re-aligning.
muscle 5.1.linux64 [] 396Gb RAM, 40 cores
Built Feb 24 2022 03:16:15
(C) Copyright 2004-2021 Robert C. Edgar.
https://drive5.com
Input: 2 seqs, length avg 392 max 408
00:00 17Mb 50.0% Derep 0 uniques, 0 dupes
00:00 17Mb 100.0% Derep 1 uniques, 0 dupes
00:00 18Mb 50.0% UCLUST 2 seqs EE<0.01, 0 centroids, 0 members
00:00 18Mb 100.0% UCLUST 2 seqs EE<0.01, 1 centroids, 0 members
00:00 18Mb CPU has 40 cores, defaulting to 20 threads
00:00 18Mb 50.0% UCLUST 2 seqs EE<0.30, 0 centroids, 0 members
00:00 18Mb 100.0% UCLUST 2 seqs EE<0.30, 1 centroids, 0 members
00:00 58Mb 100.0% Make cluster MFAs
1 clusters pass 1
1 clusters pass 2
00:00 58Mb
00:00 58Mb Align cluster 1 / 1 (2 seqs)
00:00 58Mb
00:00 58Mb 100.0% Calc posteriors
00:00 58Mb 100.0% UPGMA5
00:00 59Mb 100.0% Consensus sequences
Success. Alignment written to the alignment column in the dataframe.
Traceback (most recent call last):
File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/_private/interface.py", line 32, in wrapper
value = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/pipeline/seed_to_alignment.py", line 502, in seed_to_alignment
df = topiary.quality.polish_alignment(**kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/quality/polish.py", line 136, in polish_alignment
top_fx_sparse = _get_cutoff(df.fx_in_sparse,pct=fx_sparse_percentile)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/quality/polish.py", line 43, in _get_cutoff
return x[idx]
~^^^^^
IndexError: index 2 is out of bounds for axis 0 with size 2
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/_private/wrap.py", line 185, in wrap_function
ret = fcn(**fcn_args.__dict__)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/_private/interface.py", line 38, in wrapper
raise WrappedFunctionException(err) from e
topiary._private.interface.WrappedFunctionException:
Caught exception in function 'seed_to_alignment'. Returning to starting directory and cleaning up. Check error stack for cause of this error.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/bin/topiary-seed-to-alignment", line 26, in
Function seed_to_alignment raised an error.
To see command line help, run topiary-seed-to-alignment --help
and the last few lines of my md5checksums.txt are:
75f783e620888f6a20c9e7030bf54de2 ./Gnomon_models/GCF_000001405.40_GRCh38.p14_gnomon_model.gff.gz
962674e06f93bd8656cbd860c395f5ad ./Gnomon_models/GCF_000001405.40_GRCh38.p14_gnomon_protein.faa.gz
f7262c7cc28373fd2aa0a225ef27a50e ./Gnomon_models/GCF_000001405.40_GRCh38.p14_gnomon_rna.fna.gz
3b7f12ebd3d129698e86fb8701bb9688 ./README_patch_release.txt
84b55637f312368687af5e2b545fcb8d ./RefSeq_transcripts_alignments/GCF_000001405.40_GRCh38.p14_knownrefseq_alns.bam
e9e81c03bce9f45f7a4edfe44f0a8f8f ./RefSeq_transcripts_alignments/GCF_000001405.40_GRCh38.p14_knownrefseq_alns.bam.bai
d9ff57b0fdb663665f2d0f9305831b30 ./RefSeq_transcripts_alignments/GCF_000001405.40_GRCh38.p14_modelrefseq_alns.bam
96404b41c1c023019a0ba6514d98c498 ./RefSeq_transcripts_alignments/GCF_000001405.40_GRCh38.p14_modelrefseq_alns.bam.bai
Thanks for the report. I just merged the PR I referenced above. I still have not been able to reproduce the error on my end. Can one of you try the command again with the new version? To install the latest version, you could run the following:
cd topiary
git pull origin main
conda activate topiary
python -m pip install . -vv
Thanks! (And thanks for your patience with the delayed response to this thread).
I followed the instructions above, and still ran into the same error. The terminal output is attached: Terminal SavedOutpiut.txt
@jjvanantwerp Thanks for the bug report and sorry for the slow reply. Dangerous having the prof in charge of package maintenance...
I looked through your log file; it appears you're having a different bug. It's crashing when polishing the final alignment. If possible could you please post the last csv file that topiary writes out before the crash occurs? Based on when the crash occurs, I believe this should be 04_aligned-dataframe.csv.
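The traceback for this second bug ends in `return x[idx]` inside `_get_cutoff`, with index 2 requested from an array of size 2. The body of that function is not shown in the thread, so the following is only a guess at the general shape of the problem: a percentile index computed from the array length can land one past the end when the input is very small. The function name `get_cutoff`, the default `pct`, and the rounding rule below are all assumptions for illustration.

```python
import math


def get_cutoff(values, pct=0.975):
    """Return the value at the given fractional percentile of values.

    Hypothetical sketch: the index is clamped to the last element, so a
    tiny input (such as a 2-sequence alignment) cannot index past the
    end of the sorted list.
    """
    ordered = sorted(values)
    if not ordered:
        raise ValueError("cannot take a cutoff of an empty sequence")
    idx = math.ceil(pct * len(ordered))
    # Clamp into the valid range [0, len - 1].
    idx = min(max(idx, 0), len(ordered) - 1)
    return ordered[idx]
```

With two values and a percentile near 1, the unclamped index would be 2, which matches the `IndexError` in the log; clamping returns the largest value instead.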
Thanks.
Yes, here it is. 04_aligned-dataframe.csv
Hey, Mike, sorry about the delay, I just had one of those crazy weeks. Here's the error I'm getting:
Downloading Danio rerio proteome
Downloading proteome for taxid '7955'
Process Process-11:
Traceback (most recent call last):
File "/home/lucas/miniconda3/envs/topiary/lib/python3.11/multiprocessing/process.py", line 314, in _bootstrap
self.run()
File "/home/lucas/miniconda3/envs/topiary/lib/python3.11/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/home/lucas/miniconda3/envs/topiary/lib/python3.11/site-packages/topiary/_private/ftp.py", line 36, in _ftp_thread
ftp.retrbinary(cmd="RETR " + file_name,
File "/home/lucas/miniconda3/envs/topiary/lib/python3.11/ftplib.py", line 445, in retrbinary
return self.voidresp()
^^^^^^^^^^^^^^^
File "/home/lucas/miniconda3/envs/topiary/lib/python3.11/ftplib.py", line 259, in voidresp
resp = self.getresp()
^^^^^^^^^^^^^^
File "/home/lucas/miniconda3/envs/topiary/lib/python3.11/ftplib.py", line 244, in getresp
resp = self.getmultiline()
^^^^^^^^^^^^^^^^^^^
File "/home/lucas/miniconda3/envs/topiary/lib/python3.11/ftplib.py", line 230, in getmultiline
line = self.getline()
^^^^^^^^^^^^^^
File "/home/lucas/miniconda3/envs/topiary/lib/python3.11/ftplib.py", line 218, in getline
raise EOFError
EOFError
Traceback (most recent call last):
File "/home/lucas/miniconda3/envs/topiary/lib/python3.11/site-packages/topiary/ncbi/entrez/download.py", line 92, in ncbi_ftp_download
md5_dict[file_name]
~~~~~~~~^^^^^^^^^^^
KeyError: 'GCF_000002035.6_GRCz11_protein.faa.gz'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/lucas/miniconda3/envs/topiary/lib/python3.11/site-packages/topiary/ncbi/entrez/proteome.py", line 217, in get_proteome
ncbi_ftp_download(genome_url,file_base="_protein.faa.gz")
File "/home/lucas/miniconda3/envs/topiary/lib/python3.11/site-packages/topiary/ncbi/entrez/download.py", line 96, in ncbi_ftp_download
raise FileNotFoundError(err)
FileNotFoundError: The file 'GCF_000002035.6_GRCz11_protein.faa.gz' is not present on the NCBI. Full path: ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/002/035/GCF_000002035.6_GRCz11//genomes/all/GCF/000/002/035/GCF_000002035.6_GRCz11/GCF_000002035.6_GRCz11_protein.faa.gz
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/lucas/miniconda3/envs/topiary/lib/python3.11/site-packages/topiary/_private/interface.py", line 32, in wrapper
value = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/lucas/miniconda3/envs/topiary/lib/python3.11/site-packages/topiary/pipeline/seed_to_alignment.py", line 406, in seed_to_alignment
proteome_list.append(topiary.ncbi.get_proteome(taxid=this_taxid))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lucas/miniconda3/envs/topiary/lib/python3.11/site-packages/topiary/ncbi/entrez/proteome.py", line 241, in get_proteome
raise RuntimeError(err)
RuntimeError: Could not download proteome GCF_000002035.6_GRCz11_protein.faa.gz. This can happen if an assembly is in the NCBI database but does not have an associated _protein.tar.gz file. If you are running this as part the seed_to_alignment pipeline, you have a couple of options.
1) You can replace the problematic species (taxid = 7955) in your seed dataset and start the pipeline again.
2) You can edit the 01_initial-dataframe.csv file, adding or editing the column 'recip_blast'. Set this to 'FALSE' for every row except the rows with key_species = 'TRUE'. Set this to 'FALSE' for the problematic species. You can then restart the pipeline with the --restart flag. Topiary will not use this species for reciprocal BLAST, but will still treat it as a key species in other respects.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/lucas/miniconda3/envs/topiary/lib/python3.11/site-packages/topiary/_private/wrap.py", line 185, in wrap_function
ret = fcn(**fcn_args.__dict__)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lucas/miniconda3/envs/topiary/lib/python3.11/site-packages/topiary/_private/interface.py", line 38, in wrapper
raise WrappedFunctionException(err) from e
topiary._private.interface.WrappedFunctionException:
Caught exception in function 'seed_to_alignment'. Returning to starting directory and cleaning up. Check error stack for cause of this error.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/lucas/miniconda3/envs/topiary/bin/topiary-seed-to-alignment", line 26, in
Function seed_to_alignment raised an error.
To see command line help, run topiary-seed-to-alignment --help
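Option 2 in the `RuntimeError` message above amounts to a small edit of `01_initial-dataframe.csv`. Below is a minimal standard-library sketch of that edit. The column names `recip_blast` and `key_species` come from the error message; the `species` column, the case-insensitive handling of TRUE/True, and the helper name `set_recip_blast` are assumptions, so check them against the actual file before relying on this.

```python
import csv
import io


def set_recip_blast(csv_text, skip_species):
    """Return csv_text with a recip_blast column filled in.

    Hypothetical helper: recip_blast is TRUE only for key-species rows
    whose species is not in skip_species; every other row gets FALSE.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = list(reader.fieldnames)
    if "recip_blast" not in fieldnames:
        fieldnames.append("recip_blast")

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # Accept "TRUE", "True", or "true" as a key-species marker.
        is_key = row["key_species"].strip().upper() == "TRUE"
        use = is_key and row["species"] not in skip_species
        row["recip_blast"] = "TRUE" if use else "FALSE"
        writer.writerow(row)
    return out.getvalue()
```

In practice one would read the CSV from disk, call something like `set_recip_blast(text, {"Danio rerio"})`, write the result back, and then restart the pipeline with the `--restart` flag as the message describes.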
@jjvanantwerp : Thanks for the file! I am able to reproduce the error and am working on this now.
@lbleicher : thanks for the detailed error message. I'll look into it.
@jjvanantwerp Should be fixed now. I just merged a PR with the change. You should be able to run the following to install the latest and greatest version. Thanks for helping troubleshoot!
cd topiary
git pull origin main
conda activate topiary
python -m pip install . -vv
Yes, I was able to progress past the alignment! I think this issue can be closed. Unfortunately, I will need to open another for what appears to be the same error in the next step. I am not sure whether this is the best place to discuss that or whether I should open a new issue; it's in that same place in the wrap function, line 189.
Glad we made progress! The wrap function will always show up in the error stack when something fails; it's a way to capture internal errors and make sure the crashing function returns to the right directory, cleans up, etc. Maybe paste the whole error?
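As an illustration of why the wrapper frames appear in every traceback, here is a minimal sketch of the pattern. It is not topiary's implementation: only the exception name `WrappedFunctionException` and the `raise ... from e` line are taken from the tracebacks in this thread, while the decorator `wrapped` and the helper `root_cause` are hypothetical.

```python
import functools
import os


class WrappedFunctionException(Exception):
    """Stand-in for the exception class named in the tracebacks."""


def wrapped(func):
    """Run func, always returning to the starting directory afterwards."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start_dir = os.getcwd()
        try:
            return func(*args, **kwargs)
        except Exception as e:
            err = f"Caught exception in function '{func.__name__}'."
            # Chain the original error so it stays visible in the stack.
            raise WrappedFunctionException(err) from e
        finally:
            # Runs on success and on failure alike.
            os.chdir(start_dir)
    return wrapper


def root_cause(error):
    """Follow the __cause__ chain down to the original exception."""
    while error.__cause__ is not None:
        error = error.__cause__
    return error
```

Because each wrapped layer re-raises with `from e`, the informative error is the first one printed in the chain, above every "The above exception was the direct cause of the following exception" line.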
Thanks!
Mike
Terminal Saved Output Mar 15.txt
I have attached the whole terminal session, but below is the relevant part. It says the issue is that my alignment is too small, and I'm not sure if there's a way to address this here or upstream.
(topiary_resolved) [vanant25@dev-intel16 topiary]$ topiary-alignment-to-ancestors ER_Final_Alignment.csv --out_dir ER_ASR --num_threads 1
Non-microbial dataset detected. Gene/species tree reconciliation will be performed
Checking raxml-ng
installed: Y
binary_path: /mnt/home/vanant25/anaconda3/envs/topiary_resolved/bin/raxml-ng
binary runs: Y
version: 1.1
minimum version: 1.1
passes: Y
Checking generax
installed: Y
binary_path: /mnt/home/vanant25/anaconda3/envs/topiary_resolved/bin/generax
binary runs: Y
version: 2.0.4
minimum version: 2.0
passes: Y
Checking mpirun
installed: Y
binary_path: /mnt/home/vanant25/anaconda3/envs/topiary_resolved/bin/mpirun
binary runs: Y
version: 4.1.5
minimum version: 0.0
passes: Y
topiary is starting a find_best_model calculation in ./00_find-model:
Generating maximum parsimony tree.
Launching raxml-ng, 0:00:00.007415 (H:M:S)
topiary ran a find_best_model calculation in ./00_find-model:
- Crashed after 0:00:00.021205 (H:M:S)
- Please check ./00_find-model/working
Traceback (most recent call last):
File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/_private/interface.py", line 32, in wrapper
value = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/_private/interface.py", line 336, in launch
raise RuntimeError(err)
RuntimeError: ERROR: /mnt/home/vanant25/anaconda3/envs/topiary_resolved/bin/raxml-ng returned 1
/mnt/home/vanant25/anaconda3/envs/topiary_resolved/bin/raxml-ng output
RAxML-NG v. 1.1 released on 29.11.2021 by The Exelixis Lab.
Developed by: Alexey M. Kozlov and Alexandros Stamatakis.
Contributors: Diego Darriba, Tomas Flouri, Benoit Morel, Sarah Lutteropp, Ben Bettisworth.
Latest version: https://github.com/amkozlov/raxml-ng
Questions/problems/suggestions? Please visit: https://groups.google.com/forum/#!forum/raxml
System: Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz, 28 cores, 125 GB RAM
RAxML-NG was called at 15-Mar-2023 00:48:29 as follows:
/mnt/home/vanant25/anaconda3/envs/topiary_resolved/bin/raxml-ng --start --msa alignment.phy --model LG --seed 3997117630 --threads 1 --tree pars{1}
Analysis options:
run mode: Starting tree generation
start tree(s): parsimony (1)
random seed: 3997117630
SIMD kernels: AVX2
parallelization: coarse-grained (auto), NONE/sequential
[00:00:00] Reading alignment from file: alignment.phy
[00:00:00] Loaded alignment with 2 taxa and 410 sites
ERROR: Your alignment contains less than 4 sequences!
ERROR: Alignment check failed (see details above)!
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/raxml/_raxml.py", line 189, in run_raxml
interface.launch(cmd,
File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/_private/interface.py", line 38, in wrapper
raise WrappedFunctionException(err) from e
topiary._private.interface.WrappedFunctionException:
Caught exception in function 'launch'. Returning to starting directory and cleaning up. Check error stack for cause of this error.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/_private/interface.py", line 32, in wrapper
value = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/raxml/model.py", line 260, in find_best_model
_generate_parsimony_tree(supervisor.alignment,
File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/raxml/model.py", line 45, in _generate_parsimony_tree
run_raxml(run_directory=run_directory,
File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/raxml/_raxml.py", line 197, in run_raxml
raise RuntimeError from e
RuntimeError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/_private/interface.py", line 32, in wrapper
value = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/pipeline/alignment_to_ancestors.py", line 323, in alignment_to_ancestors
topiary.find_best_model(df,
File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/_private/interface.py", line 38, in wrapper
raise WrappedFunctionException(err) from e
topiary._private.interface.WrappedFunctionException:
Caught exception in function 'find_best_model'. Returning to starting directory and cleaning up. Check error stack for cause of this error.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/_private/wrap.py", line 185, in wrap_function
ret = fcn(**fcn_args.__dict__)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/_private/interface.py", line 38, in wrapper
raise WrappedFunctionException(err) from e
topiary._private.interface.WrappedFunctionException:
Caught exception in function 'alignment_to_ancestors'. Returning to starting directory and cleaning up. Check error stack for cause of this error.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/bin/topiary-alignment-to-ancestors", line 26, in
Function alignment_to_ancestors raised an error.
To see command line help, run topiary-alignment-to-ancestors --help
(topiary_resolved) [vanant25@dev-intel16 topiary]$
Hi James,
Yep, the alignment is too small. This mini-dataset only has a few, nearly identical, sequences, and most of them are trimmed out during the quality control step. Maybe try going back upstream: run topiary-seed-to-alignment first, then feed its output into topiary-alignment-to-ancestors. That should BLAST and pull down many more sequences for your tree inference.
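One way to catch this before launching a long run is to look at the header of the PHYLIP file that gets passed to raxml-ng, which starts with the number of taxa and the number of sites. The sketch below is an assumption-laden illustration, not part of topiary: the function names are hypothetical, and the minimum of 4 is taken from the raxml-ng error message in the log above.

```python
MIN_TAXA = 4  # from the raxml-ng error: "contains less than 4 sequences"


def count_phylip_taxa(phylip_text):
    """Return (n_taxa, n_sites) from the header line of PHYLIP text."""
    header = phylip_text.lstrip().splitlines()[0].split()
    return int(header[0]), int(header[1])


def check_alignment_size(phylip_text, min_taxa=MIN_TAXA):
    """Raise ValueError if the alignment has too few sequences."""
    n_taxa, n_sites = count_phylip_taxa(phylip_text)
    if n_taxa < min_taxa:
        raise ValueError(
            f"alignment has {n_taxa} sequences; at least {min_taxa} are needed"
        )
    return n_taxa, n_sites
```

For the alignment in this log, the header would read `2 410`, so the check fails immediately with a message that names the real problem.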
Best,
Mike
On Mar 14, 2023, at 10:00 PM, James @.***> wrote:
Terminal Saved Output Mar 15.txt https://github.com/harmslab/topiary/files/10976158/Terminal.Saved.Output.Mar.15.txt I have attached the whole terminal session, but below is the relevant part. It says the issue is that my alignment is too small, and I'm not sure if there's a way to address this here or upstream.
(topiary_resolved) @.*** topiary]$ topiary-alignment-to-ancestors ER_Final_Alignment.csv --out_dir ER_ASR --num_threads 1
Non-microbial dataset detected. Gene/species tree reconciliation will be performed
Checking raxml-ng
installed: Y binary_path: /mnt/home/vanant25/anaconda3/envs/topiary_resolved/bin/raxml-ng binary runs: Y version: 1.1 minimum version: 1.1 passes: Y Checking generax
installed: Y binary_path: /mnt/home/vanant25/anaconda3/envs/topiary_resolved/bin/generax binary runs: Y version: 2.0.4 minimum version: 2.0 passes: Y Checking mpirun
installed: Y binary_path: /mnt/home/vanant25/anaconda3/envs/topiary_resolved/bin/mpirun binary runs: Y version: 4.1.5 minimum version: 0.0 passes: Y topiary is starting a find_best_model calculation in ./00_find-model:
Generating maximum parsimony tree.
Launching raxml-ng, 0:00:00.007415 (H:M:S)
topiary ran a find_best_model calculation in ./00_find-model:
Crashed after 0:00:00.021205 (H:M:S) Please check ./00_find-model/working Traceback (most recent call last): File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/_private/interface.py", line 32, in wrapper value = func(*args, **kwargs) ^^^^^^^^^^^^^^^^^^^^^ File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/_private/interface.py", line 336, in launch raise RuntimeError(err) RuntimeError: ERROR: /mnt/home/vanant25/anaconda3/envs/topiary_resolved/bin/raxml-ng returned 1
/mnt/home/vanant25/anaconda3/envs/topiary_resolved/bin/raxml-ng output
RAxML-NG v. 1.1 released on 29.11.2021 by The Exelixis Lab. Developed by: Alexey M. Kozlov and Alexandros Stamatakis. Contributors: Diego Darriba, Tomas Flouri, Benoit Morel, Sarah Lutteropp, Ben Bettisworth. Latest version: https://github.com/amkozlov/raxml-ng Questions/problems/suggestions? Please visit: https://groups.google.com/forum/#!forum/raxml
System: Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz, 28 cores, 125 GB RAM
RAxML-NG was called at 15-Mar-2023 00:48:29 as follows:
/mnt/home/vanant25/anaconda3/envs/topiary_resolved/bin/raxml-ng --start --msa alignment.phy --model LG --seed 3997117630 --threads 1 --tree pars{1}
Analysis options:
  run mode: Starting tree generation
  start tree(s): parsimony (1)
  random seed: 3997117630
  SIMD kernels: AVX2
  parallelization: coarse-grained (auto), NONE/sequential
[00:00:00] Reading alignment from file: alignment.phy
[00:00:00] Loaded alignment with 2 taxa and 410 sites
ERROR: Your alignment contains less than 4 sequences!
ERROR: Alignment check failed (see details above)!
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/raxml/_raxml.py", line 189, in run_raxml
    interface.launch(cmd,
  File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/_private/interface.py", line 38, in wrapper
    raise WrappedFunctionException(err) from e
topiary._private.interface.WrappedFunctionException:
Caught exception in function 'launch'. Returning to starting directory and cleaning up. Check error stack for cause of this error.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/_private/interface.py", line 32, in wrapper
    value = func(*args, **kwargs)
            ^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/raxml/model.py", line 260, in find_best_model
    _generate_parsimony_tree(supervisor.alignment,
  File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/raxml/model.py", line 45, in _generate_parsimony_tree
    run_raxml(run_directory=run_directory,
  File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/raxml/_raxml.py", line 197, in run_raxml
    raise RuntimeError from e
RuntimeError
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/_private/interface.py", line 32, in wrapper
    value = func(*args, **kwargs)
            ^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/pipeline/alignment_to_ancestors.py", line 323, in alignment_to_ancestors
    topiary.find_best_model(df,
  File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/_private/interface.py", line 38, in wrapper
    raise WrappedFunctionException(err) from e
topiary._private.interface.WrappedFunctionException:
Caught exception in function 'find_best_model'. Returning to starting directory and cleaning up. Check error stack for cause of this error.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/_private/wrap.py", line 185, in wrap_function
    ret = fcn(**fcn_args.__dict__)
          ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/_private/interface.py", line 38, in wrapper
    raise WrappedFunctionException(err) from e
topiary._private.interface.WrappedFunctionException:
Caught exception in function 'alignment_to_ancestors'. Returning to starting directory and cleaning up. Check error stack for cause of this error.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/bin/topiary-alignment-to-ancestors", line 26, in <module>
    main()
  File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/bin/topiary-alignment-to-ancestors", line 21, in main
    wrap_function(alignment_to_ancestors,
  File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/_private/wrap.py", line 189, in wrap_function
    raise RuntimeError(err) from e
RuntimeError:
Function alignment_to_ancestors raised an error.
To see command line help, run topiary-alignment-to-ancestors --help
(topiary_resolved) @.*** topiary]$
The input file was the output of seed_to_alignment, I thought. I used 05_clean-aligned-dataframe.csv as the input for ali_to_anc, without any cleaning.
Ah, I think I might understand. Did you include only one human sequence in there as a seed? If so, topiary is only looking for human/primate sequences, because the seed dataset specifies the taxonomic scope as human only. You'll want to add a sequence from another species to indicate the taxonomic scope to reconstruct (e.g., human to bony fishes, all mammals, etc.). We describe how to think about this here:
https://topiary-asr.readthedocs.io/en/latest/protocol.html#define-the-problem-doc
If that’s not what’s going on, we can definitely keep troubleshooting to find the bug.
Mike
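A quick pre-flight check on the seed file can catch this before any BLAST time is spent. The sketch below is not topiary code. It assumes only that the seed CSV has a `species` column, as in the seed file at the top of this issue, and `seed_species` and `check_scope` are hypothetical helper names.

```python
import csv
import io


def seed_species(csv_text):
    """Return the sorted set of distinct species named in a seed CSV."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return sorted({row["species"].strip() for row in reader if row.get("species")})


def check_scope(csv_text):
    """Raise if the seed names fewer than two species.

    With a single species the taxonomic scope collapses to that species,
    which is the situation described in this thread.
    """
    species = seed_species(csv_text)
    if len(species) < 2:
        raise ValueError(
            f"Seed names only {species}; add a sequence from a second "
            "species to set the taxonomic scope."
        )
    return species


# Made-up two-species seed, for illustration only.
seed = """species,name,aliases,sequence
Homo sapiens,ER_HUMAN,hER,MTMTLHTK
Danio rerio,ER_DANRE,zER,MYPKEEHS
"""
print(check_scope(seed))
```

Two species is the bare minimum for this check to pass; which second species to add depends on the ancestor you want to reconstruct, as the protocol page linked above explains.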
No, that's what I did. I was hoping Topiary would 'fill in' around that sequence, but it seems like it's looking for that to be the 'edge' of sequence space instead. I will have to redesign my experiment to incorporate this behavior.
Hopefully it works for you then. 🤞 Topiary fills in sequences within the species boundaries defined in the seed data frame. You basically need one more sequence in your seed data frame to start it going.
Mike
I've filled out the seed alignment and run into an error that I suspect is caused by the format of my seed alignment. I have attached the seed alignment. Do you recognize what might cause this?
  File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/opentree/util.py", line 69, in _validate_ott_or_species
    raise ValueError(err)
ValueError: Could not process ott None. Should be an integer or string with format ottINTEGER
Here is the full error stack:
(topiary_resolved) [vanant25@dev-intel18 topiary]$ topiary-seed-to-alignment ER_Seed.csv --out_dir ER_Align
Checking blastp
installed: Y
binary_path: /mnt/home/vanant25/anaconda3/envs/topiary_resolved/bin/blastp
binary runs: Y
version: 2.13.0+
minimum version: 2.0
passes: Y
Checking makeblastdb
installed: Y
binary_path: /mnt/home/vanant25/anaconda3/envs/topiary_resolved/bin/makeblastdb
binary runs: Y
version: 2.13.0+
minimum version: 2.0
passes: Y
Checking muscle
installed: Y
binary_path: /mnt/home/vanant25/anaconda3/envs/topiary_resolved/bin/muscle
binary runs: Y
version: 5.1.linux64
minimum version: 5.0
passes: Y
Building initial topiary dataframe.
Traceback (most recent call last):
  File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/opentree/util.py", line 65, in _validate_ott_or_species
    check_ott = int(check_ott)
                ^^^^^^^^^^^^^^
TypeError: int() argument must be a string, a bytes-like object or a real number, not 'NoneType'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/_private/interface.py", line 32, in wrapper
    value = func(*args, **kwargs)
            ^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/pipeline/seed_to_alignment.py", line 371, in seed_to_alignment
    out = topiary.df_from_seed(**kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/io/seed.py", line 312, in df_from_seed
    seed_df, key_species, paralog_patterns, species_aware = topiary.io.read_seed(seed_df,
                                                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/io/seed.py", line 126, in read_seed
    mrca = topiary.opentree.ott_to_mrca(ott_list=ott_list,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/opentree/util.py", line 426, in ott_to_mrca
    ott_list = _validate_ott_or_species(ott_list,species_list)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/opentree/util.py", line 69, in _validate_ott_or_species
    raise ValueError(err)
ValueError: Could not process ott None. Should be an integer or string with format ottINTEGER
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/_private/wrap.py", line 185, in wrap_function
    ret = fcn(**fcn_args.__dict__)
          ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/lib/python3.11/site-packages/topiary/_private/interface.py", line 38, in wrapper
    raise WrappedFunctionException(err) from e
topiary._private.interface.WrappedFunctionException:
Caught exception in function 'seed_to_alignment'. Returning to starting directory and cleaning up. Check error stack for cause of this error.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/mnt/home/vanant25/anaconda3/envs/topiary_resolved/bin/topiary-seed-to-alignment", line 26, in
Function seed_to_alignment raised an error.
To see command line help, run topiary-seed-to-alignment --help
Okay, it should work now. (Or, actually, it should fail now with a useful error). It turns out one of your species, Gulo gulo luscus, is not in the open tree of life database. Topiary was supposed to let you know this was the problem, but was choking on opentreeoflife output. I just pushed a change so it should now do so.
I suspect you want to replace "Gulo gulo luscus" with "Gulo gulo" (https://tree.opentreeoflife.org/taxonomy/browse?id=752563)
Best,
Mike
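Subspecies-style names like the one above can be flagged in a seed file before running the pipeline. The sketch below is only a heuristic and not something topiary does: it treats any species name with more than two words as a possible subspecies and suggests the two-word binomial. `suggest_binomial` is a hypothetical helper, and the suggestion still needs to be confirmed against the Open Tree of Life taxonomy, since some three-word names may well resolve there.

```python
def suggest_binomial(species):
    """Return the genus + species binomial if the name has extra words, else None.

    Purely a string heuristic; it does not query Open Tree of Life.
    """
    parts = species.split()
    if len(parts) > 2:
        return " ".join(parts[:2])
    return None


# Names taken from this thread.
for name in ["Gulo gulo luscus", "Homo sapiens"]:
    hint = suggest_binomial(name)
    if hint is not None:
        print(f"'{name}' may be a subspecies; consider '{hint}'")
    else:
        print(f"'{name}' looks like a binomial")
```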
I changed the species name in the seed alignment, which advanced me further than I have been able to get before. Unfortunately, the alignment hit a critical error again. I have uploaded what I think is the final alignment file that was used.
Terminal Saved Output_Topiary_Error.txt 03_shrunk-dataframe.csv