Scoary
IndexError: list index out of range
Hi, I'm new to using scoary and am running into an issue. Here is the full error that scoary gives me:
Traceback (most recent call last):
File "/home/hcm59/miniconda3/envs/scoary/bin/scoary", line 8, in <module>
sys.exit(main())
File "/home/hcm59/miniconda3/envs/scoary/lib/python3.9/site-packages/scoary/methods.py", line 278, in main
RES_and_GTC = Setup_results(genedic, traitsdic, args.collapse)
File "/home/hcm59/miniconda3/envs/scoary/lib/python3.9/site-packages/scoary/methods.py", line 914, in Setup_results
bh_c_p_v[s_p_v[len(s_p_v)-1][0]] = last_bh = s_p_v[len(s_p_v)-1][1]
IndexError: list index out of range
It seems to work up to this point, but then it stops and doesn't write any output files. I looked in the methods.py script but couldn't find anything obviously wrong. My inputs are the gene presence/absence output from Roary and a phenotype file, both comma-delimited, plus a Newick tree file from IQTree.
I found a previous issue that was similar (https://github.com/AdmiralenOla/Scoary/issues/23). It looks like their problem was that their Roary file was delimited with semicolons, but I'm 99% sure mine uses commas.
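For anyone who wants to rule out the delimiter quickly, here is a minimal sketch that counts candidate delimiters in a header line. The function name `guess_delimiter` and the sample header are made up for illustration and are not part of Scoary. Quoted fields that themselves contain commas can skew the counts, so treat the result as a hint only.

```python
def guess_delimiter(first_line, candidates=(",", ";", "\t")):
    """Return the candidate delimiter that occurs most often in a header line.

    Hypothetical helper, not part of Scoary. It simply counts characters,
    so commas inside quoted fields are counted as well.
    """
    counts = {d: first_line.count(d) for d in candidates}
    best = max(counts, key=counts.get)
    return best, counts


# Illustrative header, loosely modelled on a gene presence/absence table.
header = '"Gene";"Annotation";"No. isolates";"iso1";"iso2"'
delimiter, counts = guess_delimiter(header)
print(repr(delimiter), counts)
```

To check a real file, read its first line with `open(path).readline()` and pass that in.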
Any help is appreciated! I can send example files too.
Here's the script I used:
scoary -t /path/dog_verified_host_PhenoForScoary.csv \
  -g /path/gene_presence_absence_roary.csv \
  -o /path \
  -n /path/core_gene_alignment.aln-gb.nw \
  --delimiter , \
  --permute 1000 --threads 10
I'm using scoary in a conda environment that I built on a Linux server. Here are some specifications:
# packages in environment at /home/hcm59/miniconda3/envs/scoary:
#
# Name Version Build Channel
_libgcc_mutex 0.1 main
argparse 1.4.0 pypi_0 pypi
ca-certificates 2021.4.13 h06a4308_1
certifi 2020.12.5 py39h06a4308_0
ete3 3.1.2 pypi_0 pypi
ld_impl_linux-64 2.33.1 h53a641e_7
libffi 3.3 he6710b0_2
libgcc-ng 9.1.0 hdf63c60_0
libstdcxx-ng 9.1.0 hdf63c60_0
ncurses 6.2 he6710b0_1
numpy 1.20.2 pypi_0 pypi
openssl 1.1.1k h27cfd23_0
pip 21.0.1 py39h06a4308_0
python 3.9.2 hdb3f193_0
readline 8.1 h27cfd23_0
scipy 1.6.2 pypi_0 pypi
scoary 1.6.16 pypi_0 pypi
setuptools 52.0.0 py39h06a4308_0
six 1.15.0 py39h06a4308_0
sqlite 3.35.4 hdfb4753_0
tk 8.6.10 hbc83047_0
tzdata 2020f h52ac0ba_0
wheel 0.36.2 pyhd3eb1b0_0
xz 5.2.5 h7b6447c_0
zlib 1.2.11 h7b6447c_3
Thanks!! -Holly
Update: just found out we had used Panaroo, not Roary, so I will be looking into this and seeing if I can find a solution!!
Answering my own question, as almost a year later I ran into the same error and found my own question (ha!)
Basically, the issue is that we only had one value for a particular trait, so Scoary was like "I can't correct for multiple tests since there's only one."
Word to the wise: remove any traits that have fewer than 2 (I guess? tbd) values.
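To spot such traits before running Scoary, something like the sketch below could help. It assumes a comma-delimited traits table with isolate names in the first column and 0/1 values in the trait columns, and it reads "only one value" as "one of the two classes has fewer than `min_count` isolates"; that reading is my guess, not something confirmed from the Scoary source. The function names and the example data are hypothetical.

```python
import csv
import io
from collections import Counter


def summarize_traits(handle, delimiter=","):
    """Count how often each value occurs in every trait column."""
    reader = csv.reader(handle, delimiter=delimiter)
    header = next(reader)
    trait_names = header[1:]  # first column is assumed to hold isolate names
    counts = {name: Counter() for name in trait_names}
    for row in reader:
        for name, value in zip(trait_names, row[1:]):
            counts[name][value.strip()] += 1
    return counts


def flag_sparse_traits(counts, min_count=2):
    """Return traits where either class (0 or 1) has fewer than min_count isolates."""
    return [name for name, c in counts.items()
            if c["0"] < min_count or c["1"] < min_count]


# Made-up example table; replace with open("your_traits.csv", newline="").
example = io.StringIO(
    "Name,host_dog,host_cat\n"
    "iso1,1,0\n"
    "iso2,1,0\n"
    "iso3,0,1\n"
    "iso4,0,0\n"
)
counts = summarize_traits(example)
sparse = flag_sparse_traits(counts)
print(sparse)  # host_cat has only one isolate with value 1
```

Any trait that shows up in `sparse` is a candidate to drop or recode before rerunning.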
I have the same error as @hollygene when using the -n option but I am not sure why that is the case @mgalardini @AdmiralenOla
I have the same issue. Is there any solution?
Traceback (most recent call last):
File "/ibex/scratch/projects/c2078/conda/mambaforge/envs/scoary/bin/scoary", line 8, in <module>
sys.exit(main())
File "/ibex/scratch/projects/c2078/conda/mambaforge/envs/scoary/lib/python3.6/site-packages/scoary/methods.py", line 301, in main
delimiter=args.delimiter)
File "/ibex/scratch/projects/c2078/conda/mambaforge/envs/scoary/lib/python3.6/site-packages/scoary/methods.py", line 1001, in StoreResults
extracolstoprint, firstcolnames, time, delimiter)
File "/ibex/scratch/projects/c2078/conda/mambaforge/envs/scoary/lib/python3.6/site-packages/scoary/methods.py", line 1070, in StoreTraitResult
upgmatree = PruneForMissing(upgmatree, Prunedic[Traitname])
File "/ibex/scratch/projects/c2078/conda/mambaforge/envs/scoary/lib/python3.6/site-packages/scoary/methods.py", line 723, in PruneForMissing
tree[0] = PruneForMissing(tree[0], Prunedic)
File "/ibex/scratch/projects/c2078/conda/mambaforge/envs/scoary/lib/python3.6/site-packages/scoary/methods.py", line 725, in PruneForMissing
if isinstance(tree[1], list):
IndexError: list index out of range
@sydelstan and @arunprasanna83
Does your data have any phenotype that contains only 1 value? See my comment above about multiple test correction. I think if you ensure that each phenotype category has more than 1 data point, it should be okay. If there are still issues, it might be something else. I'd maybe try running a dummy file with only phenotypes that contain 5+ data points and see if that one works.
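One way to build that dummy file is sketched below. It assumes the same layout as before (comma-delimited, isolate names in the first column, 0/1 trait values, every row the same length) and keeps only traits with at least `min_count` isolates in each class. The helper `filter_traits` and the example table are hypothetical, not something Scoary provides.

```python
import csv
import io


def filter_traits(in_handle, out_handle, min_count=5, delimiter=","):
    """Write a copy of the traits table that keeps only well-populated traits.

    A trait is kept when both the 0 class and the 1 class have at least
    min_count isolates. Returns the names of the kept traits.
    """
    rows = list(csv.reader(in_handle, delimiter=delimiter))
    header, body = rows[0], rows[1:]
    keep = [0]  # always keep the isolate-name column
    for col in range(1, len(header)):
        values = [row[col].strip() for row in body]
        if values.count("0") >= min_count and values.count("1") >= min_count:
            keep.append(col)
    writer = csv.writer(out_handle, delimiter=delimiter, lineterminator="\n")
    for row in rows:
        writer.writerow([row[c] for c in keep])
    return [header[c] for c in keep[1:]]


# Made-up table: trait_a is split 5/5, trait_b has a single positive isolate.
lines = ["Name,trait_a,trait_b"]
for i in range(10):
    a = 1 if i < 5 else 0
    b = 1 if i == 0 else 0
    lines.append(f"iso{i},{a},{b}")
source = io.StringIO("\n".join(lines) + "\n")
dest = io.StringIO()
kept = filter_traits(source, dest, min_count=5)
print(kept)
```

With real files, pass `open("traits.csv", newline="")` and `open("traits_filtered.csv", "w", newline="")` in place of the two `StringIO` objects, then point `-t` at the filtered file.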
Hope this helps/makes sense!