DendroPy
Maximum recursion depth error
Hi @jeetsukumaran,
Before I continue reporting on this error, just want to say a big thanks for putting out this library. It's come in really useful for a few papers we've written.
For a project that we're working on in which we are analyzing a large phylogenetic tree dataset, we are getting a RecursionError: maximum recursion depth exceeded. Not sure where this is coming from, but will do my best to illustrate what's going on here.
First off, the traceback looks like this:
Traceback (most recent call last):
File "isolate_pds.py", line 14, in <module>
t = Tree.get(path=tree_filename, schema='nexus')
File "/home/ericmjl/anaconda/envs/h5n2/lib/python3.5/site-packages/dendropy/datamodel/treemodel.py", line 2730, in get
return cls._get_from(**kwargs)
File "/home/ericmjl/anaconda/envs/h5n2/lib/python3.5/site-packages/dendropy/datamodel/basemodel.py", line 155, in _get_from
return cls.get_from_path(src=src, schema=schema, **kwargs)
File "/home/ericmjl/anaconda/envs/h5n2/lib/python3.5/site-packages/dendropy/datamodel/basemodel.py", line 218, in get_from_path
**kwargs)
File "/home/ericmjl/anaconda/envs/h5n2/lib/python3.5/site-packages/dendropy/datamodel/treemodel.py", line 2635, in _parse_and_create_from_stream
global_annotations_target=None)
File "/home/ericmjl/anaconda/envs/h5n2/lib/python3.5/site-packages/dendropy/dataio/ioservice.py", line 362, in read_tree_lists
global_annotations_target=global_annotations_target)
File "/home/ericmjl/anaconda/envs/h5n2/lib/python3.5/site-packages/dendropy/dataio/nexusreader.py", line 362, in _read
self._parse_nexus_stream(stream)
File "/home/ericmjl/anaconda/envs/h5n2/lib/python3.5/site-packages/dendropy/dataio/nexusreader.py", line 567, in _parse_nexus_stream
self._parse_trees_block()
File "/home/ericmjl/anaconda/envs/h5n2/lib/python3.5/site-packages/dendropy/dataio/nexusreader.py", line 1115, in _parse_trees_block
taxon_symbol_mapper=taxon_symbol_mapper)
File "/home/ericmjl/anaconda/envs/h5n2/lib/python3.5/site-packages/dendropy/dataio/nexusreader.py", line 995, in _parse_tree_statement
tree = self._build_tree_from_newick_tree_string(tree_factory, taxon_symbol_mapper)
File "/home/ericmjl/anaconda/envs/h5n2/lib/python3.5/site-packages/dendropy/dataio/nexusreader.py", line 1017, in _build_tree_from_newick_tree_string
taxon_symbol_map_fn=taxon_symbol_mapper.require_taxon_for_symbol)
File "/home/ericmjl/anaconda/envs/h5n2/lib/python3.5/site-packages/dendropy/dataio/newickreader.py", line 374, in _parse_tree_statement
is_internal_node=None)
File "/home/ericmjl/anaconda/envs/h5n2/lib/python3.5/site-packages/dendropy/dataio/newickreader.py", line 550, in _parse_tree_node_description
is_internal_node=is_new_internal_node,
The last frame (line 550 in _parse_tree_node_description) is repeated many, many times, until we get the final lines:
File "/home/ericmjl/anaconda/envs/h5n2/lib/python3.5/site-packages/dendropy/dataio/newickreader.py", line 541, in _parse_tree_node_description
new_node = tree.node_factory();
File "/home/ericmjl/anaconda/envs/h5n2/lib/python3.5/site-packages/dendropy/datamodel/treemodel.py", line 2999, in node_factory
return Node(**kwargs)
File "/home/ericmjl/anaconda/envs/h5n2/lib/python3.5/site-packages/dendropy/datamodel/treemodel.py", line 1002, in __init__
length=kwargs.pop("edge_length", None))
File "/home/ericmjl/anaconda/envs/h5n2/lib/python3.5/site-packages/dendropy/datamodel/treemodel.py", line 746, in __init__
basemodel.DataObject.__init__(self, label=kwargs.pop("label", None))
RecursionError: maximum recursion depth exceeded
My computing environment is as follows:
$ conda list
# packages in environment at /home/ericmjl/anaconda/envs/h5n2:
#
cycler 0.10.0 py35_0 defaults
dendropy 4.1.0 <pip>
fontconfig 2.11.1 5 defaults
freetype 2.5.5 0 defaults
icu 56.1 2 conda-forge
libiconv 1.14 1 conda-forge
libpng 1.6.17 0 defaults
libxml2 2.9.3 6 conda-forge
matplotlib 1.5.1 np111py35_0 defaults
mkl 11.3.1 0 defaults
numpy 1.11.0 py35_0 defaults
openssl 1.0.2h 0 defaults
pandas 0.18.1 np111py35_0 defaults
pip 8.1.1 py35_1 defaults
pyparsing 2.1.1 py35_0 defaults
pyqt 4.11.4 py35_1 defaults
python 3.5.1 0 defaults
python-dateutil 2.5.3 py35_0 defaults
pytz 2016.4 py35_0 defaults
qt 4.8.7 1 defaults
readline 6.2 2 defaults
setuptools 20.7.0 py35_0 defaults
sip 4.16.9 py35_0 defaults
six 1.10.0 py35_0 defaults
sqlite 3.11.1 0 conda-forge
tk 8.5.18 0 defaults
tqdm 4.5.0 <pip>
wheel 0.29.0 py35_0 defaults
xz 5.0.5 1 defaults
zlib 1.2.8 0 defaults
Would this be an issue with the input tree file? We are happy to provide a copy of the Nexus tree, as well as the script, for further testing.
Hi, we probably need to rewrite that reader so that it does not use recursion, in order to handle huge trees. Most trees are balanced enough that (even if they are huge) they don't hit Python's recursion limit.
Until we do that, a workaround should be to call sys.setrecursionlimit to raise the recursion limit. You'll need to do this in the first few lines of the program, I think.
import sys
sys.setrecursionlimit(1500)
or some other large number. That will increase the memory footprint of the process.
If that does not work, then it might indeed be a problem with the tree string. If you want to email me the tree, I can try it out with other tools.
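The workaround above can be sketched a little more defensively. This is only an illustration: recursion_limit and nesting_depth are hypothetical helpers, not part of DendroPy. The traceback shows _parse_tree_node_description calling itself, so the parenthesis nesting of the tree string should be a rough guide to how deep the parser recurses. The depth count is a simplification that ignores quoted labels and comments.

```python
import sys
from contextlib import contextmanager


@contextmanager
def recursion_limit(limit):
    """Temporarily raise the recursion limit (hypothetical helper, not DendroPy API)."""
    previous = sys.getrecursionlimit()
    # Never lower the limit below what the interpreter already allows.
    sys.setrecursionlimit(max(limit, previous))
    try:
        yield
    finally:
        sys.setrecursionlimit(previous)


def nesting_depth(newick):
    """Return the maximum parenthesis nesting in a Newick string.

    Simplification: parentheses inside quoted labels or comments are counted too.
    """
    depth = deepest = 0
    for ch in newick:
        if ch == "(":
            depth += 1
            deepest = max(deepest, depth)
        elif ch == ")":
            depth -= 1
    return deepest


# Possible usage (the margin of 1000 extra frames is an assumption, not a measured value):
# with recursion_limit(nesting_depth(tree_text) + 1000):
#     t = Tree.get(path=tree_filename, schema='nexus')
```

Restoring the old limit afterwards keeps the rest of the program protected against runaway recursion elsewhere.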
@mtholder: thanks for the suggestion! I'm going to try experimenting with increased recursion limits to see what works for our use case.
@mtholder: increasing the recursion limit worked! Thank you for the tip.
Meanwhile, I will check in with my colleagues to see if I can share the trees with you - assuming you'd still like to have one large tree on hand to use?
Hi Eric,
Glad that it worked out. Thanks for the offer of the large tree. It would probably be more efficient to communicate the size of the tree (i.e., how many tips) rather than the tree itself. That way, we can generate an arbitrary number of arbitrarily large trees of the appropriate size, and under different shape regimes, when we begin developing a non-recursive algorithm to parse trees. Either way, we appreciate it!
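As a rough sketch of what such a test might look like (this is not DendroPy's reader, and every name here is hypothetical): caterpillar_newick builds a maximally unbalanced tree string of a given size, and parse_newick_iterative reads it with an explicit stack instead of recursion. The parser assumes plain labels with optional branch lengths, and handles no quoted labels or comments.

```python
def caterpillar_newick(n_tips):
    """Build a maximally unbalanced (ladder-shaped) Newick string with n_tips tips."""
    if n_tips < 2:
        raise ValueError("need at least two tips")
    parts = ["(" * (n_tips - 1), "t1"]
    for i in range(2, n_tips + 1):
        parts.append(",t%d)" % i)
    parts.append(";")
    return "".join(parts)


def parse_newick_iterative(text):
    """Parse a simple Newick string without recursion.

    Returns the root as a dict with "label", "length" and "children" keys.
    """
    def new_node():
        return {"label": None, "length": None, "children": []}

    def flush(node, token):
        # Assign any pending "label:length" text to the given node.
        raw = "".join(token).strip()
        del token[:]
        if raw:
            label, _, length = raw.partition(":")
            if label:
                node["label"] = label
            if length:
                node["length"] = float(length)

    root = new_node()
    current = root
    stack = []   # ancestors of the node currently being read
    token = []   # characters of the label/length being read
    for ch in text:
        if ch == "(":
            # Descend: open the first child of the current node.
            child = new_node()
            current["children"].append(child)
            stack.append(current)
            current = child
        elif ch == ",":
            # Finish this node and start a sibling under the same parent.
            flush(current, token)
            sibling = new_node()
            stack[-1]["children"].append(sibling)
            current = sibling
        elif ch == ")":
            # Finish the last child and return to the parent.
            flush(current, token)
            current = stack.pop()
        elif ch == ";":
            flush(current, token)
            break
        else:
            token.append(ch)
    return root


def count_tips(root):
    """Count leaves with an explicit stack, again avoiding recursion."""
    count = 0
    pending = [root]
    while pending:
        node = pending.pop()
        if node["children"]:
            pending.extend(node["children"])
        else:
            count += 1
    return count
```

A ladder of 50,000 tips nests 49,999 levels deep, far beyond the default recursion limit, yet the stack-based loop should handle it without touching sys.setrecursionlimit.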
Jeet Sukumaran
Hi @jeetsukumaran, thanks for helping out! Our trees were on the order of ~50,000 taxa. I was doing patristic distance computations on them, figuring out which were the phylogenetically closest taxa to some taxa of interest - and that's where DendroPy came in really handy. Thanks for writing this library! 😄
No problem.
Just out of curiosity, were you using the PhylogeneticDistanceMatrix class or your own custom calculations specifically for the taxa of interest? If the former, what were the performance times?
We wrote custom calculations, because computing all pairwise patristic distances was taking a long time (more than a few hours; I never let the computation finish).
Ok, thanks. The PDM class could probably be optimized somewhat, but it is still going to do all the pairwise distances, so custom calcs are the way to go if you just need a few comparisons.
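For readers wondering what a custom calculation for a few comparisons might look like, here is a minimal sketch. It is not the code used in this thread and not DendroPy's API. It assumes the tree has been reduced to two plain dicts (hypothetical names parent and length) that map each node to its parent and to the length of the edge above it. Each query walks from the two tips up to their common ancestor, so its cost depends on tree depth rather than on the number of tip pairs.

```python
def path_to_root(node, parent, length):
    """Map each ancestor of `node` (including itself) to its distance from `node`."""
    distances = {}
    total = 0.0
    while node is not None:
        distances[node] = total
        total += length.get(node, 0.0)
        node = parent.get(node)
    return distances


def patristic_distance(a, b, parent, length):
    """Sum of edge lengths on the path between tips a and b."""
    from_a = path_to_root(a, parent, length)
    total = 0.0
    node = b
    while node is not None:
        if node in from_a:
            # First shared ancestor: add the two partial path lengths.
            return from_a[node] + total
        total += length.get(node, 0.0)
        node = parent.get(node)
    raise ValueError("the two tips do not share a root")


# Toy tree ((A:1,B:2)X:0.5,C:3)R; written as parent and edge-length maps.
parent = {"A": "X", "B": "X", "X": "R", "C": "R"}
length = {"A": 1.0, "B": 2.0, "X": 0.5, "C": 3.0}
print(patristic_distance("A", "B", parent, length))  # 3.0
print(patristic_distance("A", "C", parent, length))  # 4.5
```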