augur
augur copied to clipboard
Augur refine produces a good tree file but a part built json file and throws error on finishing processes!
Current Behavior
Augur refine produces a good tree file but a part built json file and throws error on finishing processes (see below)?
Expected behavior
Produce outputs but not throw error.
How to reproduce
Steps to reproduce the current behavior:
- Running phylogenetic analyses o WGSf Enterovirus D68 samples. Unfortunately of the samples are unpublished and subject to privacy issues. The rest are scrapped from https://www.ncbi.nlm.nih.gov/labs/virus/vssi/#/virus?SeqType_s=Nucleotide
- Did some cleaning and combining of data set fasta and metadata in a jupyter notebook.
- Runs fine:
%%bash augur align --nthreads auto --sequences data/WGS_to_align.fasta --reference-name AY426531.1 --output data/alignment.fasta
- Runs fine:
%%bash augur tree --substitution-model auto --nthreads auto --alignment data/alignment.fasta --output data/tree.subs.nwk
- Produces output but throws error:
`%%bash
augur refine --alignment data/alignment.fasta --tree data/tree.subs.nwk --metadata data/full_metadata.csv --timetree --divergence-units mutations --output-tree data/tree.time.nwk --output-node-data data/refine.node.json --keep-root`
Error trace:
`augur refine is using TreeTime version 0.10.0
29.06 ###TreeTime.run: INITIAL ROUND
69.55 ###TreeTime.run: ITERATION 1 out of 2 iterations
69.57 DEPRECATION WARNING. TreeTime.resolve_polytomies: You are resolving
polytomies using the old 'greedy' mode. This is not well suited for large
polytomies. Stochastic resolution will become the default in future
versions. To switch now, rerun with the flag `--stochastic-resolve`. To
keep using the greedy method in the future, run with --`greedy-resolve`
129.15 ###TreeTime.run: ITERATION 2 out of 2 iterations
Inferred a time resolved phylogeny using TreeTime:
Sagulenko et al. TreeTime: Maximum-likelihood phylodynamic analysis
Virus Evolution, vol 4, https://academic.oup.com/ve/article/4/1/vex042/4794731
updated tree written to data[/tree.time.nwk](https://file+.vscode-resource.vscode-cdn.net/tree.time.nwk)
Traceback (most recent call last):
File "[/NetDrive/Labs/Duvvuri/Martins_VM_Programs/anaconda3/envs/omics/lib/python3.11/site-packages/augur/__init__.py](https://file+.vscode-resource.vscode-cdn.net/NetDrive/Labs/Duvvuri/Martins_VM_Programs/anaconda3/envs/omics/lib/python3.11/site-packages/augur/__init__.py)", line 66, in run
return args.__command__.run(args)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "[/NetDrive/Labs/Duvvuri/Martins_VM_Programs/anaconda3/envs/omics/lib/python3.11/site-packages/augur/refine.py](https://file+.vscode-resource.vscode-cdn.net/NetDrive/Labs/Duvvuri/Martins_VM_Programs/anaconda3/envs/omics/lib/python3.11/site-packages/augur/refine.py)", line 332, in run
write_json(node_data, node_data_fname)
File "[/NetDrive/Labs/Duvvuri/Martins_VM_Programs/anaconda3/envs/omics/lib/python3.11/site-packages/augur/utils.py](https://file+.vscode-resource.vscode-cdn.net/NetDrive/Labs/Duvvuri/Martins_VM_Programs/anaconda3/envs/omics/lib/python3.11/site-packages/augur/utils.py)", line 130, in write_json
json.dump(data, handle, indent=indent, sort_keys=sort_keys, cls=NumpyJSONEncoder)
File "[/NetDrive/Labs/Duvvuri/Martins_VM_Programs/anaconda3/envs/omics/lib/python3.11/json/__init__.py](https://file+.vscode-resource.vscode-cdn.net/NetDrive/Labs/Duvvuri/Martins_VM_Programs/anaconda3/envs/omics/lib/python3.11/json/__init__.py)", line 179, in dump
for chunk in iterable:
File "[/NetDrive/Labs/Duvvuri/Martins_VM_Programs/anaconda3/envs/omics/lib/python3.11/json/encoder.py](https://file+.vscode-resource.vscode-cdn.net/NetDrive/Labs/Duvvuri/Martins_VM_Programs/anaconda3/envs/omics/lib/python3.11/json/encoder.py)", line 432, in _iterencode
yield from _iterencode_dict(o, _current_indent_level)
File "[/NetDrive/Labs/Duvvuri/Martins_VM_Programs/anaconda3/envs/omics/lib/python3.11/json/encoder.py](https://file+.vscode-resource.vscode-cdn.net/NetDrive/Labs/Duvvuri/Martins_VM_Programs/anaconda3/envs/omics/lib/python3.11/json/encoder.py)", line 406, in _iterencode_dict
yield from chunks
File "[/NetDrive/Labs/Duvvuri/Martins_VM_Programs/anaconda3/envs/omics/lib/python3.11/json/encoder.py](https://file+.vscode-resource.vscode-cdn.net/NetDrive/Labs/Duvvuri/Martins_VM_Programs/anaconda3/envs/omics/lib/python3.11/json/encoder.py)", line 406, in _iterencode_dict
yield from chunks
File "[/NetDrive/Labs/Duvvuri/Martins_VM_Programs/anaconda3/envs/omics/lib/python3.11/json/encoder.py](https://file+.vscode-resource.vscode-cdn.net/NetDrive/Labs/Duvvuri/Martins_VM_Programs/anaconda3/envs/omics/lib/python3.11/json/encoder.py)", line 406, in _iterencode_dict
yield from chunks
File "[/NetDrive/Labs/Duvvuri/Martins_VM_Programs/anaconda3/envs/omics/lib/python3.11/json/encoder.py](https://file+.vscode-resource.vscode-cdn.net/NetDrive/Labs/Duvvuri/Martins_VM_Programs/anaconda3/envs/omics/lib/python3.11/json/encoder.py)", line 439, in _iterencode
o = _default(o)
^^^^^^^^^^^
File "[/NetDrive/Labs/Duvvuri/Martins_VM_Programs/anaconda3/envs/omics/lib/python3.11/site-packages/augur/utils.py](https://file+.vscode-resource.vscode-cdn.net/NetDrive/Labs/Duvvuri/Martins_VM_Programs/anaconda3/envs/omics/lib/python3.11/site-packages/augur/utils.py)", line 142, in default
return super().default(obj)
^^^^^^^^^^^^^^^^^^^^
File "[/NetDrive/Labs/Duvvuri/Martins_VM_Programs/anaconda3/envs/omics/lib/python3.11/json/encoder.py](https://file+.vscode-resource.vscode-cdn.net/NetDrive/Labs/Duvvuri/Martins_VM_Programs/anaconda3/envs/omics/lib/python3.11/json/encoder.py)", line 180, in default
raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type Series is not JSON serializable
An error occurred (see above) that has not been properly handled by Augur.
To report this, please open a new issue including the original command and the error above:
<[https://github.com/nextstrain/augur/issues/new/choose>](https://github.com/nextstrain/augur/issues/new/choose%3E)
---------------------------------------------------------------------------
CalledProcessError Traceback (most recent call last)
Cell In[30], line 1
----> 1 get_ipython().run_cell_magic('bash', '', 'augur refine --alignment data[/alignment.fasta](https://file+.vscode-resource.vscode-cdn.net/alignment.fasta) --tree data[/tree.subs.nwk](https://file+.vscode-resource.vscode-cdn.net/tree.subs.nwk) --metadata data[/full_metadata.csv](https://file+.vscode-resource.vscode-cdn.net/full_metadata.csv) --timetree --divergence-units mutations --output-tree data[/tree.time.nwk](https://file+.vscode-resource.vscode-cdn.net/tree.time.nwk) --output-node-data data[/refine.node.json](https://file+.vscode-resource.vscode-cdn.net/refine.node.json) --keep-root\n')
File [/NetDrive/Labs/Duvvuri/Martins_VM_Programs/anaconda3/envs/omics/lib/python3.11/site-packages/IPython/core/interactiveshell.py:2478](https://file+.vscode-resource.vscode-cdn.net/NetDrive/Labs/Duvvuri/Martins_VM_Programs/anaconda3/envs/omics/lib/python3.11/site-packages/IPython/core/interactiveshell.py:2478), in InteractiveShell.run_cell_magic(self, magic_name, line, cell)
2476 with self.builtin_trap:
2477 args = (magic_arg_s, cell)
-> 2478 result = fn(*args, **kwargs)
2480 # The code below prevents the output from being displayed
2481 # when using magics with decodator @output_can_be_silenced
2482 # when the last Python token in the expression is a ';'.
2483 if getattr(fn, magic.MAGIC_OUTPUT_CAN_BE_SILENCED, False):
File [/NetDrive/Labs/Duvvuri/Martins_VM_Programs/anaconda3/envs/omics/lib/python3.11/site-packages/IPython/core/magics/script.py:154](https://file+.vscode-resource.vscode-cdn.net/NetDrive/Labs/Duvvuri/Martins_VM_Programs/anaconda3/envs/omics/lib/python3.11/site-packages/IPython/core/magics/script.py:154), in ScriptMagics._make_script_magic..named_script_magic(line, cell)
152 else:
153 line = script
--> 154 return self.shebang(line, cell)
File [/NetDrive/Labs/Duvvuri/Martins_VM_Programs/anaconda3/envs/omics/lib/python3.11/site-packages/IPython/core/magics/script.py:314](https://file+.vscode-resource.vscode-cdn.net/NetDrive/Labs/Duvvuri/Martins_VM_Programs/anaconda3/envs/omics/lib/python3.11/site-packages/IPython/core/magics/script.py:314), in ScriptMagics.shebang(self, line, cell)
309 if args.raise_error and p.returncode != 0:
310 # If we get here and p.returncode is still None, we must have
311 # killed it but not yet seen its return code. We don't wait for it,
312 # in case it's stuck in uninterruptible sleep. -9 = SIGKILL
313 rc = p.returncode or -9
--> 314 raise CalledProcessError(rc, cell)
CalledProcessError: Command 'b'augur refine --alignment data[/alignment.fasta](https://file+.vscode-resource.vscode-cdn.net/alignment.fasta) --tree data[/tree.subs.nwk](https://file+.vscode-resource.vscode-cdn.net/tree.subs.nwk) --metadata data[/full_metadata.csv](https://file+.vscode-resource.vscode-cdn.net/full_metadata.csv) --timetree --divergence-units mutations --output-tree data[/tree.time.nwk](https://file+.vscode-resource.vscode-cdn.net/tree.time.nwk) --output-node-data data[/refine.node.json](https://file+.vscode-resource.vscode-cdn.net/refine.node.json) --keep-root\n'' returned non-zero exit status 2.
Note the Json file output is only partly built:
{
"alignment": "data/alignment.fasta",
"clock": {
"intercept": -5.886227443907411,
"rate": 0.0030026231912939766,
"rtt_Tmrca": 1960.3616800717339
},
"generated_by": {
"program": "augur",
"version": "22.0.0"
},
"input_tree": "data/tree.subs.nwk",
"nodes": {
"AY426531.1": {
"branch_length": 5,
"clock_length": 0.0007139060687499921,
"date": "1962-01-01",
"mutation_length": 5,
"numdate": 1962.0013698630137,
"raw_date": "1962-01-01"
},
"JX070222.1": {
"branch_length": 28,
"clock_length": 0.005060543677262781,
"date": "2010-06-23",
"mutation_length": 28,
"numdate": 2010.4753424657533,
"raw_date":
Versions of software running on VM:
- Linux Operating system: NAME="Pop!_OS" VERSION="20.04 LTS" ID=pop ID_LIKE="ubuntu debian" PRETTY_NAME="Pop!_OS 20.04 LTS" VERSION_ID="20.04" HOME_URL="https://pop.system76.com" SUPPORT_URL="https://support.system76.com" BUG_REPORT_URL="https://github.com/pop-os/pop/issues" PRIVACY_POLICY_URL="https://system76.com/privacy" VERSION_CODENAME=focal UBUNTU_CODENAME=focal LOGO=distributor-logo-pop-os
- Augur version 22.0.0
- Mafft version 7.505 (2022/Apr/10)
- IQtree version 2.0.3 for Linux 64-bit built Dec 20 2020
- Treetime version 0.10.0
- Jupyter: IPython : 8.13.1 ipykernel : 6.19.2 ipywidgets : not installed jupyter_client : 8.2.0 jupyter_core : 5.3.0 jupyter_server : 1.23.4 jupyterlab : 3.5.3 nbclient : 0.7.4 nbconvert : 6.5.4 nbformat : 5.8.0 notebook : 6.5.4 qtconsole : 5.4.2 traitlets : 5.9.0
Hi @m-d-grunnill,
This looks like a bug indeed. The relevant error is:
File "[/NetDrive/Labs/Duvvuri/Martins_VM_Programs/anaconda3/envs/omics/lib/python3.11/site-packages/augur/utils.py](https://file+.vscode-resource.vscode-cdn.net/NetDrive/Labs/Duvvuri/Martins_VM_Programs/anaconda3/envs/omics/lib/python3.11/site-packages/augur/utils.py)", line 142, in default
return super().default(obj)
^^^^^^^^^^^^^^^^^^^^
File "[/NetDrive/Labs/Duvvuri/Martins_VM_Programs/anaconda3/envs/omics/lib/python3.11/json/encoder.py](https://file+.vscode-resource.vscode-cdn.net/NetDrive/Labs/Duvvuri/Martins_VM_Programs/anaconda3/envs/omics/lib/python3.11/json/encoder.py)", line 180, in default
raise TypeError(f'Object of type {o.class.name} '
TypeError: Object of type Series is not JSON serializable
An error occurred (see above) that has not been properly handled by Augur.
To report this, please open a new issue including the original command and the error above:
<[https://github.com/nextstrain/augur/issues/new/choose>](https://github.com/nextstrain/augur/issues/new/choose%3E)
I noticed you're using Python 3.11, which we don't exclude support from but also haven't done much testing on. This issue could be due to that Python version and/or the recent TreeTime update included with Augur 22.0.0.
A few things:
- Would it be possible for you to share your input files so I can try to reproduce this locally?
- Do you know if this worked for versions earlier than 22.0.0?
- If this issue is due to new versions, potential workarounds include:
- Using Python 3.10 or earlier
- Using Augur 21.1.0 or earlier
I have removed sensitive information and sequences. The compressed folder contains a notebook that produces the error.
I had the same issue with Augur 21.?.? thought it might be fixed if I upgraded to 22.0.0, to no avail.
@m-d-grunnill thanks for sending that. The notebook is taking a long time on the IQ-TREE step, but I was able to reproduce the same error in other ways and created the PR above as a fix.
Dear @victorlin Is there a possible work around I could employ. I noticed your pull request has passed what I think are unit test (little knowledge in the area) but has some conflict with other branches. If I fetch/download your pull request version of the code, could I install that version and use it as a temporary fix. Then replace that installed pull request version with a later version of augur when it is released with the bug fix at a later date.
@rneher I'm just tagging you here as I know you've made some changes to Treetime recently, and was wondering if these may be related? My recent EV-D68 runs have been using treetime 0.8.6 and it has worked without error (I realise this leaves plenty of room in other areas for issues, but thought it might be worth adding).
If anybody else is waiting on the fix https://github.com/nextstrain/augur/pull/1213 to pass review and be included in a release, a temporary solution can be found by downgrading augur to 14.0.0 and treetime to 0.8.6 (see https://github.com/nextstrain/enterovirus_d68/issues/1 ).
@m-d-grunnill sorry for the delay here, I was out on vacation! I've just merged the fix and will release it as version 22.0.3 now.
@m-d-grunnill I did a deeper investigation of the code today and am now thinking this error may be due to something unexpected with the user data, not augur refine
.
I tried running your provided example but was unsuccessful since the zip is missing the file data/full_metadata.csv
.
If possible, could you send a version of data/full_metadata.csv
with sensitive rows/columns removed? I'm hoping the error is still reproducible so I can see whether it is a problem with user data or augur refine
.
@m-d-grunnill for reproduction purposes it'd be great to have all the inputs to the augur refine
call:
-
data/alignment.fasta
-
data/tree.subs.nwk
-
data/full_metadata.csv
augur refine \
--alignment data/alignment.fasta \
--tree data/tree.subs.nwk \
--metadata data/full_metadata.csv \
--timetree \
--divergence-units mutations \
--output-tree data/tree.time.nwk \
--output-node-data data/refine.node.json \
--keep-root
If there's any sensitive stuff, you can try to cut out columns/rows, or replace with dummies, as long as you can verify that you still get the error with the redacted data. Alternatively, you can send us the data via email.