arctic
Persisting time series to VersionStore with timezone fails
Arctic Version
1.75
Arctic Store
VersionStore
Platform and version
Windows 10
Description of problem and/or code sample that reproduces the issue
Can we please reopen #727?
Code snippet to reproduce:
#!/usr/bin/env python
import logging
import arctic
import pymongo
import pandas as pd
import numpy as np
import pytz
import tsg.core.tsArctic as ts_arctic
if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    logging.info("Arctic version: {}".format(arctic.__version__))
    logging.info("Pymongo version: {}".format(pymongo.__version__))
    logging.info("Pandas version: {}".format(pd.__version__))
    logging.info("NumPy version: {}".format(np.__version__))
    logging.info("pytz version: {}".format(pytz.__version__))
    cxn = ts_arctic.get_arctic_connection()
    utc_idx = pd.date_range('2019-10-16 00:00:00', periods=6, freq='H', tz='UTC')
    utc_data = pd.Series([1, 2, 3, 4, 5, 6], index=utc_idx)
    cxn['hennip-vs'].write('tz-bug-symbol', utc_data)
Logging and stack trace:
INFO:root:Arctic version: 1.75.0
INFO:root:Pymongo version: 3.7.2
INFO:root:Pandas version: 0.24.2
INFO:root:NumPy version: 1.16.2
INFO:root:pytz version: 2018.7
INFO:arctic.arctic:Connecting to mongo: *REDACTED*
INFO:arctic.arctic:Mongo Quota: arctic.hennip-vs 0.000 / 10 GB used
INFO:arctic.serialization.numpy_records:Index has no name, defaulting to 'index'
INFO:arctic.serialization.numpy_records:Series has no name, defaulting to 'values'
Traceback (most recent call last):
File "arctic_vs_tz_bug.py", line 25, in <module>
cxn['hennip-vs'].write('tz-bug-symbol', utc_data)
File "C:\miniconda\envs\ts-venv-prod\lib\site-packages\arctic\decorators.py", line 49, in f_retry
return f(*args, **kwargs)
File "C:\miniconda\envs\ts-venv-prod\lib\site-packages\arctic\store\version_store.py", line 664, in write
self._insert_version(version)
File "C:\miniconda\envs\ts-venv-prod\lib\site-packages\arctic\store\version_store.py", line 529, in _insert_version
mongo_retry(self._versions.insert_one)(version)
File "C:\miniconda\envs\ts-venv-prod\lib\site-packages\arctic\decorators.py", line 49, in f_retry
return f(*args, **kwargs)
File "C:\miniconda\envs\ts-venv-prod\lib\site-packages\pymongo\collection.py", line 693, in insert_one
session=session),
File "C:\miniconda\envs\ts-venv-prod\lib\site-packages\pymongo\collection.py", line 607, in _insert
bypass_doc_val, session)
File "C:\miniconda\envs\ts-venv-prod\lib\site-packages\pymongo\collection.py", line 595, in _insert_one
acknowledged, _insert_command, session)
File "C:\miniconda\envs\ts-venv-prod\lib\site-packages\pymongo\mongo_client.py", line 1248, in _retryable_write
return self._retry_with_session(retryable, func, s, None)
File "C:\miniconda\envs\ts-venv-prod\lib\site-packages\pymongo\mongo_client.py", line 1201, in _retry_with_session
return func(session, sock_info, retryable)
File "C:\miniconda\envs\ts-venv-prod\lib\site-packages\pymongo\collection.py", line 590, in _insert_command
retryable_write=retryable_write)
File "C:\miniconda\envs\ts-venv-prod\lib\site-packages\pymongo\pool.py", line 584, in command
self._raise_connection_failure(error)
File "C:\miniconda\envs\ts-venv-prod\lib\site-packages\pymongo\pool.py", line 745, in _raise_connection_failure
raise error
File "C:\miniconda\envs\ts-venv-prod\lib\site-packages\pymongo\pool.py", line 579, in command
unacknowledged=unacknowledged)
File "C:\miniconda\envs\ts-venv-prod\lib\site-packages\pymongo\network.py", line 114, in command
codec_options, ctx=compression_ctx)
File "C:\miniconda\envs\ts-venv-prod\lib\site-packages\pymongo\message.py", line 679, in _op_msg
flags, command, identifier, docs, check_keys, opts)
bson.errors.InvalidDocument: Cannot encode object: <UTC>
I'm also having this problem. Note that the unit test referred to in #727 does not test UTC; it uses "America/Chicago". The test code is below for quick reference:
def test_save_read_pandas_series_with_datetimeindex_with_timezone(library):
    df = Series(data=['A', 'BC', 'DEF'],
                index=DatetimeIndex(np.array([dt(2013, 1, 1),
                                              dt(2013, 1, 2),
                                              dt(2013, 1, 3)]).astype('datetime64[ns]'),
                                    tz="America/Chicago"))
    library.write('pandas', df)
    saved_df = library.read('pandas').data
    assert df.index.tz == saved_df.index.tz
    assert all(df.index == saved_df.index)
My error is similar to the one above. It occurs when using:
data.index = pd.to_datetime(data.index, format="%Y-%m-%dT%H:%M:%S.%f", utc=True)
arctic_library.write(symbol, data, prune_previous_version=True)
bson.errors.InvalidDocument: cannot encode object: <UTC>, of type: <class 'pytz.UTC'>
@derekwong9 Do you know of a workaround for using a UTC datetime? I'm getting the same error:
cannot encode object: < UTC>, of type: <class 'pytz.UTC'>
I just convert to UTC, strip the timezone, and store it. When I read it back, I apply the UTC timezone again. Basically, it's a simple wrapper around the read and write functions. There hasn't been an easier way that I know of.
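For anyone landing here, the wrapper described above might look roughly like the sketch below. The helper names (to_naive_utc, to_aware_utc, write_utc, read_utc) are made up for illustration and are not part of Arctic; the only Arctic calls assumed are library.write(symbol, data) and library.read(symbol).data, as used elsewhere in this thread. Note that the original timezone is not stored, so a non-UTC index comes back as UTC.

```python
import pandas as pd


def to_naive_utc(obj):
    """Return a copy whose DatetimeIndex is converted to UTC and made tz-naive."""
    out = obj.copy()
    if out.index.tz is not None:
        # Convert to UTC first so the wall-clock values are UTC, then drop the tz.
        out.index = out.index.tz_convert("UTC").tz_localize(None)
    return out


def to_aware_utc(obj):
    """Return a copy whose tz-naive DatetimeIndex is labelled as UTC again."""
    out = obj.copy()
    if out.index.tz is None:
        out.index = out.index.tz_localize("UTC")
    return out


def write_utc(library, symbol, data, **kwargs):
    # `library` is assumed to be a VersionStore library handle (hypothetical wrapper).
    return library.write(symbol, to_naive_utc(data), **kwargs)


def read_utc(library, symbol, **kwargs):
    # VersionStore.read returns an item whose payload is on `.data`.
    return to_aware_utc(library.read(symbol, **kwargs).data)
```

With these, write_utc(lib, 'tz-bug-symbol', utc_data) would store a tz-naive index, and read_utc(lib, 'tz-bug-symbol') would hand back a UTC-aware one.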
@derekwong9 Thanks, I used:
date = pd.to_datetime(tickdict['data'][2], unit='ns')
date = date.replace(tzinfo=None)  # replace() returns a new Timestamp, so reassign
Works perfectly now!