Persisting time series to VersionStore with timezone fails

Open PatternMatching opened this issue 5 years ago • 4 comments

Arctic Version

1.75

Arctic Store

VersionStore

Platform and version

Windows 10

Description of problem and/or code sample that reproduces the issue

Can we please reopen #727?

Code snippet to reproduce:

#!/usr/bin/env python

import logging

import arctic
import pymongo
import pandas as pd
import numpy as np
import pytz

import tsg.core.tsArctic as ts_arctic


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    logging.info("Arctic version: {}".format(arctic.__version__))
    logging.info("Pymongo version: {}".format(pymongo.__version__))
    logging.info("Pandas version: {}".format(pd.__version__))
    logging.info("NumPy version: {}".format(np.__version__))
    logging.info("pytz version: {}".format(pytz.__version__))

    cxn = ts_arctic.get_arctic_connection()
    utc_idx = pd.date_range('2019-10-16 00:00:00', periods=6, freq='H', tz='UTC')
    utc_data = pd.Series([1, 2, 3, 4, 5, 6], index=utc_idx)
    cxn['hennip-vs'].write('tz-bug-symbol', utc_data)

Logging and stack trace:

INFO:root:Arctic version: 1.75.0
INFO:root:Pymongo version: 3.7.2
INFO:root:Pandas version: 0.24.2
INFO:root:NumPy version: 1.16.2
INFO:root:pytz version: 2018.7
INFO:arctic.arctic:Connecting to mongo: *REDACTED*
INFO:arctic.arctic:Mongo Quota: arctic.hennip-vs 0.000 / 10 GB used
INFO:arctic.serialization.numpy_records:Index has no name, defaulting to 'index'
INFO:arctic.serialization.numpy_records:Series has no name, defaulting to 'values'
Traceback (most recent call last):
  File "arctic_vs_tz_bug.py", line 25, in <module>
    cxn['hennip-vs'].write('tz-bug-symbol', utc_data)
  File "C:\miniconda\envs\ts-venv-prod\lib\site-packages\arctic\decorators.py", line 49, in f_retry
    return f(*args, **kwargs)
  File "C:\miniconda\envs\ts-venv-prod\lib\site-packages\arctic\store\version_store.py", line 664, in write
    self._insert_version(version)
  File "C:\miniconda\envs\ts-venv-prod\lib\site-packages\arctic\store\version_store.py", line 529, in _insert_version
    mongo_retry(self._versions.insert_one)(version)
  File "C:\miniconda\envs\ts-venv-prod\lib\site-packages\arctic\decorators.py", line 49, in f_retry
    return f(*args, **kwargs)
  File "C:\miniconda\envs\ts-venv-prod\lib\site-packages\pymongo\collection.py", line 693, in insert_one
    session=session),
  File "C:\miniconda\envs\ts-venv-prod\lib\site-packages\pymongo\collection.py", line 607, in _insert
    bypass_doc_val, session)
  File "C:\miniconda\envs\ts-venv-prod\lib\site-packages\pymongo\collection.py", line 595, in _insert_one
    acknowledged, _insert_command, session)
  File "C:\miniconda\envs\ts-venv-prod\lib\site-packages\pymongo\mongo_client.py", line 1248, in _retryable_write
    return self._retry_with_session(retryable, func, s, None)
  File "C:\miniconda\envs\ts-venv-prod\lib\site-packages\pymongo\mongo_client.py", line 1201, in _retry_with_session
    return func(session, sock_info, retryable)
  File "C:\miniconda\envs\ts-venv-prod\lib\site-packages\pymongo\collection.py", line 590, in _insert_command
    retryable_write=retryable_write)
  File "C:\miniconda\envs\ts-venv-prod\lib\site-packages\pymongo\pool.py", line 584, in command
    self._raise_connection_failure(error)
  File "C:\miniconda\envs\ts-venv-prod\lib\site-packages\pymongo\pool.py", line 745, in _raise_connection_failure
    raise error
  File "C:\miniconda\envs\ts-venv-prod\lib\site-packages\pymongo\pool.py", line 579, in command
    unacknowledged=unacknowledged)
  File "C:\miniconda\envs\ts-venv-prod\lib\site-packages\pymongo\network.py", line 114, in command
    codec_options, ctx=compression_ctx)
  File "C:\miniconda\envs\ts-venv-prod\lib\site-packages\pymongo\message.py", line 679, in _op_msg
    flags, command, identifier, docs, check_keys, opts)
bson.errors.InvalidDocument: Cannot encode object: <UTC>

PatternMatching avatar Oct 16 '19 19:10 PatternMatching

I am also having this problem. Note that the unit test referred to in #727 does not test UTC; it only covers "America/Chicago". The unit test code is reproduced below for quick reference:

def test_save_read_pandas_series_with_datetimeindex_with_timezone(library):
    df = Series(data=['A', 'BC', 'DEF'],
                index=DatetimeIndex(np.array([dt(2013, 1, 1),
                                              dt(2013, 1, 2),
                                              dt(2013, 1, 3)]).astype('datetime64[ns]'),
                                    tz="America/Chicago"))
    library.write('pandas', df)
    saved_df = library.read('pandas').data
    assert df.index.tz == saved_df.index.tz
    assert all(df.index == saved_df.index)

The error I am getting is similar to the one above. It occurs when using:

data.index = pd.to_datetime(data.index, format="%Y-%m-%dT%H:%M:%S.%f", utc=True)
arctic_library.write(symbol, data, prune_previous_version=True)

bson.errors.InvalidDocument: cannot encode object: <UTC>, of type: <class 'pytz.UTC'>

derekwong9 avatar Nov 06 '19 02:11 derekwong9

@derekwong9 Do you know if there is a workaround for using a UTC datetime? I'm getting the same error:

cannot encode object: < UTC>, of type: <class 'pytz.UTC'>

bastulli avatar Feb 12 '20 13:02 bastulli

I just convert to UTC, strip the timezone, and store the data. Then, when I read it back, I apply the UTC timezone again. Basically, it is a simple wrapper around the read and write functions. I don't know of an easier way.
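
For anyone landing here later, the wrapper described above could look roughly like the sketch below. It is only an illustration, not the code from this thread: the helper names (`strip_utc`, `restore_utc`, `write_utc`, `read_utc`) are made up, and `_InMemoryLibrary` is a hypothetical stand-in for a VersionStore library so the example runs without MongoDB. The only library calls it assumes are `library.write(symbol, data)` and `library.read(symbol).data`, as used in the snippets in this issue.

```python
from types import SimpleNamespace

import pandas as pd


def strip_utc(data):
    """Return a copy whose DatetimeIndex is tz-naive, with wall times in UTC."""
    out = data.copy()
    if out.index.tz is not None:
        # Convert to UTC first so non-UTC inputs are normalised, then drop the tz.
        out.index = out.index.tz_convert("UTC").tz_localize(None)
    return out


def restore_utc(data):
    """Return a copy whose tz-naive DatetimeIndex is re-labelled as UTC."""
    out = data.copy()
    if out.index.tz is None:
        out.index = out.index.tz_localize("UTC")
    return out


def write_utc(library, symbol, data, **kwargs):
    """Hypothetical wrapper: store the data with the timezone stripped."""
    return library.write(symbol, strip_utc(data), **kwargs)


def read_utc(library, symbol, **kwargs):
    """Hypothetical wrapper: read the data and put the UTC timezone back."""
    return restore_utc(library.read(symbol, **kwargs).data)


class _InMemoryLibrary:
    """Stand-in for a VersionStore library (assumption, for demonstration only)."""

    def __init__(self):
        self._items = {}

    def write(self, symbol, data, **kwargs):
        self._items[symbol] = data.copy()

    def read(self, symbol, **kwargs):
        return SimpleNamespace(data=self._items[symbol].copy())


lib = _InMemoryLibrary()
utc_idx = pd.date_range("2019-10-16", periods=6, freq="D", tz="UTC")
utc_data = pd.Series([1, 2, 3, 4, 5, 6], index=utc_idx)

write_utc(lib, "tz-bug-symbol", utc_data)
round_tripped = read_utc(lib, "tz-bug-symbol")
```

One caveat with this approach: everything comes back as UTC, so data written with another timezone (for example "America/Chicago") keeps the same instants but loses its original timezone label unless that is stored separately.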

derekwong9 avatar Feb 12 '20 14:02 derekwong9

@derekwong9 Thanks! I used:

date = pd.to_datetime(tickdict['data'][2], unit='ns')
date = date.replace(tzinfo=None)  # replace() returns a new Timestamp, so the result must be assigned

Works perfectly now!
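
A small sketch of what happens at the level of a single timestamp, in case it helps. It assumes the tick value is an integer count of nanoseconds since the epoch (the contents of `tickdict` are not shown in this thread, so the value below is made up). As far as I understand pandas, `pd.to_datetime(..., unit='ns')` gives a tz-naive Timestamp unless `utc=True` is passed, and both `tz_localize(None)` and `replace(tzinfo=None)` return a new Timestamp rather than modifying the original.

```python
import pandas as pd

# Assumed example value: 2019-10-16 00:00:00 UTC as nanoseconds since the epoch.
ns = 1571184000 * 10**9

naive = pd.to_datetime(ns, unit="ns")            # no timezone attached
aware = pd.to_datetime(ns, unit="ns", utc=True)  # tz-aware, UTC

# Both calls return a new tz-naive Timestamp; `aware` itself is unchanged.
stripped = aware.tz_localize(None)
also_stripped = aware.replace(tzinfo=None)
```

Since Timestamps are immutable, the stripped value has to be assigned to a name (or used directly) for it to have any effect.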

bastulli avatar Feb 12 '20 15:02 bastulli