arctic
VersionStore: DataFrame with tzinfo fails to serialize with Pandas >= 0.24
Arctic Version
1.79.2
Arctic Store
VersionStore
Platform and version
pandas >= 0.24
Description of problem and/or code sample that reproduces the issue
from arctic import Arctic
import pandas as pd
a = Arctic("localhost")
a.initialize_library("vstore")
lib = a["vstore"]
# A tz-aware (UTC) index is what triggers the failure on write.
df = pd.DataFrame({"a": [1]}, index=[pd.Timestamp.utcnow()])
written = lib.write('test', df)
This works fine with pandas 0.23.2, but with pandas >= 0.24.0 the tzinfo cannot be serialized for MongoDB:
File "/Applications/PyCharm.app/Contents/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
exec(compile(contents+"\n", file, 'exec'), glob, loc)
File "/Users/meatz/tmp/arctic_timezone/bug.py", line 62, in <module>
written = lib.write('test', df)
File "/Users/meatz/miniconda3/envs/arctic_test/lib/python3.6/site-packages/arctic/decorators.py", line 49, in f_retry
return f(*args, **kwargs)
File "/Users/meatz/miniconda3/envs/arctic_test/lib/python3.6/site-packages/arctic/store/version_store.py", line 664, in write
self._insert_version(version)
File "/Users/meatz/miniconda3/envs/arctic_test/lib/python3.6/site-packages/arctic/store/version_store.py", line 529, in _insert_version
mongo_retry(self._versions.insert_one)(version)
File "/Users/meatz/miniconda3/envs/arctic_test/lib/python3.6/site-packages/arctic/decorators.py", line 49, in f_retry
return f(*args, **kwargs)
File "/Users/meatz/miniconda3/envs/arctic_test/lib/python3.6/site-packages/pymongo/collection.py", line 700, in insert_one
session=session),
File "/Users/meatz/miniconda3/envs/arctic_test/lib/python3.6/site-packages/pymongo/collection.py", line 614, in _insert
bypass_doc_val, session)
File "/Users/meatz/miniconda3/envs/arctic_test/lib/python3.6/site-packages/pymongo/collection.py", line 602, in _insert_one
acknowledged, _insert_command, session)
File "/Users/meatz/miniconda3/envs/arctic_test/lib/python3.6/site-packages/pymongo/mongo_client.py", line 1280, in _retryable_write
return self._retry_with_session(retryable, func, s, None)
File "/Users/meatz/miniconda3/envs/arctic_test/lib/python3.6/site-packages/pymongo/mongo_client.py", line 1233, in _retry_with_session
return func(session, sock_info, retryable)
File "/Users/meatz/miniconda3/envs/arctic_test/lib/python3.6/site-packages/pymongo/collection.py", line 597, in _insert_command
retryable_write=retryable_write)
File "/Users/meatz/miniconda3/envs/arctic_test/lib/python3.6/site-packages/pymongo/pool.py", line 589, in command
self._raise_connection_failure(error)
File "/Users/meatz/miniconda3/envs/arctic_test/lib/python3.6/site-packages/pymongo/pool.py", line 750, in _raise_connection_failure
raise error
File "/Users/meatz/miniconda3/envs/arctic_test/lib/python3.6/site-packages/pymongo/pool.py", line 584, in command
user_fields=user_fields)
File "/Users/meatz/miniconda3/envs/arctic_test/lib/python3.6/site-packages/pymongo/network.py", line 121, in command
codec_options, ctx=compression_ctx)
File "/Users/meatz/miniconda3/envs/arctic_test/lib/python3.6/site-packages/pymongo/message.py", line 678, in _op_msg
flags, command, identifier, docs, check_keys, opts)
bson.errors.InvalidDocument: cannot encode object: <UTC>, of type: <class 'pytz.UTC'>
Process finished with exit code 1
Is there any intention to add the necessary support for pandas 0.25.0?
The cause is the private pandas function get_timezone, whose behaviour changed in pandas 0.24.0. Arctic calls it at https://github.com/man-group/arctic/blob/master/arctic/serialization/numpy_records.py#L84, and it has been invoked this way since the initial commit.
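import pandas as pd
from pandas._libs.tslibs.timezones import get_timezone  # private API, may change between releases
# Per the traceback above, with pandas >= 0.24 the result for a UTC index is the
# tzinfo object <UTC>, which BSON cannot encode.
print(repr(get_timezone(pd.date_range(freq="M", periods=10, start='2020-01-01').tz_localize('UTC').tz)))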
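One possible way around this, sketched here as an assumption and not as Arctic's actual fix, is to convert whatever timezone value comes back into a plain string before it goes into the version document. The helper name tz_name is hypothetical. It uses only the standard library and relies on the fact that pytz zones expose a .zone attribute and zoneinfo zones expose .key.

```python
import datetime


def tz_name(tz):
    """Return a BSON-friendly string for a timezone value (hypothetical helper).

    Accepts None, a string, or a tzinfo object and never returns a tzinfo.
    """
    if tz is None:
        return None
    if isinstance(tz, str):
        # Older pandas versions already hand back a name; pass it through.
        return tz
    # pytz zones carry the name in .zone, zoneinfo zones in .key.
    for attr in ("zone", "key"):
        name = getattr(tz, attr, None)
        if isinstance(name, str):
            return name
    # Fall back to the string form, e.g. 'UTC' for datetime.timezone.utc.
    return str(tz)


if __name__ == "__main__":
    print(tz_name(datetime.timezone.utc))
    print(tz_name("Europe/London"))
```

Wrapping the result of get_timezone in a helper like this would keep the stored metadata a string regardless of the pandas version, though whether the read path can rebuild the index from that string would need to be checked against Arctic's deserialization code.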