Prerequisites to use DateRange to read a subset of data?
Arctic Version
1.79.3
Arctic Store
Version Store
Platform and version
macOS Catalina 10.15.6 (19G73)
Description of problem and/or code sample that reproduces the issue
What are the prerequisites to use DateRange to read a subset of data? From the documentation, it is clear that the data must have a datetime index (MultiIndex is supported). After some experimenting and reading the source code, I think there are other requirements, such as:
- The datetime index should be sorted.
- ~~The start and end of DateRange should be present in the datetime index or None.~~ This is not true. A KeyError only occurs when the datetime index is not sorted.
Could you confirm that my understanding is correct?
Hi @qiuwei, confirmed that is the case: with VersionStore, if you store an unsorted DataFrame, reading a DateRange subset does not work. Have you considered using ChunkStore instead? It sorts your data when you call ChunkStore.write, so chunk_range should always find the correct subset of the data.
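import pandas as pd
from arctic import CHUNK_STORE, Arctic
# Connect to a local MongoDB instance and create a ChunkStore library
dev = Arctic(mongo_host='localhost')
dev.initialize_library('chunkstore', lib_type=CHUNK_STORE)
lib = dev['chunkstore']
# Write a DataFrame whose 'date' index is deliberately out of order
df = pd.DataFrame({'date': [pd.Timestamp('20220131'), pd.Timestamp('20220120')], 'values': [1, 2]}).set_index(['date'])
lib.write('test_df', df)
# Reading it back returns the rows sorted by date
lib.read('test_df')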
Out[56]:
            values
date
2022-01-20       2
2022-01-31       1
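
If you would rather stay on VersionStore, one workaround that follows from the above is to sort the index yourself before writing. The sketch below only shows the pandas side and is not taken from the Arctic codebase: `sort_for_date_range` is a hypothetical helper name, the `'versionstore'` library name is made up, and the commented-out Arctic calls assume a running MongoDB and an already initialized VersionStore library.

```python
import pandas as pd


def sort_for_date_range(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df sorted by its index (hypothetical helper, not part of Arctic)."""
    return df.sort_index()


# Same deliberately unsorted frame as in the ChunkStore example
df = pd.DataFrame(
    {'date': [pd.Timestamp('20220131'), pd.Timestamp('20220120')], 'values': [1, 2]}
).set_index(['date'])

sorted_df = sort_for_date_range(df)

# On a sorted index, range bounds that are absent from the index still slice cleanly
subset = sorted_df.loc[pd.Timestamp('20220115'):pd.Timestamp('20220125')]

# Assumed VersionStore usage (needs MongoDB, so it is left commented out):
# from arctic import Arctic
# from arctic.date import DateRange
# lib = Arctic('localhost')['versionstore']  # hypothetical library name
# lib.write('test_df', sorted_df)
# lib.read('test_df', date_range=DateRange(pd.Timestamp('20220115'), pd.Timestamp('20220125'))).data
```

Checking `df.index.is_monotonic_increasing` before each write is a cheap way to catch unsorted data early.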