ArcticDB
Using sort_and_finalize_staged_data with append can create unordered index
Describe the bug
When sort_and_finalize_staged_data is used with mode=StagedDataFinalizeMethod.APPEND, it allows appending staged data whose index values are less than the last index value already in storage, so the overall index of the symbol ends up out of order.
Steps to reproduce
import pandas as pd
import numpy as np
import arcticdb as adb
from arcticdb.version_store.library import StagedDataFinalizeMethod
ac = adb.Arctic("lmdb://test")
lib = ac.get_library("test", create_if_missing=True)
initial_df = pd.DataFrame({"col": [1, 3]}, index=pd.DatetimeIndex([np.datetime64('2023-01-01'), np.datetime64('2023-01-03')], dtype="datetime64[ns]"))
lib.write("sym", initial_df)
df1 = pd.DataFrame({"col": [2]}, index=pd.DatetimeIndex([np.datetime64('2023-01-02')], dtype="datetime64[ns]"))
lib.write("sym", df1, staged=True)
lib.sort_and_finalize_staged_data("sym", mode=StagedDataFinalizeMethod.APPEND)
print(lib.read("sym").data)
Output:
            col
2023-01-01    1
2023-01-03    3
2023-01-02    2
Expected behavior
It should behave just like Library.append and raise an exception when the staged index values precede the last index value already stored.
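To make the expected check concrete, here is a minimal sketch of the ordering guard, written with the standard library only. The helper name check_append_order is hypothetical and is not part of ArcticDB; the choice to allow a new value equal to the last stored value is also an assumption. It only illustrates the rule described above, using the dates from the reproduction.

```python
from datetime import date


def check_append_order(existing_index, new_index):
    """Hypothetical guard: reject an append that would break index ordering.

    Not ArcticDB code. Assumes existing_index is already sorted ascending.
    """
    if not existing_index or not new_index:
        return  # nothing to compare against
    # Staged data is sorted before finalizing, so only its smallest value matters.
    first_new = min(new_index)
    last_stored = existing_index[-1]
    # Assumption: equal values are allowed, strictly earlier values are not.
    if first_new < last_stored:
        raise ValueError(
            f"append would create an unordered index: first new value "
            f"{first_new} is before last stored value {last_stored}"
        )


stored = [date(2023, 1, 1), date(2023, 1, 3)]  # index of initial_df
staged = [date(2023, 1, 2)]                    # index of df1

try:
    check_append_order(stored, staged)
    rejected = False
except ValueError as exc:
    rejected = True
    print(exc)
```

With the dates from the reproduction, the staged value 2023-01-02 precedes the stored 2023-01-03, so the guard raises instead of producing the unordered result shown in the output.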
OS, Python, Arctic versions
Python: 3.10.11 (tags/v3.10.11:7d4cc5a, Apr 5 2023, 00:38:17) [MSC v.1929 64 bit (AMD64)]
OS: Windows-10-10.0.22631-SP0
ArcticDB: 4.5.0rc1