Using sort_and_finalize_staged_data with append can create unordered index

Open vasil-pashov opened this issue 6 months ago • 0 comments

Describe the bug

When sort_and_finalize_staged_data is used with mode=StagedDataFinalizeMethod.APPEND, it allows appending data whose index leaves the overall index out of order. For example, the appended index values can be less than the last index value already in storage.

Steps to reproduce

import pandas as pd
import numpy as np
import arcticdb as adb
from arcticdb.version_store.library import StagedDataFinalizeMethod

ac = adb.Arctic("lmdb://test")
lib = ac.get_library("test", create_if_missing=True)

# Write an initial symbol whose index ends at 2023-01-03.
initial_df = pd.DataFrame({"col": [1, 3]}, index=pd.DatetimeIndex([np.datetime64('2023-01-01'), np.datetime64('2023-01-03')], dtype="datetime64[ns]"))
lib.write("sym", initial_df)

# Stage a row whose index value (2023-01-02) is earlier than the last stored one.
df1 = pd.DataFrame({"col": [2]}, index=pd.DatetimeIndex([np.datetime64('2023-01-02')], dtype="datetime64[ns]"))
lib.write("sym", df1, staged=True)

# Finalizing in APPEND mode succeeds instead of raising.
lib.sort_and_finalize_staged_data("sym", mode=StagedDataFinalizeMethod.APPEND)
print(lib.read("sym").data)

Output:

            col
2023-01-01    1
2023-01-03    3
2023-01-02    2

Expected behavior

It should behave just like Library.append and throw an exception.
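To make the expected check concrete, here is a minimal sketch of the ordering rule in plain Python. The helper name check_append_keeps_order is hypothetical and not part of ArcticDB. Raising ValueError and allowing an equal boundary timestamp are both assumptions, since this report does not say which exception Library.append throws or how it treats ties.

```python
from datetime import datetime
from typing import Sequence


def check_append_keeps_order(
    existing_index: Sequence[datetime], new_index: Sequence[datetime]
) -> None:
    """Raise if appending new_index after existing_index would break ordering.

    Hypothetical helper (not an ArcticDB API). existing_index is assumed to be
    sorted already; new_index may be unsorted because the staged data gets
    sorted before it is appended.
    """
    if len(existing_index) == 0 or len(new_index) == 0:
        return  # nothing to compare against

    last_existing = existing_index[-1]
    first_new = min(new_index)  # first value once the staged data is sorted

    # Assumption: an equal timestamp at the boundary is allowed.
    if first_new < last_existing:
        raise ValueError(
            f"Cannot append: first new index value {first_new} is before "
            f"the last stored index value {last_existing}"
        )


if __name__ == "__main__":
    stored = [datetime(2023, 1, 1), datetime(2023, 1, 3)]
    staged = [datetime(2023, 1, 2)]  # same dates as in the reproduction
    try:
        check_append_keeps_order(stored, staged)
    except ValueError as err:
        print(f"rejected: {err}")
```

With the dates from the reproduction, the staged value 2023-01-02 falls before the last stored value 2023-01-03, so this sketch rejects the append rather than producing an unordered index.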

OS, Python, Arctic versions

Python: 3.10.11 (tags/v3.10.11:7d4cc5a, Apr  5 2023, 00:38:17) [MSC v.1929 64 bit (AMD64)]
OS: Windows-10-10.0.22631-SP0
ArcticDB: 4.5.0rc1

vasil-pashov, Aug 01 '24 08:08