ArcticDB
Allow unordered indexes in staged area when `sort_and_finalize_staged_data` is used
Is your feature request related to a problem? Please describe.
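Currently, sort_and_finalize_staged_data requires the indexes in all staged segments to be sorted; otherwise an exception is thrown. In the example below, the exception is raised as soon as the unsorted data is staged with write(..., staged=True).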
import pandas as pd
import numpy as np
import arcticdb as adb
ac = adb.Arctic("lmdb://test")
lib = ac.get_library("test", create_if_missing=True)
dates = [np.datetime64('2023-01-03'), np.datetime64('2023-01-01'), np.datetime64('2023-01-05')]
df = pd.DataFrame({"col": [2, 1, 3]}, index=dates)
lib.write("sym", df, staged=True)
lib.sort_and_finalize_staged_data("sym")
Output:
Traceback (most recent call last):
File "...\test.py", line 9, in <module>
lib.write("sym", df, staged=True)
File "...\arcticdb\version_store\library.py", line 461, in write
return self._nvs.write(
File "...\arcticdb\version_store\_store.py", line 583, in write
self.version_store.write_parallel(symbol, item, norm_meta, udm)
arcticdb_ext.exceptions.UnsortedDataException: E_UNSORTED_DATA When writing/appending staged data in parallel, input data must be sorted.
Describe the solution you'd like
Allow unordered indexes in staged segments and sort them when sort_and_finalize_staged_data is called.
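Until something like this is supported, one possible workaround is to sort each chunk by its index on the client side before staging it. The snippet below is only a sketch of that idea using plain pandas; the helper name `sort_chunk_for_staging` is hypothetical and not part of ArcticDB, and the ArcticDB calls are left as comments because they assume the `lib` handle from the reproduction above.

```python
import numpy as np
import pandas as pd


def sort_chunk_for_staging(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df ordered by its index.

    Hypothetical helper (not an ArcticDB API). A stable sort is used so
    rows that share the same timestamp keep their original relative order.
    """
    return df.sort_index(kind="mergesort")


# Same unsorted frame as in the reproduction above.
dates = [np.datetime64("2023-01-03"), np.datetime64("2023-01-01"), np.datetime64("2023-01-05")]
df = pd.DataFrame({"col": [2, 1, 3]}, index=dates)

sorted_df = sort_chunk_for_staging(df)

# Assuming `lib` is the library handle from the reproduction, the sorted
# chunk would then be staged and finalized the same way as before:
# lib.write("sym", sorted_df, staged=True)
# lib.sort_and_finalize_staged_data("sym")
```

This puts the sorting cost on every writer, which is exactly what the requested change would avoid, so it is a stopgap rather than a substitute for the feature.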