ArcticDB
ArcticDB is a high-performance, serverless DataFrame database built for the Python Data Science ecosystem.
**Describe the bug** If the staged segment contains only empty DataFrames (`pd.DataFrame([])`), calling `sort_and_finalize_staged_data` throws. **Steps to reproduce** ```python import pandas as pd import numpy as np import arcticdb...
**Describe the bug** If nothing is added to the staged segment and `sort_and_finalize_staged_data` is called, an exception is thrown. **Steps to reproduce** ```python import pandas as pd import numpy as np import...
**Describe the bug** When `sort_and_finalize_staged_data` is used with `mode=StagedDataFinalizeMethod.APPEND`, it allows appending data whose index leaves the overall index out of order. For example it's possible for the appended...
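The ordering problem described in the report above can be illustrated without ArcticDB at all. The following is a minimal, library-free sketch: the index is modelled as a plain list of `datetime` values, and `append_keeps_order` is a hypothetical helper written for this illustration, not part of the ArcticDB API. It only shows the kind of check that would reject an append starting before the existing data ends.

```python
from datetime import datetime

# Existing index, already sorted (stands in for data stored earlier).
existing_index = [datetime(2024, 1, 3), datetime(2024, 1, 4)]

# Index of the chunk being appended: sorted on its own, but it starts
# before the existing index ends.
appended_index = [datetime(2024, 1, 1), datetime(2024, 1, 2)]


def is_sorted(index):
    """Return True if the timestamps are in non-decreasing order."""
    return all(a <= b for a, b in zip(index, index[1:]))


def append_keeps_order(existing, appended):
    """Hypothetical guard: True if appending keeps the combined index sorted."""
    if not existing or not appended:
        return is_sorted(existing) and is_sorted(appended)
    return (
        is_sorted(existing)
        and is_sorted(appended)
        and appended[0] >= existing[-1]
    )


# A naive append simply concatenates, which breaks the ordering here.
combined_index = existing_index + appended_index
print(is_sorted(combined_index))
print(append_keeps_order(existing_index, appended_index))
```

Each chunk is sorted individually, so a per-chunk check is not enough; the first appended timestamp also has to be compared against the last existing one.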
**Is your feature request related to a problem? Please describe.** Currently `StreamId`s can be quite large strings and in various places around the code we need to worry about performance...
The target of this ticket is to keep data readable even if bad metadata has been written 1. Make `strict_mode` useful [again](https://github.com/man-group/ArcticDB/blob/b9c75db258672a9fbc09db3f433f733f97c4acca/python/arcticdb/version_store/_normalization.py#L925) to avoid msgpack data being pickled silently...
WIP In the case where we don't need snapshot information or deleted versions, we can get the latest version information more quickly from the new-style symbol list
Currently `list_versions` iterates over the versions of each symbol one by one. Although each iteration may take only milliseconds, the total can be long when there are more than 100k target symbols.
Currently segments are read from storage in a specific order based on the first clause in the pipeline, and subsequent clauses will only work if the output format from the...
#### Reference Issues/PRs https://github.com/man-group/ArcticDB/issues/1200 #### What does this implement or fix? Adds support for the Azure AD default credential by binding `DefaultAzureCredential` from the Azure C++ SDK to the Python layer...
⚠️ API change ⚠️ Fixes #1641 Also changes `write_pickle_batch` to return `DataError` objects instead of throwing exceptions, and updates the return type and docstring accordingly.
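The general pattern behind this change, a batch call that reports per-item failures in its return value rather than raising, can be sketched as follows. This is an illustration only: `ItemError`, `run_batch` and `flaky_write` are hypothetical names invented here and are not ArcticDB's `DataError` or `write_pickle_batch`.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Union


@dataclass
class ItemError:
    """Hypothetical stand-in for a per-item error object."""
    symbol: str
    exception: Exception


def run_batch(
    symbols: List[str], write_one: Callable[[str], Any]
) -> List[Union[Any, ItemError]]:
    """Apply write_one to every symbol, collecting failures instead of raising."""
    results: List[Union[Any, ItemError]] = []
    for symbol in symbols:
        try:
            results.append(write_one(symbol))
        except Exception as exc:
            # Record the failure in the slot where the result would have gone.
            results.append(ItemError(symbol, exc))
    return results


def flaky_write(symbol: str) -> str:
    """Toy writer that fails for one symbol."""
    if symbol == "bad":
        raise ValueError("cannot write " + symbol)
    return "wrote " + symbol


results = run_batch(["a", "bad", "b"], flaky_write)
failed = [r.symbol for r in results if isinstance(r, ItemError)]
```

With this shape, one failing item no longer aborts the whole batch, but callers have to inspect each returned element, which is why a change like this alters the API contract.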