arkouda
Increase multidimensional testing for dataframe module
In this module, multi-dim DataFrame columns should be disabled: an error should be thrown if a multi-dim pdarray is entered. The unit tests should then verify that the error is thrown.
Where to add the check (high priority)
- DataFrame.__init__(...) : Guard every column coming from dicts, lists, pandas frames, etc.
- DataFrame.__setitem__(key, value) : Covers df["col"] = value and df.loc[:, "col"] = value.
- DataFrame.concat([...], axis=1) / classmethod or module function that builds a new DF from columns : Ensure the result’s columns are validated (ideally by routing through the constructor).
- DataFrame.append(...) : Route to the same validation as concat/constructor.
- merge(...) / DataFrame.merge(...) (the constructor path for the merged frame) : Make sure the merged output’s columns go through the same validation (again, easiest if the constructor enforces it).
- Any internal helpers that materialize new frames from column dicts
  - e.g., a _from_dict_of_columns(...) or _new_like(...) helper: add the check, or ensure they call the constructor, which enforces it.
Something like this would help:
def _ensure_1d_column(name: str, arr):
    # Accept Arkouda arrays and friends; treat non-array scalars/lists normally.
    from arkouda import pdarray, Strings, Categorical, SegArray  # adjust as needed

    if isinstance(arr, (pdarray, Strings, Categorical, SegArray)):
        # SegArray is conceptually 1-D (of segments), so allow it.
        # Strings/Categorical in Arkouda are inherently 1-D.
        if getattr(arr, "ndim", 1) != 1:
            raise ValueError(
                f"DataFrame columns must be 1-D; column '{name}' has ndim={arr.ndim}."
            )
    return arr
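
To illustrate the "route everything through the constructor" idea and the kind of unit test being asked for, here is a minimal, self-contained sketch. It does not use Arkouda: FakeArray, ToyFrame, ensure_1d_column, concat_columns and raises_value_error are all hypothetical stand-ins invented for this example, and the check is a duck-typed variant of the helper above (it only looks at an ndim attribute), so it can run without an Arkouda server.

```python
from dataclasses import dataclass


@dataclass
class FakeArray:
    """Hypothetical stand-in for an Arkouda array; only `ndim` matters here."""
    ndim: int = 1


def ensure_1d_column(name, arr):
    # Duck-typed variant of the check above: objects without `ndim`
    # (plain lists, scalars) pass through untouched.
    ndim = getattr(arr, "ndim", 1)
    if ndim != 1:
        raise ValueError(
            f"DataFrame columns must be 1-D; column '{name}' has ndim={ndim}."
        )
    return arr


class ToyFrame:
    """Toy frame, NOT Arkouda's DataFrame: every write path shares one check."""

    def __init__(self, columns=None):
        self._columns = {}
        for name, arr in (columns or {}).items():
            self[name] = arr  # the constructor reuses __setitem__

    def __setitem__(self, name, arr):
        # Validate before storing, so a rejected column never lands in the frame.
        self._columns[name] = ensure_1d_column(name, arr)

    @classmethod
    def concat_columns(cls, frames):
        merged = {}
        for frame in frames:
            merged.update(frame._columns)
        return cls(merged)  # the result is validated by the constructor


def raises_value_error(fn):
    """Return True if calling fn() raises ValueError, else False."""
    try:
        fn()
    except ValueError:
        return True
    return False
```

In a real test suite the raises_value_error helper would normally be replaced by pytest.raises(ValueError), with one test per entry point (constructor, item assignment, concat, append, merge) so that each path listed above is covered.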