health-equity-tracker
FutureWarning on `merge` in `cdc_hiv`
Describe the bug
I can't figure out how to fix the code to remove this warning:
FutureWarning: In a future version, the Index constructor will not infer numeric dtypes when passed object-dtype sequences (matching Series behavior)
Running the `pytest` command with a `-W error` flag at the end causes warnings to be treated as errors, which in turn makes the full stack trace print. Doing this produces the stack trace below, which makes me think the issue is merging the initial empty df (which has only columns) with the subsequent dfs that have both columns and rows, but I don't know.
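For anyone who wants to reproduce the "warnings as errors" behavior outside of pytest, the standard-library `warnings` module can do the same thing for a single block of code. This is only a minimal sketch: `noisy` is a hypothetical stand-in for the real call into `cdc_hiv`, not a function from this repo.

```python
import warnings


def noisy():
    # Hypothetical stand-in for code that emits a FutureWarning.
    warnings.warn("demo deprecation", FutureWarning)
    return "done"


raised = False
message = None

with warnings.catch_warnings():
    # Promote FutureWarning to an exception inside this block only,
    # similar in spirit to passing -W error on the command line.
    warnings.simplefilter("error", FutureWarning)
    try:
        noisy()
    except FutureWarning as exc:
        raised = True
        message = str(exc)

print(raised, message)
```

Because the filter is scoped by `catch_warnings()`, the rest of the test run keeps its normal warning behavior.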
Stack Trace
.venv/lib/python3.9/site-packages/datasources/cdc_hiv.py:223: in write_to_bq
alls_df = load_atlas_df_from_data_dir(geo_level, all)
.venv/lib/python3.9/site-packages/datasources/cdc_hiv.py:564: in load_atlas_df_from_data_dir
output_df = output_df.merge(df, how="outer")
.venv/lib/python3.9/site-packages/pandas/core/frame.py:9351: in merge
return merge(
.venv/lib/python3.9/site-packages/pandas/core/reshape/merge.py:122: in merge
return op.get_result()
.venv/lib/python3.9/site-packages/pandas/core/reshape/merge.py:738: in get_result
self._maybe_add_join_keys(result, left_indexer, right_indexer)
.venv/lib/python3.9/site-packages/pandas/core/reshape/merge.py:916: in _maybe_add_join_keys
key_col = Index(lvals).where(~mask_left, rvals)
.venv/lib/python3.9/site-packages/pandas/core/indexes/base.py:494: in __new__
arr = _maybe_cast_data_without_dtype(
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
subarr = array([nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan,
nan], dtype=object), cast_numeric_deprecated = True
def _maybe_cast_data_without_dtype(
subarr: np.ndarray, cast_numeric_deprecated: bool = True
) -> ArrayLike:
"""
If we have an arraylike input but no passed dtype, try to infer
a supported dtype.
Parameters
----------
subarr : np.ndarray[object]
cast_numeric_deprecated : bool, default True
Whether to issue a FutureWarning when inferring numeric dtypes.
Returns
-------
np.ndarray or ExtensionArray
"""
result = lib.maybe_convert_objects(
subarr,
convert_datetime=True,
convert_timedelta=True,
convert_period=True,
convert_interval=True,
dtype_if_all_nat=np.dtype("datetime64[ns]"),
)
if result.dtype.kind in ["i", "u", "f"]:
if not cast_numeric_deprecated:
# i.e. we started with a list, not an ndarray[object]
return result
> warnings.warn(
"In a future version, the Index constructor will not infer numeric "
"dtypes when passed object-dtype sequences (matching Series behavior)",
FutureWarning,
stacklevel=3,
)
E FutureWarning: In a future version, the Index constructor will not infer numeric dtypes when passed object-dtype sequences (matching Series behavior)
.venv/lib/python3.9/site-packages/pandas/core/indexes/base.py:7137: FutureWarning
============================================================================================== short test summary info ==============================================================================================
FAILED python/tests/datasources/test_cdc_hiv.py::test_write_to_bq_race_national - FutureWarning: In a future version, the Index constructor will not infer numeric dtypes when passed object-dtype sequences (matching Series behavior)
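The trace shows the warning being raised while `merge` rebuilds the join keys from an object-dtype array that is all `nan`, which fits the guess that the empty starting df is involved. One way to test that hypothesis might be to avoid seeding the merge with an empty frame at all, and instead collect the per-file dfs and fold them together. The sketch below is an assumption, not the actual `load_atlas_df_from_data_dir` code: `merge_frames` and the column names are made up for illustration, and I have not confirmed that this removes the warning.

```python
from functools import reduce

import pandas as pd


def merge_frames(dfs, on):
    """Outer-merge a list of DataFrames without starting from an empty frame.

    Hypothetical helper for illustration only.
    """
    # Drop empty frames so no merge ever has an all-missing key column.
    non_empty = [df for df in dfs if not df.empty]
    if not non_empty:
        # Nothing to merge: return an empty frame with just the key columns.
        return pd.DataFrame(columns=on)
    return reduce(
        lambda left, right: left.merge(right, how="outer", on=on),
        non_empty,
    )


# Made-up example data; the real Atlas files have different columns.
cases = pd.DataFrame({"geo": ["US"], "race": ["All"], "cases": [10]})
deaths = pd.DataFrame(
    {"geo": ["US", "US"], "race": ["All", "Black"], "deaths": [2, 1]}
)

merged = merge_frames([cases, deaths], on=["geo", "race"])
print(merged)
```

Passing `on` explicitly also makes the join keys visible in the code, rather than relying on `merge` to infer them from the shared column names.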