formulaic

Add narwhals materializer for dataframe agnosticism.

Open matthewwardrop opened this issue 6 months ago • 4 comments

This patch adds initial support for narwhals. It's... patchy.

You can try it out using:

import narwhals as nw
import pandas as pd
from formulaic import model_matrix

df = pd.DataFrame({"a": [1,2,None], "b": list("abc"), "c": pd.Categorical(list("abc"))})
model_matrix("a + b + c", df, materializer='narwhals', na_action="ignore")


import polars
model_matrix("a + b + c", polars.from_pandas(df), materializer='narwhals', na_action="ignore")


Note: The polars backend panics when na_action is set to anything other than "ignore".
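
For anyone unfamiliar with the option, here is a rough pure-Python sketch of what the three na_action modes ("drop", "raise", "ignore") mean as I understand them. `apply_na_action` is a hypothetical helper written only for illustration. It is not part of formulaic or of this patch, and it works on plain dicts rather than on any dataframe backend.

```python
from typing import Any, Dict, List


def apply_na_action(rows: List[Dict[str, Any]], na_action: str = "drop") -> List[Dict[str, Any]]:
    """Hypothetical illustration of null handling; not formulaic's implementation."""
    if na_action == "ignore":
        # Nulls pass through untouched and end up in the model matrix.
        return list(rows)
    has_null = [any(value is None for value in row.values()) for row in rows]
    if na_action == "raise":
        if any(has_null):
            raise ValueError("null values encountered")
        return list(rows)
    if na_action == "drop":
        # Keep only rows in which every value is present.
        return [row for row, null in zip(rows, has_null) if not null]
    raise ValueError(f"unknown na_action: {na_action!r}")


rows = [
    {"a": 1, "b": "a"},
    {"a": 2, "b": "b"},
    {"a": None, "b": "c"},
]
kept = apply_na_action(rows, "drop")       # the row with the null is removed
ignored = apply_na_action(rows, "ignore")  # all three rows are retained
```

With "ignore" no rows need to be filtered, which is presumably why it is the only mode that currently avoids the polars panic.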

There are a lot of hacks here, including fallbacks to pandas objects in places. We also still want sparse materialisation to work, and I don't think other backends support sparse data well enough to replace scipy sparse matrices.
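
To make the sparse point concrete, here is a minimal pure-Python sketch of how a dummy-encoded categorical column maps onto compressed sparse column (CSC) storage. `one_hot_csc` is a hypothetical name used only for this illustration and is not how the materializer is implemented. Dropping the first level is an assumption that mirrors treatment coding when an intercept is present.

```python
from typing import List, Sequence, Tuple


def one_hot_csc(
    values: Sequence[str], drop_first: bool = True
) -> Tuple[List[str], List[int], List[int], List[int]]:
    """Hypothetical sketch: encode a categorical as CSC (data, indices, indptr)."""
    levels = sorted(set(values))
    kept = levels[1:] if drop_first else levels
    data: List[int] = []
    indices: List[int] = []  # row index of each stored non-zero
    indptr: List[int] = [0]  # offsets marking where each column starts and ends
    for level in kept:
        for row, value in enumerate(values):
            if value == level:
                data.append(1)
                indices.append(row)
        indptr.append(len(indices))
    return kept, data, indices, indptr


levels, data, indices, indptr = one_hot_csc(list("abcab"))
# levels  == ["b", "c"]
# data    == [1, 1, 1]
# indices == [1, 4, 2]
# indptr  == [0, 2, 3]
```

These three arrays are the form that `scipy.sparse.csc_matrix((data, indices, indptr), shape=(n_rows, n_cols))` accepts. Only the non-zero entries are stored, which is what keeps high-cardinality categoricals cheap.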

matthewwardrop • Aug 06 '24 23:08