Feature request: data missingness indicators
Description of feature
Quite straightforward, but nice to have as an API call since this is a high-volume operation, e.g.
ed.dt.missingness_indicator(edata), which by default would create a new layer named e.g. missingness_indicator, mask, or missingness_mask.
This should be in ehrapy, right?
This function could look like this:
def missing_data_mask(edata: EHRData, layer: str | None = None, mask_values: Iterable | None = None, key_added: str = "missing_data_mask", copy: bool = False)
with e.g. a default layer being created called missing_data_mask,
and be used like this:
ehrapy.pp.missing_data_indicator(edata)
where a new layer is created in edata.layers, which could be called missing_data_mask by default but can be changed via the key_added argument.
By default, this could mask np.nan values, and additionally set the mask array to 1 wherever a value from mask_values is encountered.
I would include the word mask, because the term is in widespread use and appears in related contexts in e.g. numpy.
Good first issue for e.g. Shenbo