Feature request: data missingness indicators

Open eroell opened this issue 11 months ago • 3 comments

Description of feature

This is quite straightforward, but it would be nice to have as an API call since it is a high-volume operation, e.g. ed.dt.missingness_indicator(edata), which by default would create a new layer named missingness_indicator, mask, missingness_mask, or similar.

eroell avatar Jan 25 '25 16:01 eroell

This should be in ehrapy, right?

Zethson avatar May 15 '25 13:05 Zethson

This function could look like this:

def missing_data_mask(edata: EHRData, layer: str | None = None, mask_values: Iterable | None = None, key_added: str = "missing_data_mask", copy: bool = False)

with, by default, a new layer being created under the name given by key_added,

and be used like this:

ehrapy.pp.missing_data_mask(edata)

where a new layer is created in edata.layers. Its name is the default value of key_added and can be changed through that argument.

By default, this could mark np.nan values as missing, and additionally set the mask array to 1 wherever a value from mask_values is encountered.
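To make the intended behavior concrete, here is a minimal sketch of the masking logic on a plain NumPy array. The helper name missing_mask_from_array is hypothetical, and the sketch assumes numeric (float-convertible) data. It does not touch EHRData or layers, which the real function would have to handle.

```python
from __future__ import annotations

from collections.abc import Iterable

import numpy as np


def missing_mask_from_array(X, mask_values: Iterable | None = None) -> np.ndarray:
    """Hypothetical helper: return 1 where X is missing, 0 elsewhere."""
    # Assumption: the data is numeric, so it can be viewed as float.
    X = np.asarray(X, dtype=float)
    # np.nan is always treated as missing.
    mask = np.isnan(X)
    # Optionally treat user-supplied sentinel values as missing too.
    if mask_values is not None:
        mask |= np.isin(X, list(mask_values))
    return mask.astype(np.int8)


X = np.array([[1.0, np.nan, -999.0], [4.0, 5.0, np.nan]])
mask = missing_mask_from_array(X, mask_values=[-999.0])
print(mask)
```

In the proposed function, an array like this would then be stored in edata.layers under key_added.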

I would include the word mask in the name, because the term is in widespread use and appears in related contexts, e.g. in numpy.
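For reference, this is roughly how the term shows up in numpy's masked-array module. The snippet only illustrates the naming precedent and is not a proposal for the implementation.

```python
import numpy as np

X = np.array([[1.0, np.nan], [3.0, 4.0]])

# masked_invalid masks NaN and inf entries.
masked = np.ma.masked_invalid(X)

# getmaskarray returns the boolean mask with the same shape as the data.
np_mask = np.ma.getmaskarray(masked)
print(np_mask)

# Reductions on the masked array skip the masked entries.
print(masked.mean())
```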

eroell avatar Sep 08 '25 06:09 eroell

Good first issue for e.g. Shenbo

eroell avatar Sep 25 '25 06:09 eroell