fix: Drop NaNs only in used columns
- Select the columns used in CatTargetDriftAnalyzer, then filter out NaNs and infinities.
- Stop replacing and dropping these values in place.
- Check that columns are not empty and raise a more informative error if so.
Closes #241
Signed-off-by: Daniel J. Morales Velásquez [email protected]
Workflow log before creating the PR: https://github.com/danieljmv01/evidently/actions/runs/2411394304
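For readers skimming the description, here is a minimal sketch of the three changes listed above, assuming plain pandas. The helper name `select_valid_rows` and its error message are hypothetical; this is not the code from the PR or evidently's API.

```python
from typing import List

import numpy as np
import pandas as pd


def select_valid_rows(df: pd.DataFrame, columns: List[str]) -> pd.DataFrame:
    """Return the rows of df[columns] that contain no NaN or +/-inf.

    Hypothetical helper for illustration only.
    """
    # Look only at the columns the analyzer uses, so NaNs elsewhere
    # do not cause otherwise valid rows to be dropped.
    subset = df[columns]
    # replace() and dropna() return new objects here; the caller's
    # DataFrame is never modified in place.
    cleaned = subset.replace([np.inf, -np.inf], np.nan).dropna(axis=0, how="any")
    if cleaned.empty:
        raise ValueError(
            f"No valid rows left in columns {columns} after dropping NaNs and infinities."
        )
    return cleaned
```

With a frame that has an all-NaN column the analyzer does not use, calling `select_valid_rows(df, ["target", "prediction"])` still keeps every row where both used columns are valid.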
Rebased and updated _remove_nans_and_infinities to keep only the valid values without replacing any data, just in case:
https://github.com/evidentlyai/evidently/blob/fa546f78877b4242a4d5acf482951af9095e9718/src/evidently/pipeline/pipeline.py#L38-L40
Log: https://github.com/danieljmv01/evidently/actions/runs/2415156752
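A rough sketch of what "keep only the valid values without replacing any data" can look like with a boolean mask, assuming plain pandas. The function name `keep_valid_rows` is hypothetical and this is not the body of `_remove_nans_and_infinities` from the PR.

```python
from typing import List

import numpy as np
import pandas as pd


def keep_valid_rows(df: pd.DataFrame, columns: List[str]) -> pd.DataFrame:
    """Select rows whose values in `columns` are neither NaN nor +/-inf.

    Hypothetical helper for illustration only.
    """
    used = df[columns]
    # Build a row mask instead of rewriting values: a row is valid when
    # every used column is non-null and not an infinity.
    is_valid = used.notna() & ~used.isin([np.inf, -np.inf])
    row_mask = is_valid.all(axis=1)
    # Indexing with the mask returns a new frame; `df` keeps its
    # original NaN and inf values.
    return df.loc[row_mask, columns]
```

Because the mask is built with `notna()` and `isin()` rather than `np.isfinite`, it also works when the target column holds string categories.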
Hi @danieljmv01, first of all we want to thank you for your pull request. You brought up a very important topic and significantly raised the priority of filtering NA values.
We have carefully reviewed your changes (we really like them!) and finally decided to go with the same filtering logic, but with an alternative technical implementation that does not use masks. This was mainly dictated by the future changes we are aiming to make and by ease of support.