redflag

Safety net for machine learning pipelines. Plays nice with sklearn and pandas.

Results 65 redflag issues

Adopt the simpler approach to dynamic versioning that I'm using in https://github.com/scienxlab/python-package-template and consider dropping `__version__` completely. Rationale: https://github.com/pypa/packaging.python.org/pull/1276#issuecomment-1646696925

maintenance

In a regression task, it's good practice to compute **interactions** and **nonlinear transformations**, e.g. via polynomial basis expansion. It should not be too hard to detect if this has been...

enhancement
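
For reference, a minimal sketch of the polynomial basis expansion mentioned in this issue, using scikit-learn's `PolynomialFeatures`. It only shows the expansion itself, not any detection logic, and the toy array is made up for illustration.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# Toy data (hypothetical): 2 samples, 3 features.
X = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])

# Degree-2 expansion adds squares and pairwise interactions;
# include_bias=False leaves out the constant column.
poly = PolynomialFeatures(degree=2, include_bias=False)
X_poly = poly.fit_transform(X)

# 3 original features + 6 second-order terms.
print(X_poly.shape)
```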

Lasso may be a better indicator of feature importance, as it tries to eliminate features. But the alpha parameter needs to be tuned, e.g. with https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LassoCV.html

enhancement
idea
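
A rough sketch of what the Lasso idea could look like, assuming standardized features and synthetic data from `make_regression`. This is not redflag's implementation; the variable names are illustrative only.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LassoCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data (assumption): 10 features, only 3 carry signal.
X, y = make_regression(n_samples=300, n_features=10, n_informative=3,
                       noise=5.0, random_state=0)

# Standardize first so coefficient magnitudes are comparable,
# then let LassoCV pick alpha by 5-fold cross-validation.
model = make_pipeline(StandardScaler(), LassoCV(cv=5, random_state=0))
model.fit(X, y)

lasso = model.named_steps["lassocv"]
importance = np.abs(lasso.coef_)          # one value per feature
selected = np.flatnonzero(importance > 0)  # features Lasso kept

print("alpha chosen by CV:", lasso.alpha_)
print("features with non-zero coefficients:", selected)
```

Features whose coefficient is driven to exactly zero are the ones Lasso considers dispensable at the chosen alpha.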

@kwinkunks Thanks a lot for creating a nice package. Here are some more detailed comments that I think could be helpful. 1. `wasserstein` could return a pandas DataFrame with appropriate...

bug

Can estimate precision using a beta distribution (see https://www.rikvoorhaar.com/validation-size/). Perhaps could also model the uncertainty on the accuracy estimate for the user, but that may be off-topic.

enhancement
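
One possible reading of the beta-distribution idea, sketched under the assumption of a uniform prior on accuracy: after `n_correct` correct predictions out of `n_total`, the accuracy is modelled as Beta(`n_correct + 1`, `n_total - n_correct + 1`). The helper name `accuracy_interval` is hypothetical, and this is not necessarily the method in the linked post.

```python
from scipy.stats import beta

def accuracy_interval(n_correct, n_total, confidence=0.95):
    """Credible interval for accuracy, assuming a uniform prior."""
    a = n_correct + 1
    b = n_total - n_correct + 1
    return beta.interval(confidence, a, b)

# Same observed accuracy (90%), very different validation set sizes.
lo_small, hi_small = accuracy_interval(45, 50)
lo_large, hi_large = accuracy_interval(900, 1000)

print(f"n=50:   [{lo_small:.3f}, {hi_small:.3f}]")
print(f"n=1000: [{lo_large:.3f}, {hi_large:.3f}]")
```

The interval narrows as the validation set grows, which is the sense in which the validation size controls the precision of the estimate.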

Would be nice to have... Here's one way https://github.com/steinwurf/versjon

documentation
good first issue

Could be another way to measure the similarity between datasets. From the `twinning` repo: https://github.com/avkl/twinning > `energy()` computes the energy distance (Székely & Rizzo, 2013) between a given dataset and...

enhancement
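
A minimal sketch of a multivariate energy distance between two datasets, using the standard definition 2 E|X - Y| - E|X - X'| - E|Y - Y'| with Euclidean distances. This is written from that definition, not taken from the `twinning` package, and the function name and test data are assumptions.

```python
import numpy as np
from scipy.spatial.distance import cdist

def energy_distance(X, Y):
    """Energy distance between two samples (rows are observations)."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    between = cdist(X, Y).mean()   # mean distance across the two sets
    within_x = cdist(X, X).mean()  # mean distance inside X
    within_y = cdist(Y, Y).mean()  # mean distance inside Y
    return 2 * between - within_x - within_y

rng = np.random.default_rng(0)
train = rng.normal(0, 1, size=(200, 3))
similar = rng.normal(0, 1, size=(200, 3))  # same distribution
shifted = rng.normal(2, 1, size=(200, 3))  # mean shifted by 2

d_similar = energy_distance(train, similar)
d_shifted = energy_distance(train, shifted)
print(d_similar, d_shifted)
```

The pairwise distance matrices make this quadratic in the number of samples, so large datasets would need subsampling.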

Can't use (say) +/- 3 standard deviations if the feature is non-Gaussian, so apply a transformation first, e.g. a Yeo-Johnson transformation; see https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.PowerTransformer.html and also #46

bug
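
A sketch of the transform-then-threshold idea from this issue, assuming a single skewed feature and a plain 3-sigma rule. The helper `outlier_mask` is hypothetical and is not part of redflag.

```python
import numpy as np
from sklearn.preprocessing import PowerTransformer

def outlier_mask(x, threshold=3.0, transform=True):
    """Flag values more than `threshold` standard deviations from the mean."""
    x = np.asarray(x, dtype=float).reshape(-1, 1)
    if transform:
        # Yeo-Johnson makes the feature more Gaussian-like;
        # standardize=True also returns zero-mean, unit-variance values.
        pt = PowerTransformer(method="yeo-johnson", standardize=True)
        z = pt.fit_transform(x)
    else:
        z = (x - x.mean()) / x.std()
    return np.abs(z.ravel()) > threshold

# Skewed (log-normal) feature: a raw 3-sigma rule flags much of the long tail.
rng = np.random.default_rng(42)
skewed = rng.lognormal(mean=0.0, sigma=1.0, size=2000)

n_raw = outlier_mask(skewed, transform=False).sum()
n_transformed = outlier_mask(skewed, transform=True).sum()
print(n_raw, n_transformed)
```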

Not filter! Need to find out what's causing them and fix it. Too much noise from warnings:

```
src/redflag/distributions.py::redflag.distributions.best_distribution
overflow encountered in divide
src/redflag/distributions.py::redflag.distributions.cv_kde
src/redflag/distributions.py::redflag.distributions.fit_kde
src/redflag/distributions.py::redflag.distributions.get_kde
src/redflag/distributions.py::redflag.distributions.is_multimodal
src/redflag/distributions.py::redflag.distributions.is_multimodal
src/redflag/distributions.py::redflag.distributions.kde_peaks
Data...
```

bug
testing

- Great Expectations
  - seems big and cumbersome
  - lesson: stay lean
- Evidently
  - "framework to evaluate, test and monitor ML models in production."
  - looks nice but quite...

documentation