scikit-matter
From the docs it is not clear that sample selection works analogously to feature selection
The examples even have a section titled Feature and Sample Selection, but it contains no example notebook.
https://scikit-matter.readthedocs.io/en/latest/tutorials.html
I'm not sure exactly what you mean by this. The API reference for Feature and Sample Selection states:
"scikit-matter contains multiple data sub-selection modules, primarily corresponding to methods derived from CUR matrix decomposition and Farthest Point Sampling. In their classical form, CUR and FPS determine a data subset that maximizes the variance (CUR) or distribution (FPS) of the features or samples. These methods can be modified to combine supervised and unsupervised learning, in a formulation denoted PCov-CUR and PCov-FPS. For further reading, refer to [Imbalzano2018] and [Cersonsky2021].
These selectors can be used for both feature and sample selection, with similar instantiations. Currently, all sub-selection methods extend GreedySelector, where at each iteration the model scores each feature or sample (without an estimator) and chooses that with the maximum score."
https://scikit-matter.readthedocs.io/en/latest/selection.html
this is the current tutorials page
I agree that it is written in the API reference, but we had a user who wasn't sure from the examples how to use sample selection. So we can improve this by changing an existing example or adding a new one.