Pierre-Paul De Breuck
The code at https://github.com/ppdebreuck/modnet/blob/a5ab7c66e7c8a35f5bf0fe455945ed513f2e1f11/modnet/hyper_opt/fit_genetic.py#L94, introduced in https://github.com/ppdebreuck/modnet/pull/198, will be problematic in Python 3.11. Solution: add `sorted()`.
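The issue does not say what the referenced line does. One plausible reading, assumed here, is that it passes a set to `random.sample`, which Python 3.11 rejects with a `TypeError` (sets were deprecated as a population in 3.9). A minimal sketch of the `sorted()` fix under that assumption; `candidates` and `sample_reproducibly` are hypothetical names, not MODNet code:

```python
import random

# Hypothetical stand-in for the collection being sampled; assumed to be a set.
candidates = {"n_neurons", "lr", "batch_size", "act", "loss"}


def sample_reproducibly(population, k, seed):
    """Sample k items from a set in a way that also works on Python 3.11+."""
    rng = random.Random(seed)
    # sorted() converts the set to a list with a deterministic order, so the
    # call is valid on 3.11+ and the result is reproducible for a given seed.
    return rng.sample(sorted(population), k)


first = sample_reproducibly(candidates, 2, seed=0)
second = sample_reproducibly(candidates, 2, seed=0)
```

Sorting rather than calling `list()` matters for reproducibility: the iteration order of a set of strings can change between interpreter runs, so a fixed seed alone would not guarantee the same sample.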
If some of the optimal features are missing when featurizing novel compounds (at the prediction stage), `predict()` will not work. This happens, for instance, when some elements are not present...
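One possible workaround, sketched here with plain pandas rather than any MODNet API, is to align the newly featurized DataFrame to the list of optimal features before predicting. The helper `align_features`, the column names, and the choice of `0.0` as a fill value are all assumptions for illustration; the appropriate fill value depends on how the model was trained.

```python
import pandas as pd


def align_features(df, optimal_features, fill_value=0.0):
    """Return df restricted to optimal_features, adding any missing columns.

    Missing columns are filled with fill_value (an assumed default), and the
    names of those columns are returned so the caller can log or inspect them.
    """
    missing = [f for f in optimal_features if f not in df.columns]
    # reindex keeps the requested column order, drops extra columns, and
    # creates the missing ones filled with fill_value.
    aligned = df.reindex(columns=optimal_features, fill_value=fill_value)
    return aligned, missing


# Hypothetical data: "feat_b" was selected at training time but is absent
# from the new compounds' features.
optimal = ["feat_a", "feat_b", "feat_c"]
new_data = pd.DataFrame(
    {"feat_a": [1.0, 2.0], "feat_c": [0.5, 0.7], "extra": [9, 9]}
)
aligned, missing = align_features(new_data, optimal)
```

Returning the list of missing columns alongside the aligned frame makes it easy to warn the user instead of silently imputing values.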
See https://github.com/ppdebreuck/modnet/blob/1fbf7b2f45aee5970ee3f5c6fd88461faefdbc65/modnet/preprocessing.py#L319. Not used at this stage, but may be integrated to get the RR score. _Reported by @gbrunin_
The [feature selection](https://github.com/ppdebreuck/modnet/blob/1fbf7b2f45aee5970ee3f5c6fd88461faefdbc65/modnet/preprocessing.py#L770) procedure relies on `mutual_info_regression` and `mutual_info_classif`, which are stochastic. Therefore, feature selection is nondeterministic (the ranked list of optimal descriptors can change slightly from run to run)....
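For reference, scikit-learn's `mutual_info_regression` and `mutual_info_classif` both accept a `random_state` argument, and fixing it makes the estimates repeatable. The sketch below uses synthetic data and calls scikit-learn directly; whether and how MODNet forwards a seed to these functions is not stated in this issue and is not assumed here.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_regression

# Synthetic data for illustration only: the target depends on columns 0 and 1.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = X[:, 0] + 0.5 * X[:, 1] ** 2 + 0.1 * rng.normal(size=200)

# With the same random_state, two calls give the same mutual information
# estimates, and therefore the same feature ranking.
mi_a = mutual_info_regression(X, y, random_state=42)
mi_b = mutual_info_regression(X, y, random_state=42)

# Feature indices ordered from highest to lowest estimated mutual information.
ranking = np.argsort(mi_a)[::-1]
```

Leaving `random_state=None` (the default) reproduces the behaviour described above, where the estimates, and hence the ranking of weakly informative features, can differ between runs.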
The sklearn API implementation of MODNet can be found under `modnet.sklearn`. It enables integration with scikit-learn methods such as pipelines, model selection functions (e.g. grid search), and integration with other sklearn...