Pedro Ribeiro

Results 53 issues of Pedro Ribeiro

Currently, successive halving reduces computation by subsampling rows. Every time the budget increases, the existing parent population is re-evaluated at the next budget, which may be inefficient....

enhancement

**Regressors:** We include ElasticNetCV, LassoLarsCV, SGDRegressor, and RidgeCV. These are all essentially linear regression models. Let's remove all but SGDRegressor (or maybe ElasticNetCV). SGDRegressor can also potentially overlap with SVR. Or...

enhancement

get_pareto_frontier is faster and only computes the best Pareto front. (get_pareto_front computes all the fronts, not just the best one.)

enhancement
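To illustrate the difference described in the Pareto-front issue above, here is a minimal sketch in plain Python. The names `dominates`, `first_front`, and `all_fronts` are hypothetical and are not TPOT2's functions, and the sketch assumes every objective is minimized. Computing only the best front needs one pass over the population, while computing every front repeats that pass on whatever remains.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b everywhere and strictly better somewhere.

    Assumption: all objectives are minimized.
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def first_front(scores: List[Sequence[float]]) -> List[int]:
    """Indices of the non-dominated points only (the best front)."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


def all_fronts(scores: List[Sequence[float]]) -> List[List[int]]:
    """Peel off successive non-dominated fronts until no points remain."""
    remaining = list(range(len(scores)))
    fronts = []
    while remaining:
        front = [
            i
            for i in remaining
            if not any(dominates(scores[j], scores[i]) for j in remaining if j != i)
        ]
        fronts.append(front)
        remaining = [i for i in remaining if i not in front]
    return fronts


scores = [(1, 5), (2, 2), (5, 1), (3, 3), (6, 6)]
best = first_front(scores)
fronts = all_fronts(scores)
```

If only the best front is needed, the peeling loop in `all_fronts` is unnecessary work, which is consistent with the speed difference the issue mentions.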

During the evolutionary algorithm, TPOT2 will repeatedly fit the same estimator on the exact same data. Ideally, we should be able to catch this and use a cached version of the...

enhancement
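One way the caching idea in the issue above could work is to key each fit on a hash of the estimator's configuration and the training data. The sketch below is only an assumption about a possible approach, not TPOT2's implementation: `MeanRegressor` is a toy stand-in for a real estimator, and `fit_key`, `cached_fit`, and `_fit_cache` are hypothetical names. Hashing a pickle of the inputs is also an assumption that suits small in-memory data.

```python
import hashlib
import pickle


class MeanRegressor:
    """Toy estimator standing in for a real model; counts how often fit runs."""

    fit_calls = 0

    def __init__(self, shift=0.0):
        self.shift = shift

    def fit(self, X, y):
        MeanRegressor.fit_calls += 1
        self.mean_ = sum(y) / len(y) + self.shift
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


def fit_key(estimator, X, y):
    """Hash the estimator type, its parameters, and the training data."""
    payload = pickle.dumps(
        (type(estimator).__name__, sorted(vars(estimator).items()), X, y)
    )
    return hashlib.sha256(payload).hexdigest()


_fit_cache = {}  # hypothetical in-memory cache: key -> fitted estimator


def cached_fit(estimator, X, y):
    """Fit only if this (estimator, data) combination has not been seen before."""
    key = fit_key(estimator, X, y)  # computed before fitting, so unfitted state only
    if key not in _fit_cache:
        _fit_cache[key] = estimator.fit(X, y)
    return _fit_cache[key]


X = [[1], [2], [3]]
y = [1.0, 2.0, 3.0]
a = cached_fit(MeanRegressor(), X, y)
b = cached_fit(MeanRegressor(), X, y)            # cache hit, no second fit
c = cached_fit(MeanRegressor(shift=1.0), X, y)   # different parameters, new fit
```

For scikit-learn pipelines, the `memory` parameter of `Pipeline` already caches fitted transformers, so a real solution might build on that rather than a hand-rolled dictionary.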

Similar to https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7528347/. They implemented covariate adjustment with custom sklearn module wrappers, but this is a bit of a hacky workaround that would not be ideal for TPOT2. Instead, this...

enhancement

It looks at the evaluated_individuals log and generates a plot of time or generations vs. each objective function. This would be helpful for debugging and may provide a visual indication...

enhancement

Graph pipeline plots could be improved. The graphviz package may be worth looking into. Other DAG sklearn pipeline packages have nice graphs, for example Baikal and FEDOT.

enhancement

We could have better thread efficiency for early stopping with thresholds (when not used with selection early stopping). Currently, all threads need to come together at the end of each...

enhancement

This should be rare. The order of inputs is determined alphabetically. For this to be an issue, we would have to have two modules of the same type, each with...

enhancement

See https://scikit-learn.org/stable/modules/generated/sklearn.feature_selection.SelectorMixin.html

bug