
Pareto front

Open ianhbell opened this issue 7 years ago • 4 comments

Has there been any thought given to Pareto front optimization? There's always a tradeoff between tree size and model fidelity, which I gather you handle with parsimony. The alternative is to keep every model that is non-dominated, i.e. every model on the Pareto front. I couldn't see any clear way of hacking that into gplearn.
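For readers unfamiliar with the term, here is a minimal sketch of what "non-dominated" means for this tradeoff. It assumes each model is summarised as a hypothetical `(size, error)` pair where lower is better on both axes; none of these names come from gplearn.

```python
def dominates(a, b):
    """True if model a is no worse than b on both objectives and strictly better on one.

    Each model is an assumed (size, error) pair; lower is better for both.
    """
    return a[0] <= b[0] and a[1] <= b[1] and (a[0] < b[0] or a[1] < b[1])


def non_dominated(models):
    """Keep only the models that no other model dominates (the Pareto front)."""
    return [m for m in models if not any(dominates(other, m) for other in models)]


# Made-up (size, error) pairs for illustration only.
models = [(3, 0.40), (5, 0.25), (5, 0.30), (9, 0.10), (12, 0.10), (15, 0.05)]
front = non_dominated(models)
print(front)
```

Here `(5, 0.30)` and `(12, 0.10)` drop out because a model of equal or smaller size reaches an equal or lower error.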

ianhbell avatar May 02 '17 22:05 ianhbell

Sounds interesting @ianhbell ... Got a citation in mind?

trevorstephens avatar May 06 '17 09:05 trevorstephens

This should be a good point to start reading: https://www.iitk.ac.in/kangal/Deb_NSGA-II.pdf

Ohjeah avatar May 06 '17 23:05 Ohjeah

Hi, @ianhbell

Just out of curiosity: suppose I define a complexity measure (the number of nodes in the tree representation of an expression) and use it inside my custom fitness, a bit like so:

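from sklearn.metrics import r2_score

# make_prediction and complexity are placeholder helpers, not defined here
def my_custom_fitness(expr, X, y_true):
    y_pred = make_prediction(expr, X)
    # r2_score expects (y_true, y_pred); the order matters
    return r2_score(y_true, y_pred) - (complexity(expr) / 1000)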

Therefore:

  • for two expressions yielding the same r2_score, my_custom_fitness would favour the simpler one
  • for two expressions having the same complexity (i.e. the first one is as simple as the second one), my_custom_fitness would favour the one that yields the better r2_score

Given these properties, the expression found at the end of fit would be on the Pareto front (at least the front drawn from all evaluated expressions).

Am I missing something?
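A small sketch of how such a weighted fitness relates to the front, using made-up `(complexity, r2)` values rather than anything produced by gplearn. It suggests that the winner of a linear penalty is indeed non-dominated, but that a single weight picks a single point, and that a front member lying in a concave stretch of the front is never picked for any weight.

```python
# Hypothetical candidates: name -> (complexity, r2). Lower complexity and higher r2 are better.
candidates = {"A": (1, 0.50), "B": (5, 0.60), "C": (10, 0.90), "D": (20, 0.95)}


def scalarized_best(cands, weight):
    """Name of the candidate maximising r2 - weight * complexity."""
    return max(cands, key=lambda name: cands[name][1] - weight * cands[name][0])


def pareto_front(cands):
    """Names of the candidates that no other candidate dominates."""
    front = []
    for name, (c, r) in cands.items():
        dominated = any(
            c2 <= c and r2 >= r and (c2 < c or r2 > r)
            for other, (c2, r2) in cands.items()
            if other != name
        )
        if not dominated:
            front.append(name)
    return sorted(front)


# Sweep the penalty weight from 0 to 0.2 and collect every winner.
weights = [i / 10000 for i in range(2001)]
winners = {scalarized_best(candidates, w) for w in weights}

print(pareto_front(candidates))  # all four candidates are non-dominated
print(sorted(winners))           # B never wins, whatever the weight
```

With the weight of 1/1000 used above, only `D` is returned; `B` is on the front but sits below the line joining `A` and `C`, so no linear weighting selects it.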

remiadon avatar Jan 21 '22 18:01 remiadon

Answering myself with a reference:

PARETO-FRONT EXPLOITATION IN SYMBOLIC REGRESSION

From page 294:

There is, however, a significant difference between using a Pareto front as a post-run analysis tool vs. actively optimizing the Pareto front during a GP-run. In the latter case the Pareto front becomes the objective that is being optimized instead of the fitness (accuracy) of the “best” model.

So yes, I was missing something big
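To make the distinction concrete, one way to optimize the front actively is to rank the population by successive non-dominated fronts and select on that rank instead of on a scalar fitness, in the spirit of the NSGA-II paper linked earlier in the thread. The sketch below only covers the ranking step, with assumed `(size, error)` pairs where lower is better; it omits crowding distance and is not something gplearn provides.

```python
def dominates(a, b):
    """True if a is no worse than b on both objectives and strictly better on one."""
    return a[0] <= b[0] and a[1] <= b[1] and (a[0] < b[0] or a[1] < b[1])


def pareto_ranks(models):
    """Rank 0 is the Pareto front; rank 1 is the front once rank 0 is removed; and so on."""
    ranks = [None] * len(models)
    remaining = list(range(len(models)))
    rank = 0
    while remaining:
        # Indices not dominated by anything still in the pool form the current front.
        front = [
            i for i in remaining
            if not any(dominates(models[j], models[i]) for j in remaining)
        ]
        for i in front:
            ranks[i] = rank
        remaining = [i for i in remaining if i not in front]
        rank += 1
    return ranks


# Made-up (size, error) pairs for illustration only.
models = [(3, 0.40), (5, 0.25), (5, 0.30), (9, 0.10), (12, 0.10), (15, 0.05), (12, 0.30)]
ranks = pareto_ranks(models)
print(ranks)
```

A selection step that prefers lower ranks keeps pressure on the whole front rather than on one "best" model.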

remiadon avatar Jan 23 '22 11:01 remiadon