
Cross Validation or Optimization for EquivalentSources

Open mdtanker opened this issue 1 year ago • 2 comments

Description of the desired feature:

Is there a reason there is no cross-validation equivalent for harmonica.EquivalentSources like there is for verde.SplineCV? I find myself quite often manually running cross-validations for the equivalent source parameters as shown in the Estimating damping and depth parameters user guide:

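import itertools

import harmonica as hm
import numpy as np
import verde as vd

dampings = [0.01, 0.1, 1, 10]
depths = [5e3, 10e3, 20e3, 50e3]

# Every (damping, depth) combination to test
parameter_sets = [
    dict(damping=damping, depth=depth)
    for damping, depth in itertools.product(dampings, depths)
]

equivalent_sources = hm.EquivalentSources()

# Mean cross-validation score for each parameter set
# (coordinates and data are loaded earlier in the user guide)
scores = []
for params in parameter_sets:
    equivalent_sources.set_params(**params)
    score = np.mean(
        vd.cross_val_score(
            equivalent_sources,
            coordinates,
            data.gravity_disturbance_mgal,
        )
    )
    scores.append(score)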

It would be great to be able to use something like this instead:

eqs = hm.EquivalentSourcesCV(
    dampings=[0.01, 0.1, 1, 10],
    depths=[5e3, 10e3, 20e3, 50e3],
)

eqs.fit(coordinates, data.gravity_disturbance_mgal)
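For illustration, here is a minimal sketch of what such a class might do internally. Nothing here is existing Harmonica API: `GridSearchSketch`, `ToyEstimator`, and `toy_score` are hypothetical names, and the scoring function is injected so the example runs with the standard library alone. With the real libraries, `score_func` would presumably wrap `np.mean(vd.cross_val_score(...))` as in the loop above.

```python
import itertools
import math


class GridSearchSketch:
    """Hypothetical grid search over damping and depth (not Harmonica API)."""

    def __init__(self, estimator, score_func, dampings, depths):
        self.estimator = estimator
        # score_func(estimator, coordinates, data) -> float, higher is better
        self.score_func = score_func
        self.dampings = dampings
        self.depths = depths

    def fit(self, coordinates, data):
        self.parameter_sets_ = []
        self.scores_ = []
        # Score every (damping, depth) combination
        for damping, depth in itertools.product(self.dampings, self.depths):
            params = dict(damping=damping, depth=depth)
            self.estimator.set_params(**params)
            self.parameter_sets_.append(params)
            self.scores_.append(
                self.score_func(self.estimator, coordinates, data)
            )
        # Keep the best-scoring combination and refit with it on all the data
        best = max(range(len(self.scores_)), key=self.scores_.__getitem__)
        self.best_params_ = self.parameter_sets_[best]
        self.best_score_ = self.scores_[best]
        self.estimator.set_params(**self.best_params_)
        self.estimator.fit(coordinates, data)
        return self


class ToyEstimator:
    """Stand-in exposing only the set_params/fit interface used above."""

    def set_params(self, **params):
        for name, value in params.items():
            setattr(self, name, value)
        return self

    def fit(self, coordinates, data):
        self.fitted_ = True
        return self


def toy_score(estimator, coordinates, data):
    # Made-up score that peaks at damping=1 and depth=10e3
    return (
        -abs(math.log10(estimator.damping))
        - abs(estimator.depth - 10e3) / 10e3
    )


search = GridSearchSketch(
    ToyEstimator(),
    toy_score,
    dampings=[0.01, 0.1, 1, 10],
    depths=[5e3, 10e3, 20e3, 50e3],
)
search.fit(coordinates=None, data=None)
print(search.best_params_)
```

Refitting with the best parameters at the end is a design assumption here; it would let the object be used for predictions right after `fit`.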

Also, related to this verde issue: I have an equivalent function that runs an optimization to find the best equivalent source parameters, instead of the grid search that the cross-validation above would perform.

Are you willing to help implement and maintain this feature?

Yes, if there is enough interest to justify doing this.

mdtanker, Nov 26 '24 18:11

Hi @mdtanker! I don't think there's any special reason, other than that no one has written such a class.

Whenever I needed to cross-validate equivalent sources, I did the same thing as you. Maybe it would be nice to have a class for it instead.

I don't have an immediate use for it, but if you want to code it, feel free to do so!

santisoler, Dec 06 '24 19:12

OK, good to know. I'll see if I can find some time to make a PR for hm.EquivalentSourcesCV.

Any thoughts on the optimization aspect of this? It's the same idea as https://github.com/fatiando/verde/issues/474: instead of a grid-search CV, we use a hyperparameter optimization where the user defines the number of trials and the min/max values of damping and depth.
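To make the interface concrete, here is a rough sketch with a plain random search standing in for a real optimizer. The function name `random_search`, its arguments, and `toy_score` are all hypothetical, and sampling damping in log space is an assumption made because the damping values above span several orders of magnitude. In practice `score_func` would fit the equivalent sources with the given parameters and return the mean cross-validation score.

```python
import math
import random


def random_search(score_func, damping_limits, depth_limits, n_trials, seed=0):
    """Hypothetical stand-in for an optimizer: try n_trials random points."""
    rng = random.Random(seed)
    log_min, log_max = (math.log10(value) for value in damping_limits)
    best_params, best_score = None, -math.inf
    for _ in range(n_trials):
        params = {
            # Damping is sampled uniformly in log10 space (an assumption)
            "damping": 10 ** rng.uniform(log_min, log_max),
            # Depth is sampled uniformly between its limits
            "depth": rng.uniform(*depth_limits),
        }
        score = score_func(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(damping, depth):
    # Made-up score that peaks at damping=1 and depth=10e3
    return -abs(math.log10(damping)) - abs(depth - 10e3) / 10e3


best_params, best_score = random_search(
    toy_score,
    damping_limits=(0.01, 10),
    depth_limits=(5e3, 50e3),
    n_trials=50,
)
print(best_params, best_score)
```

Unlike the grid search, the number of evaluations is set by `n_trials` rather than by the size of the grid, and the candidate values are not restricted to a predefined list.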

I guess we should first implement the simple CV and then consider adding the optimization.

mdtanker, Dec 08 '24 21:12