
Support `kwargs` in MLEfficacy metrics to customize the model

Open npatki opened this issue 2 years ago • 0 comments

Filed from a conversation on the public SDV Slack. This is a lower-priority feature request.

Problem Description

The single-table ML Efficacy metrics rely on different ML algorithms from sklearn. For example, MulticlassMLPClassifier uses sklearn's MLPClassifier under the hood.

The sklearn implementation exposes many tunable parameters, such as max_iter or activation. However, SDMetrics currently hardcodes these parameters and provides no API to change them.

Expected behavior

I would expect to be able to change the parameters of the underlying sklearn model. The exact API may vary, but one possibility is to accept a dictionary:

from sdmetrics.single_table import MulticlassMLPClassifier

MulticlassMLPClassifier.compute(
    test_data=real_data,
    train_data=synthetic_data,
    target='categorical_column_name',
    metadata=metadata,
    model_parameters={'max_iter': 500, 'activation': 'identity'}
)
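
One way such a parameter could behave is to merge the per-call dictionary over the class-level defaults before building the model. The sketch below is only an illustration of that idea, not the SDMetrics implementation: `FakeModel`, `SketchMetric`, `build_model`, and the default value of 50 are all hypothetical stand-ins chosen so the example runs without sklearn.

```python
class FakeModel:
    """Hypothetical stand-in for an sklearn estimator; it only records its kwargs."""

    def __init__(self, **kwargs):
        self.params = kwargs


class SketchMetric:
    """Hypothetical metric class, not the real SDMetrics code."""

    MODEL = FakeModel
    MODEL_KWARGS = {'max_iter': 50}  # illustrative default, assumed for this sketch

    @classmethod
    def build_model(cls, model_parameters=None):
        # Copy the class defaults so per-call overrides never mutate shared state.
        kwargs = dict(cls.MODEL_KWARGS)
        kwargs.update(model_parameters or {})
        return cls.MODEL(**kwargs)


model = SketchMetric.build_model({'max_iter': 500, 'activation': 'identity'})
print(model.params)  # {'max_iter': 500, 'activation': 'identity'}
```

Merging rather than replacing keeps any defaults the caller did not mention, and copying the dictionary means one call's overrides do not leak into the next.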

Workaround

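In the meantime, a workaround is to override the class-level MODEL_KWARGS attribute, which holds the hardcoded parameters. Because this changes the class itself, the new values apply to every later call in the same process.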

from sdmetrics.single_table import MulticlassMLPClassifier

# override the hardcoded parameters on the class itself
MulticlassMLPClassifier.MODEL_KWARGS = { 'max_iter': 500, 'activation': 'identity' }

# use the class to compute a score
MulticlassMLPClassifier.compute(
    test_data=real_data,
    train_data=synthetic_data,
    target='categorical_column_name',
    metadata=metadata
)
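
Since the workaround mutates class-level state, it may be worth restoring the original value afterwards. The following is a minimal sketch of that, assuming the metric reads `MODEL_KWARGS` at compute time as the workaround implies. `temporary_model_kwargs` is a hypothetical helper, not part of SDMetrics, and `DummyMetric` is a stand-in class so the example runs on its own. Unlike the plain assignment above, the helper merges the overrides into the existing dictionary instead of replacing it.

```python
from contextlib import contextmanager


@contextmanager
def temporary_model_kwargs(metric_cls, **overrides):
    """Temporarily merge overrides into metric_cls.MODEL_KWARGS, restoring it on exit."""
    original = metric_cls.MODEL_KWARGS
    metric_cls.MODEL_KWARGS = {**(original or {}), **overrides}
    try:
        yield metric_cls
    finally:
        # Restore the previous value even if the body raised an exception.
        metric_cls.MODEL_KWARGS = original


class DummyMetric:
    """Hypothetical stand-in for a metric class with hardcoded model parameters."""

    MODEL_KWARGS = {'max_iter': 50}  # illustrative default, assumed for this sketch


with temporary_model_kwargs(DummyMetric, max_iter=500, activation='identity'):
    inside = dict(DummyMetric.MODEL_KWARGS)

after = dict(DummyMetric.MODEL_KWARGS)
```

Under the same assumption, the `with` block could wrap a call to `MulticlassMLPClassifier.compute(...)` so that the custom parameters apply only to that call.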

npatki · Nov 17 '22 23:11