VerticaPy
[Pipeline] Underlying SQL Metrics
Description:
There is currently no way to generate the SQL that underlies a metric table.
Tasks:
- [ ] machine_learning/metrics/classification.py: Create a way to get the underlying SQL of the metrics
- [x] machine_learning/metrics/regression.py: Add a parameter to
`regression_report` to return the SQL of the metric instead of the metric's result.
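The regression task above could look like the following minimal sketch. The parameter name `return_sql`, the metric list, and the query shape are all assumptions for illustration, not the final VerticaPy signature:

```python
# Sketch of a report that can either execute its metric queries or
# hand them back as SQL text (hypothetical `return_sql` parameter).

def regression_report(y_true: str, y_score: str, input_relation: str,
                      return_sql: bool = False):
    # Each metric name maps to the SQL aggregate that computes it.
    metrics_sql = {
        "mse": f"AVG(POWER({y_true} - {y_score}, 2))",
        "mae": f"AVG(ABS({y_true} - {y_score}))",
        "max_error": f"MAX(ABS({y_true} - {y_score}))",
    }
    query = (
        "SELECT "
        + ", ".join(f"{sql} AS {name}" for name, sql in metrics_sql.items())
        + f" FROM {input_relation}"
    )
    if return_sql:
        return query  # SQL generation instead of execution
    raise NotImplementedError("execution path elided in this sketch")
```

For example, `regression_report("y", "pred", "public.tbl", return_sql=True)` would return the full `SELECT` statement as a string instead of running it against Vertica.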
Definition of Done:
- SQL code generation is possible for regression and classification.
Concerns:
An example to show that we really don't use SQL to compute classification metrics anymore:
- how `accuracy_score` used to be computed in `_metrics.py` in the 0.12.0 version of VerticaPy: `AVG(CASE WHEN {0} = {1} THEN 1 ELSE 0 END)`
- how `accuracy_score` is computed now in `classification.py` in 1.0.0:
```python
def accuracy_score(...):
    return _compute_final_score(
        _accuracy_score,
        **locals(),
    )

def _accuracy_score(...):
    return (tp + tn) / (tp + tn + fn + fp)
```
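One option is to revive the 0.12.0-style template quoted above as a pure SQL generator. Wrapping it in a full `SELECT` and the alias `accuracy` are assumptions for illustration, not existing VerticaPy code:

```python
# Sketch: regenerate the old accuracy SQL instead of executing anything.
# The CASE template is the one quoted from _metrics.py (0.12.0).
ACCURACY_SQL = "AVG(CASE WHEN {0} = {1} THEN 1 ELSE 0 END)"

def accuracy_score_sql(y_true: str, y_score: str, input_relation: str) -> str:
    # Fill the column placeholders and wrap the aggregate in a complete query.
    return (
        f"SELECT {ACCURACY_SQL.format(y_true, y_score)} AS accuracy "
        f"FROM {input_relation}"
    )
```

This would give classification metrics a SQL path again without touching the confusion-matrix-based computation below.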
```python
def confusion_matrix(...) -> np.ndarray:
    res = _executeSQL(
        query=f"""
        SELECT
            CONFUSION_MATRIX(obs, response
            USING PARAMETERS num_classes = 2) OVER()
        FROM
            (SELECT
                DECODE({y_true}, '{pos_label}',
                       1, NULL, NULL, 0) AS obs,
                DECODE({y_score}, '{pos_label}',
                       1, NULL, NULL, 0) AS response
            FROM {input_relation}) VERTICAPY_SUBTABLE;""",
        title="Computing Confusion matrix.",
        method="fetchall",
    )
    return np.round(np.array([x[1:-1] for x in res])).astype(int)

def _compute_final_score(...):
    cm = confusion_matrix(y_true, y_score, input_relation, **kwargs)
    return _compute_final_score_from_cm(metric, cm, average=average, multi=multi)
```
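Since the confusion-matrix query is the only SQL left in the 1.0.0 classification path, returning that query before execution is one natural interception point. The helper below is hypothetical and only rebuilds the query string from the snippet above; note the metric itself is then computed in Python from the matrix, so a single query for the whole metric would still need old-style templates:

```python
# Sketch: rebuild the confusion-matrix query without executing it
# (hypothetical helper, not an existing VerticaPy function).

def confusion_matrix_sql(y_true: str, y_score: str,
                         input_relation: str, pos_label: str) -> str:
    return f"""SELECT
    CONFUSION_MATRIX(obs, response
    USING PARAMETERS num_classes = 2) OVER()
FROM
    (SELECT
        DECODE({y_true}, '{pos_label}', 1, NULL, NULL, 0) AS obs,
        DECODE({y_score}, '{pos_label}', 1, NULL, NULL, 0) AS response
     FROM {input_relation}) VERTICAPY_SUBTABLE"""
```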
@zacandcheese did you find any solution for this one?