Inconsistent test behavior

Open schroedk opened this issue 1 year ago • 1 comments

Running the test test_classwise_scorer_accuracies_manual_derivation locally on a MacBook M2 fails with an AssertionError, whereas it passes in CI on Linux.

schroedk avatar Dec 19 '23 09:12 schroedk

I get the same problem on a MacBook M3. The error is: assert 0.75 == 0.0

janosg avatar Apr 05 '24 13:04 janosg

The test behaves inconsistently because casting np.nan to int gives platform-dependent results:

np.array([np.nan]).astype(int)[0] == -9223372036854775808  (on linux-amd64 and maybe other systems)
np.array([np.nan]).astype(int)[0] == 0  (on osx-arm64)

See also this NumPy issue.
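To make the platform dependence concrete, here is a minimal sketch. The helper names nan_to_int_unsafe and nan_to_int_safe are hypothetical and not part of pyDVL; the choice of 0 as the fill value is an assumption for illustration only. The first helper reproduces the cast from above, the second replaces NaN explicitly before casting, so the result no longer depends on the platform.

```python
import warnings

import numpy as np


def nan_to_int_unsafe(values):
    """Cast floats to int directly; the result for NaN is platform dependent."""
    arr = np.asarray(values, dtype=float)
    with warnings.catch_warnings():
        # Newer NumPy versions may warn about an invalid value in the cast.
        warnings.simplefilter("ignore", RuntimeWarning)
        return arr.astype(int)


def nan_to_int_safe(values, fill=0):
    """Replace NaN with an explicit fill value (an assumption) before casting."""
    arr = np.asarray(values, dtype=float)
    return np.where(np.isnan(arr), fill, arr).astype(int)


# The printed value differs between platforms, so no expected output is given.
print(nan_to_int_unsafe([np.nan])[0])
# This one is deterministic because NaN never reaches the integer cast.
print(nan_to_int_safe([np.nan, 1.0, 0.0]))
```

Whether 0 is the right replacement depends on what the test is meant to check; the point is only that NaN should be handled before the cast.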

Given this, I would say the test is flawed and should be rewritten from scratch or, if possible, simply removed.

schroedk avatar May 06 '24 16:05 schroedk

The np.nan comes from the following code:

class ClosedFormLinearClassifier:
    def __init__(self):
        self._beta = None

    def fit(self, x: NDArray, y: NDArray) -> float:
        v = x[:, 0]
        self._beta = np.dot(v, y) / np.dot(v, v)
        return -1

    def predict(self, x: NDArray) -> NDArray:
        if self._beta is None:
            raise AttributeError("Model not fitted")

        x = x[:, 0]
        probs = self._beta * x
        return np.clip(np.round(probs + 1e-10), 0, 1).astype(int)

    def score(self, x: NDArray, y: NDArray) -> float:
        pred_y = self.predict(x)
        return np.sum(pred_y == y) / 4

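in tests/value/shapley/test_classwise.py. In the fit step, v is zero, so np.dot(v, y) / np.dot(v, v) evaluates to 0 / 0 and self._beta is np.nan. predict then propagates the NaN through np.round and np.clip into the astype(int) cast.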
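The division can be reproduced in isolation. The sketch below is not the fix from the draft PR: fit_beta is a hypothetical standalone function, and falling back to 0.0 when the denominator is zero is an assumption chosen only to show one way of keeping NaN out of the model.

```python
import numpy as np


def fit_beta(x, y):
    """Closed-form slope with a guard for an all-zero feature column.

    Hypothetical helper; the 0.0 fallback is an assumption, not pyDVL behavior.
    """
    v = x[:, 0]
    denom = np.dot(v, v)
    if denom == 0:
        return 0.0
    return np.dot(v, y) / denom


x_zero = np.zeros((4, 1))
y = np.array([0, 1, 0, 1])

# Unguarded version: 0 / 0 yields NaN (NumPy would normally warn here).
with np.errstate(invalid="ignore"):
    v = x_zero[:, 0]
    unguarded = np.dot(v, y) / np.dot(v, v)

print(unguarded)            # NaN from the unguarded division
print(fit_beta(x_zero, y))  # guarded variant returns the fallback instead
```

With a guard like this, predict never sees a NaN coefficient, so the integer cast stays well defined on every platform.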

schroedk avatar May 06 '24 17:05 schroedk

@kosmitive do you have time to help with this? I created a draft PR with a temporary fix.

schroedk avatar May 06 '24 17:05 schroedk