[BUG] ModernNCAClassifier predict_proba always returns values ≥ 0.5 in binary classification
Describe the bug
I'm new to this field, so please let me know if I'm doing anything wrong.
predict_proba of ModernNCAClassifier always returns values greater than 0.5, so predict always outputs class 1 for binary classification.
To Reproduce
```python
from mambular.models import ModernNCAClassifier  # import path assumed for Mambular 1.5.0

model = ModernNCAClassifier(
    use_embeddings=True, embedding_type='plr',
    cat_cutoff=0.03,
)
model.fit(X_train, y_train, max_epochs=20)
y_pred = model.predict(X_test)
y_scores = model.predict_proba(X_test)
print(y_scores)
```
results in
```
[[0.543139 ]
 [0.73105836]
 [0.5127567 ]
 [0.5000971 ]
 [0.5000005 ]
 [0.50000066]
 [0.50002706]
 [0.50000036]
 [0.5704578 ]
 [0.50000006]
 [0.5000001 ]
 [0.5000001 ]
 [0.5000008 ]
 [0.50000024]
 ....
```
The predicted probability can get very close to 0.5, but never drops below it.
Desktop (please complete the following information):
- OS: Windows 11
- Python version: 3.12.5
- Mambular version: 1.5.0
Additional context
I'm using this dataset: https://www.kaggle.com/datasets/ankitverma2010/ecommerce-customer-churn-analysis-and-prediction
Hi @Gitnut11, thanks a lot for reporting this and for sharing all the context so clearly; that really helps.
We will look into the issue where ModernNCAClassifier.predict_proba() always seems to return values ≥ 0.5, effectively biasing towards class 1 in binary classification. At first glance, a few things might be causing this behavior:
- Sigmoid over distance scores: if the internal logic maps distances (or similarities) to probabilities with a sigmoid, and the score fed into the sigmoid is never negative (e.g. an unsigned distance or an aggregated similarity), then the output is ≥ 0.5 by construction, regardless of how well the classes are separated (see the sketch after this list). Even with a signed score, poorly separated classes would make the outputs cluster just above 0.5.
- Use of cat_cutoff: cat_cutoff=0.03 controls which features the preprocessor treats as categorical rather than numerical. If it is not tuned for this dataset, some features may be encoded in a way that limits how well the classes separate, which could keep the probabilities stuck near 0.5.
- Lack of margin in the learned embeddings: if training does not spread the classes far enough apart in the embedding space (especially early in training or with few epochs, e.g. max_epochs=20), distances to both classes end up nearly equal, so the probability outputs stay borderline around 0.5.
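To make the first point concrete, here is a minimal NumPy sketch of the mechanism. It is independent of Mambular's actual internals (which we still need to check), and the score distributions are made up purely for illustration:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)

# Hypothetical non-negative scores (e.g. unsigned distances or similarities
# to the positive class). Not Mambular's real internals, just the mechanism.
nonneg_scores = rng.exponential(scale=0.5, size=10)
print(sigmoid(nonneg_scores).min())  # always >= 0.5, like the reported output

# A correct binary head should feed the sigmoid a *signed* score (a logit),
# e.g. the difference between class-1 and class-0 similarities, so that
# both halves of [0, 1] are reachable.
signed_scores = rng.normal(size=10)
print(sigmoid(signed_scores).min())  # can go below 0.5
```

If the reported behaviour comes from something like the first case, the fix would be in how the binary head converts scores to probabilities, not in your usage.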
We will dig into the implementation to verify this and keep you posted on what we find and whether a fix is needed.
Meanwhile, feel free to try increasing max_epochs, adjusting cat_cutoff, or inspecting the returned probabilities directly to see whether the outputs are collapsing; a rough sanity check is sketched below.
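For that last part, a quick check along these lines on your side would already tell us a lot. It only uses the public predict / predict_proba calls from your snippet; y_test is assumed to come from your train/test split:

```python
import numpy as np

proba = np.asarray(model.predict_proba(X_test))

# Shape check: a scikit-learn-style binary predict_proba returns (n_samples, 2);
# a single column would mean only one class's probability is exposed.
print("shape:", proba.shape)

# Distribution check: if everything sits in [0.5, 1.0], the score behind the
# sigmoid is probably never negative (the mechanism sketched above).
p1 = proba[:, -1]
print("min / median / max:", p1.min(), np.median(p1), p1.max())
print("fraction below 0.5:", (p1 < 0.5).mean())

# Compare the hard predictions against the true label balance
# (y_test assumed to come from the same split as X_test).
print("predicted class balance:", np.bincount(np.asarray(model.predict(X_test)).astype(int)))
print("true class balance:     ", np.bincount(np.asarray(y_test).astype(int)))
```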
We appreciate your patience, and thanks again for flagging this!