[BUG] ValueError (length mismatch) in Matching

Open tikhomirovd opened this issue 9 months ago • 0 comments

🐛 Bug Description

A ValueError is raised when the Matching model is executed on a Dataset object in HypEx. The error message reports a length mismatch between the existing axis and the new index values being assigned to it.

Steps To Reproduce

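  1. Create a Dataset object that assigns InfoRole, TreatmentRole and TargetRole to three columns and uses FeatureRole as the default role.
  2. Create a Matching model (see Code Sample for the parameters used).
  3. Call mtchng_mdl.execute(data=hypex_df).
  4. Observe the ValueError.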

Expected Behavior

The Matching model should execute without errors, correctly handling dataset indexing.

Environment

  • HypEx Version: [e.g. 1.0.0]
  • Python Version: [e.g. 3.8]
  • Operating System: [e.g. iOS, Windows, Linux]

Additional Context

Error message (only the final line of the traceback is available):

ValueError: Length mismatch: Expected axis has 712395 elements, new values have 373945 elements
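
For context, this message format is the one pandas produces when an index (or column list) of one length is assigned to an axis of a different length. The sketch below reproduces the same class of error with plain pandas on toy data. It does not call HypEx, and the variable names are illustrative assumptions, not taken from the HypEx source.

```python
import pandas as pd

# Toy frame standing in for a combined result with 5 rows.
combined = pd.DataFrame({"value": range(5)})

# An index with only 3 labels, standing in for labels taken from a subset of rows.
shorter_index = pd.Index(["a", "b", "c"])

message = None
try:
    # Assigning 3 labels to an axis of length 5 is rejected by pandas.
    combined.index = shorter_index
except ValueError as err:
    message = str(err)
    print(message)
```

If the same mechanism is at work here, the 712395 figure would be the row count of the object being re-indexed and 373945 the number of labels supplied, but that reading is an assumption until the full traceback is available.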

Possible Solution

Possible causes and next steps:

  • The index may be mismatched when the treatment and control groups are concatenated.
  • Check that index values align correctly between the merged datasets.
  • Investigate how the dataset is manipulated inside Bias.execute().
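
As a starting point for the checks listed above, the input table can be inspected before it is wrapped in a Dataset. The following is a minimal sketch using only pandas; the helper name index_diagnostics is hypothetical, and the toy frame only mimics the treatment_flag column from the report.

```python
import pandas as pd


def index_diagnostics(df: pd.DataFrame, treatment_col: str) -> dict:
    """Collect basic facts about the index and group sizes of df."""
    return {
        "n_rows": len(df),
        "index_is_unique": bool(df.index.is_unique),
        "n_duplicate_index_labels": int(df.index.duplicated().sum()),
        "rows_with_missing_values": int(df.isna().any(axis=1).sum()),
        "group_sizes": df[treatment_col].value_counts(dropna=False).to_dict(),
    }


# Toy data with one repeated index label and one missing feature value.
toy = pd.DataFrame(
    {"treatment_flag": [0, 0, 1, 1, 1], "x": [1.0, 2.0, None, 4.0, 5.0]},
    index=[0, 1, 1, 2, 3],
)
report = index_diagnostics(toy, "treatment_flag")
print(report)
```

If the real input shows duplicate index labels or rows with missing values, calling reset_index(drop=True) on the DataFrame (and handling the missing values) before building the Dataset is one thing to try. Whether that avoids this particular error has not been confirmed.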

Code Sample

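import hypex
from hypex.dataset import Dataset, InfoRole, TreatmentRole, TargetRole, FeatureRole

# hypex_mtchng_df is the input table; its construction is not shown in this report.
hypex_df = Dataset(
    roles={
        "epk_id":         InfoRole(str),       # identifier column
        "treatment_flag": TreatmentRole(int),  # treatment/control indicator
        "fake_target":    TargetRole(int),     # outcome column
    },
    data=hypex_mtchng_df,
    default_role=FeatureRole(),  # every other column is treated as a feature
)

mtchng_mdl = hypex.Matching(
    group_match=False,
    distance="mahalanobis",
    bias_estimation=True,
    quality_tests="auto",
)
# This call raises the ValueError quoted under Additional Context.
results = mtchng_mdl.execute(data=hypex_df)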

Checklist

  • [ ] I have described the bug in detail
  • [ ] I have provided steps to reproduce
  • [ ] I have provided the expected behavior
  • [ ] I have provided screenshots (if applicable)
  • [ ] I have provided my environment details
  • [ ] I have suggested a possible solution (if applicable)

tikhomirovd, Mar 06 '25 08:03