ENH: Totality validation for merge operation

Open z3rone opened this issue 9 months ago • 2 comments

  • [x] closes #58547
  • [x] Tests added and passed if fixing a bug or adding a new feature
  • [x] All code checks passed.
  • [x] Added type annotations to new arguments/methods/functions.
  • [x] Added an entry in the latest doc/source/whatsnew/v3.0.0.rst file if fixing a bug or adding a new feature.

z3rone avatar May 06 '24 20:05 z3rone

Hmm I'm not too sure about adding this - the validation keyword strictly deals with cardinality today. This seems to blur that scope a little.
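
For reference, a minimal sketch of what `validate` checks today. The frames and column names are made up for illustration; only the existing `validate` values and `pd.errors.MergeError` are used.

```python
import pandas as pd

# Hypothetical example frames: key 2 appears twice on the left.
left = pd.DataFrame({"key": [1, 2, 2], "a": ["x", "y", "z"]})
right = pd.DataFrame({"key": [1, 2], "b": [10, 20]})

# many_to_one only requires the keys to be unique in the right frame, so this passes.
merged = left.merge(right, on="key", validate="many_to_one")

# one_to_one requires unique keys on both sides, so the duplicated key 2 raises.
try:
    left.merge(right, on="key", validate="one_to_one")
    raised = False
except pd.errors.MergeError:
    raised = True
```

Both checks are about key uniqueness (cardinality); neither says anything about whether every key finds a match on the other side.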

Does a library like Great Expectations not already have something to solve your need?

WillAyd avatar May 16 '24 21:05 WillAyd

@WillAyd I would argue that adding totality has a special significance. It would enable the validation of the cardinalities defined by the entity relationship model.

This gives the validate option a clearly defined scope and also adds compatibility with the most common modeling language for merges.

z3rone avatar May 17 '24 06:05 z3rone

Also, the existing validations are useful for avoiding unwanted data duplication. My validations would help avoid unwanted data loss during a merge. I think the new validations are simply the complementary extension of the existing ones.
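
To illustrate the data-loss case, here is a rough sketch that uses only the existing `indicator=True` option rather than anything from this PR. The frames, column names, and the `unmatched_keys` helper are hypothetical and chosen for the example.

```python
import pandas as pd

# Hypothetical example frames: order 4 has no customer, customer 3 has no order.
orders = pd.DataFrame({"customer_id": [1, 2, 4], "amount": [10, 20, 30]})
customers = pd.DataFrame({"customer_id": [1, 2, 3], "name": ["a", "b", "c"]})


def unmatched_keys(left, right, on):
    """Return the keys that appear on only one side of the merge."""
    # An outer merge keeps every row; the _merge column records where each came from.
    merged = left.merge(right, on=on, how="outer", indicator=True)
    left_only = merged.loc[merged["_merge"] == "left_only", on].tolist()
    right_only = merged.loc[merged["_merge"] == "right_only", on].tolist()
    return left_only, right_only


# A default inner merge silently drops the unmatched rows on both sides.
inner = orders.merge(customers, on="customer_id")

left_only, right_only = unmatched_keys(orders, customers, "customer_id")
```

Here the inner merge returns two rows without any warning, while the helper reports key 4 as missing on the right and key 3 as missing on the left. A totality check would turn that silent loss into an error.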

z3rone avatar May 27 '24 07:05 z3rone

Thanks for the PR, but I am also skeptical of adding this to pandas. In addition, it appears the original issue has not been triaged yet, so a PR is a bit early at this stage. Discussion on the issue is still needed regarding acceptance and implementation, so closing.

mroeschke avatar Jun 17 '24 16:06 mroeschke