pandas
ENH: Totality validation for merge operation
- [x] closes #58547
- [x] Tests added and passed if fixing a bug or adding a new feature
- [x] All code checks passed.
- [x] Added type annotations to new arguments/methods/functions.
- [x] Added an entry in the latest `doc/source/whatsnew/v3.0.0.rst` file if fixing a bug or adding a new feature.
Hmm, I'm not too sure about adding this. The `validate` keyword strictly deals with cardinality today, and this seems to blur that scope a little.
Does a library like Great Expectations not already have something to solve your need?
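For context, here is a minimal sketch of the cardinality-only behavior the comment above refers to. The frames and column names are made up for illustration; `validate` and `MergeError` are existing pandas API.

```python
import pandas as pd
from pandas.errors import MergeError

# Illustrative data: key 2 appears twice on the left, once on the right.
left = pd.DataFrame({"key": [1, 2, 2], "a": ["x", "y", "z"]})
right = pd.DataFrame({"key": [1, 2], "b": [10, 20]})

# Passes: the merge keys are unique on the right side.
ok = left.merge(right, on="key", validate="many_to_one")

# Fails: the merge keys are not unique on the left side.
try:
    left.merge(right, on="key", validate="one_to_one")
    raised = False
except MergeError:
    raised = True
```

Note that neither check says anything about keys that have no partner on the other side, which is the gap this PR is about.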
@WillAyd I would argue that adding totality has a special significance. It would enable the validation of the cardinalities defined by the entity relationship model.
This defines a clear scope for what the `validate` option is intended to do, and also adds compatibility with the most common modeling language for merges.
Also, the existing validations are useful for avoiding unwanted data duplication. My validations would help avoid unwanted data loss during a merge. I think the new validations are simply the complementary extension of the existing ones.
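To make the data-loss concern concrete, here is a rough sketch of how a totality check can be approximated today with the existing `indicator=True` option. The helper `unmatched_keys` and the sample frames are hypothetical and are not the implementation proposed in this PR.

```python
import pandas as pd


def unmatched_keys(left, right, on):
    """Hypothetical helper: return the keys found on only one side of a merge."""
    probe = (
        left[[on]]
        .drop_duplicates()
        .merge(right[[on]].drop_duplicates(), on=on, how="outer", indicator=True)
    )
    left_only = probe.loc[probe["_merge"] == "left_only", on].tolist()
    right_only = probe.loc[probe["_merge"] == "right_only", on].tolist()
    return left_only, right_only


# Illustrative data: customer 3 has an order but no customer record,
# and customer 4 has a record but no order.
orders = pd.DataFrame({"customer_id": [1, 2, 3], "amount": [10.0, 20.0, 30.0]})
customers = pd.DataFrame({"customer_id": [1, 2, 4], "name": ["Ann", "Bob", "Dee"]})

# An inner merge drops the unmatched rows without any warning.
inner = orders.merge(customers, on="customer_id")

left_only, right_only = unmatched_keys(orders, customers, "customer_id")
```

A totality validation would turn a non-empty `left_only` or `right_only` into an error at merge time, instead of requiring a separate probe like this one.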
Thanks for the PR, but I am also skeptical of adding this to pandas. In addition, it appears the original issue has not been triaged yet, so a PR is a bit early at this stage. Discussion on the issue is still needed regarding acceptance and implementation, so closing.