feature: optional pandas and polars support

Open TyberiusPrime opened this issue 1 year ago • 2 comments

Fixes #394.

I recently ran into an issue where my pypipegraph2 failed to recalculate nodes downstream of a changed output because deepdiff assigned the same hash to different DataFrames.

Turns out, it was essentially only hashing the column names.

This PR fixes that for pandas, and while I had it open, for polars as well.

The code paths are optional on a successful pandas/polars import.
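To illustrate the idea, here is a minimal sketch and not the code from this PR. It assumes a hypothetical helper `content_hash` that guards the pandas import and, when pandas is available, hashes a DataFrame's index, column names, dtypes, and values instead of the column names alone. The SHA-256 over `repr` output is only an illustrative choice.

```python
import hashlib

# Optional dependency: the DataFrame branch is only taken if pandas imports.
try:
    import pandas as pd
except ImportError:
    pd = None


def content_hash(obj):
    """Hypothetical helper: hash an object by content, not just by column names."""
    h = hashlib.sha256()
    if pd is not None and isinstance(obj, pd.DataFrame):
        # "split" orientation yields index, column names, and row data.
        h.update(repr(obj.to_dict(orient="split")).encode())
        # Include dtypes so that e.g. int and float columns hash differently.
        h.update(repr([str(t) for t in obj.dtypes]).encode())
    else:
        # Fallback for everything else; a real hasher would recurse by type.
        h.update(repr(obj).encode())
    return h.hexdigest()
```

With this sketch, two DataFrames that share column names but differ in values get different hashes, which is the behaviour the downstream invalidation relies on.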

The added tests of course require pandas and polars. I ran them for both, with the older versions I listed in requirements-dev.txt and with the current versions.

I see 3 failing and 3 erroring test cases locally, but they also failed before I touched the code, so I'll blame them on my local venv.

TyberiusPrime avatar Jun 28 '24 16:06 TyberiusPrime

Hi @TyberiusPrime, thanks for the PR! Can you please make your PR against the dev branch, not the master branch? There are some conflicts with your PR against the dev branch. Please ping me once you have updated the PR!

seperman avatar Jun 28 '24 17:06 seperman

My apologies, I had rebased against dev before creating the PR (but after starting the creation...) and GitHub somehow didn't pick that up.

Give me a minute to learn how to fix this.

edit: Turns out it's as easy as hitting 'edit' at the top and selecting a new target branch. Now the diff looks much more reasonable as well.

TyberiusPrime avatar Jun 28 '24 17:06 TyberiusPrime

LGTM! Thanks @TyberiusPrime. There is a minor bug in the requirements-dev.txt of your PR. I will fix it.

seperman avatar Jul 01 '24 19:07 seperman

@TyberiusPrime DeepDiff 8.0.0 is published and it includes your contribution. Thank you!

seperman avatar Aug 27 '24 22:08 seperman