spark-fast-tests

assertApproximateDataFrameEquality with ignoreNullable and orderedComparison flags

Open ggalves opened this issue 5 years ago • 11 comments

The problem: I am comparing two DataFrames with a DoubleType field, and this comparison needs to be approximate. But I'm running into schema mismatches because of nullable fields.

These flags would be very useful for my case. Is this possible?

ggalves avatar Apr 15 '19 18:04 ggalves

@ggalves - Sure, I added the ignoreNullable feature in this commit: https://github.com/MrPowers/spark-fast-tests/commit/3ea8d89d15bcee8452249f9e059b824caf658948

I will try to get on the orderedComparison feature soon. Little too sleepy to do it right now.

MrPowers avatar Apr 16 '19 04:04 MrPowers

@ggalves - Have you looked at the assertDoubleTypeColumnEquality method? That might help you solve your problem too as that method doesn't look at the nullable property or the column name. Only downside is that it can only compare two columns, not two entire DataFrames.
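
For readers following along, here is a minimal sketch of the column-level comparison mentioned above. The `ColumnComparer` trait name and the argument order `(df, colName1, colName2, precision)` are assumptions recalled from the project README and should be checked against the version in use. The object name, column names, and data are made up for illustration.

```scala
import com.github.mrpowers.spark.fast.tests.ColumnComparer
import org.apache.spark.sql.SparkSession

// Hypothetical wrapper object; in a real project this would be a test suite.
object DoubleColumnSketch extends ColumnComparer {

  def run(): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("double-column-sketch")
      .getOrCreate()
    import spark.implicits._

    // Two DoubleType columns in the same DataFrame, differing by less than 0.01 per row.
    val df = Seq((1.001, 1.0), (2.0, 2.004)).toDF("actual", "expected")

    // Assumed signature: (df, colName1, colName2, precision).
    // Throws if any pair of values differs by more than the precision.
    assertDoubleTypeColumnEquality(df, "actual", "expected", 0.01)

    spark.stop()
  }
}
```

Because both columns live in one DataFrame, nullability and column names do not enter the comparison, which matches the limitation described above: two columns at a time, not two whole DataFrames.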

MrPowers avatar Apr 16 '19 04:04 MrPowers

Thank you for the feature!

About assertDoubleTypeColumnEquality: I haven't looked at it yet, but it may be interesting for my case. I'm taking a look at it today. Thanks again!

ggalves avatar Apr 16 '19 14:04 ggalves

@ggalves - I updated the assertApproximateDataFrameEquality method to optionally take the ignoreNullable and orderedComparison flags. The new code has been released in v0.19.1. Let me know how this code works for you!
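
A minimal sketch of how a call with both flags might look. It assumes the `DataFrameComparer` trait, a precision passed as the third positional argument, and named arguments matching the flag names in this thread. The object name, helper, and data are hypothetical.

```scala
import com.github.mrpowers.spark.fast.tests.DataFrameComparer
import org.apache.spark.sql.{DataFrame, Row, SparkSession}
import org.apache.spark.sql.types.{DoubleType, StringType, StructField, StructType}

// Hypothetical wrapper object; in a real project this would be a test suite.
object ApproxEqualitySketch extends DataFrameComparer {

  def run(): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("approx-equality-sketch")
      .getOrCreate()

    // Hypothetical helper: same columns, but the nullability of "score" is configurable.
    def makeDF(rows: Seq[Row], scoreNullable: Boolean): DataFrame = {
      val schema = StructType(Seq(
        StructField("name", StringType, nullable = true),
        StructField("score", DoubleType, nullable = scoreNullable)
      ))
      spark.createDataFrame(spark.sparkContext.parallelize(rows), schema)
    }

    // Same rows in a different order, scores off by 0.001, and differing nullability.
    val actualDF = makeDF(Seq(Row("a", 1.001), Row("b", 2.0)), scoreNullable = false)
    val expectedDF = makeDF(Seq(Row("b", 2.0), Row("a", 1.0)), scoreNullable = true)

    // Assumed call shape: precision third, then the two flags as named arguments.
    assertApproximateDataFrameEquality(
      actualDF,
      expectedDF,
      0.01,
      ignoreNullable = true,
      orderedComparison = false
    )

    spark.stop()
  }
}
```

With `ignoreNullable = true` the differing nullable property on `score` should not fail the schema check, and with `orderedComparison = false` the row order should not matter.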

MrPowers avatar Apr 17 '19 02:04 MrPowers

@MrPowers thank you! But have you released v0.19.1? I couldn't find the release/tag

ggalves avatar Apr 17 '19 16:04 ggalves

@ggalves - I'm grabbing it from Maven (example here), but you're right, I forgot to do a release.

Here's the release link: https://github.com/MrPowers/spark-fast-tests/releases/tag/v0.19.1

Let me know if this works for you!

MrPowers avatar Apr 18 '19 02:04 MrPowers

@ggalves - Were you able to find the release OK? Can I close this issue?

MrPowers avatar Apr 25 '19 21:04 MrPowers

I'm sorry, I couldn't test it yet. I'm testing it tomorrow.

ggalves avatar Apr 25 '19 22:04 ggalves

@MrPowers I tested it just now. ignoreNullable is working, but I couldn't find the orderedComparison flag in assertApproximateDataFrameEquality.

ggalves avatar Apr 26 '19 17:04 ggalves

@ggalves - Think I finally fixed this: https://github.com/MrPowers/spark-fast-tests/commit/28351e772a48e7968b6e90e674d1fb2b2853844f

Sorry I keep messing this one up. See version 0.19.2.

Let me know if there is anything else you need.

MrPowers avatar May 08 '19 01:05 MrPowers

@MrPowers This commit is what I need, thank you! Can you release 0.19.2, please?

ggalves avatar May 08 '19 15:05 ggalves