
Improve `checkSparkAnswer` methods to allow for validation of expected results

Open · andygrove opened this issue 1 month ago · 3 comments

What is the problem the feature request solves?

When writing tests using `checkSparkAnswerAndOperator`, the testing API doesn't provide a convenient way to verify that the query actually produces the ~~correct~~ expected results, or any results at all.

I would like to improve the API so that our tests can easily verify results above and beyond just confirming that Comet and Spark produced the same results.

Describe the potential solution

No response

Additional context

No response

andygrove avatar Nov 05 '25 15:11 andygrove

I'm wondering: can we reuse the same mechanism we use for fuzz testing correctness?

For my local testing, I typically reuse the fuzz correctness checks.

comphead avatar Nov 05 '25 19:11 comphead

> I'm wondering: can we reuse the same mechanism we use for fuzz testing correctness?
>
> For my local testing, I typically reuse the fuzz correctness checks.

I didn't explain well enough in the description. What I mean is that we have no way to see whether the query is returning the expected results. If I make a mistake in my query, I may not be testing what I think I am testing, and without seeing the result set I could miss this.

edit: For example, if I write:

checkSparkAnswerAndOperator("SELECT upper(foo) FROM bar WHERE foo LIKE 'abc%'")

If no rows match the WHERE clause, then the query will return zero rows with both Spark and Comet and the test will pass, but it isn't really testing anything.
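
As a rough sketch of one possible fix (the `checkSparkAnswerWithRowCount` name and its body are hypothetical, building only on the existing `checkSparkAnswer(df: => DataFrame)` overload), the helper could assert an expected row count alongside the Spark-vs-Comet comparison:

  import org.apache.spark.sql.DataFrame
  import org.apache.spark.sql.execution.SparkPlan

  // Hypothetical helper for the test base trait: run the existing
  // Spark-vs-Comet comparison, then assert that the query produced the
  // expected number of rows so an accidentally-empty result set fails
  // the test instead of passing vacuously.
  protected def checkSparkAnswerWithRowCount(
      df: => DataFrame,
      expectedRowCount: Long): (SparkPlan, SparkPlan) = {
    val plans = checkSparkAnswer(df)
    val actualRowCount = df.count()
    assert(
      actualRowCount == expectedRowCount,
      s"expected $expectedRowCount rows but the query returned $actualRowCount")
    plans
  }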

andygrove avatar Nov 05 '25 21:11 andygrove

Notes to future self for when I have time to work on this.

We currently have many variations of `checkSparkAnswer`:

  protected def checkSparkAnswer(query: String): (SparkPlan, SparkPlan) = {
  protected def checkSparkAnswer(df: => DataFrame): (SparkPlan, SparkPlan) = {
  protected def checkSparkAnswerWithTolerance(
  protected def checkSparkAnswerWithTolerance(
  protected def checkSparkAnswerAndOperator(
  protected def checkSparkAnswerAndOperator(
  protected def checkSparkAnswerAndOperator(
  protected def checkSparkAnswerAndOperatorWithTol(
  protected def checkSparkAnswerAndOperatorWithTol(
  protected def checkSparkAnswerAndFallbackReason(
  protected def checkSparkAnswerAndFallbackReason(
  protected def checkSparkAnswerAndFallbackReasons(
  protected def checkSparkAnswerMaybeThrows(

I would like to explore an approach that provides a highly customizable version of the method.

Some possible approaches:

Builder pattern for a spec to pass to the test

checkSparkAnswerAndOperator(CheckSparkAnswer.builder()
  .withTolerance(0.000001)
  .allowOperators(classOf[ProjectExec], classOf[HashAggregateExec])
  .build())
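
As a minimal sketch of what that spec could look like (`CheckSparkAnswerSpec` and the builder internals are assumptions matching the example above, not existing Comet API):

  // Hypothetical immutable spec produced by the builder; field names are
  // assumptions based on the example above.
  case class CheckSparkAnswerSpec(
      tolerance: Option[Double] = None,
      allowedOperators: Seq[Class[_]] = Seq.empty)

  object CheckSparkAnswer {
    def builder(): Builder = new Builder()

    class Builder {
      private var spec = CheckSparkAnswerSpec()

      def withTolerance(tolerance: Double): Builder = {
        spec = spec.copy(tolerance = Some(tolerance))
        this
      }

      def allowOperators(operators: Class[_]*): Builder = {
        spec = spec.copy(allowedOperators = operators)
        this
      }

      def build(): CheckSparkAnswerSpec = spec
    }
  }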

Use a closure for custom checks

checkSparkAnswer(df, test => test.assertRowCount(42).withTolerance(0.000001))

checkSparkAnswer(df, test => test.inspectResults((sparkDF, cometDF) => { /* custom logic here */ }))
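
A sketch of the handle such a closure could receive (the `AnswerCheck` name and method bodies are assumptions drawn from the examples above):

  import org.apache.spark.sql.DataFrame

  // Hypothetical check handle passed to the test closure; each method
  // returns `this` so checks can be chained.
  class AnswerCheck(sparkDF: DataFrame, cometDF: DataFrame) {

    // Fail if the result has an unexpected row count, catching queries
    // that pass vacuously by returning zero rows.
    def assertRowCount(expected: Long): AnswerCheck = {
      val actual = cometDF.count()
      assert(actual == expected, s"expected $expected rows, got $actual")
      this
    }

    // Hand both result sets to arbitrary caller-supplied assertions.
    def inspectResults(f: (DataFrame, DataFrame) => Unit): AnswerCheck = {
      f(sparkDF, cometDF)
      this
    }
  }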

andygrove avatar Nov 07 '25 15:11 andygrove