datafusion-comet

Create integration tests that can run against a Spark cluster

andygrove opened this issue 9 months ago · 0 comments

What is the problem the feature request solves?

We recommend using off-heap memory when running Comet, but as far as I know all of our unit tests and the Spark SQL tests run with on-heap memory. These tests also run in local mode, which is not how users will deploy the product.

I would like us to have an integration test suite that can run against a Spark cluster. In CI this could be a standalone cluster with a single executor, but the same suite could also run against distributed clusters, which can exhibit different behavior than a single-executor setup.

We can configure the cluster to use off-heap memory.
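As a sketch, the relevant settings could live in the cluster's `spark-defaults.conf`. The off-heap properties are standard Spark configuration and the plugin/`spark.comet.*` keys come from the Comet docs; the size value is illustrative, not a recommendation:

```properties
# Run executors with off-heap memory (size is an illustrative value)
spark.memory.offHeap.enabled   true
spark.memory.offHeap.size      2g

# Load Comet (assumes the Comet jar is on the driver/executor classpath)
spark.plugins                  org.apache.spark.CometPlugin
spark.comet.enabled            true
spark.comet.exec.enabled       true
```

With this in place, the same test suite could be pointed at an on-heap or off-heap cluster just by swapping the defaults file.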

This integration suite could be implemented in PySpark or Scala.
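A minimal PySpark sketch of what one test in such a suite might look like. The helper names (`comet_confs`, `run_smoke_test`) and the master URL are hypothetical; it assumes a reachable standalone master with Comet on the classpath:

```python
# Sketch of one test in a PySpark-based integration suite.
# Assumes: a Spark standalone cluster is running and the Comet jar
# is available to the driver and executors.

def comet_confs(offheap_size="2g"):
    """Configs the suite would apply to every session.

    Property names follow the Spark and Comet documentation; the
    off-heap size is an illustrative default.
    """
    return {
        "spark.memory.offHeap.enabled": "true",
        "spark.memory.offHeap.size": offheap_size,
        "spark.plugins": "org.apache.spark.CometPlugin",
        "spark.comet.enabled": "true",
    }


def run_smoke_test(master="spark://localhost:7077"):
    """Connect to the cluster, run a trivial query, and check the result."""
    # Imported lazily so this module can be loaded without pyspark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.master(master).appName("comet-smoke")
    for key, value in comet_confs().items():
        builder = builder.config(key, value)
    spark = builder.getOrCreate()
    try:
        # A small aggregation that a Comet-enabled session should execute
        # natively; sum of 0..999 is 499500.
        rows = spark.range(1000).selectExpr("sum(id) AS total").collect()
        assert rows[0]["total"] == 499500, "unexpected aggregation result"
    finally:
        spark.stop()
```

In CI, a driver script would start the standalone cluster, call `run_smoke_test()` with the cluster's master URL, and tear the cluster down afterwards; the same entry point could be reused against a real distributed cluster.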

Describe the potential solution

No response

Additional context

No response

andygrove · Mar 12 '25 19:03