datafusion-comet
Create integration tests that can run against a Spark cluster
What is the problem the feature request solves?
We recommend using off-heap memory when running Comet, but as far as I know all of our unit tests and the Spark SQL tests run with on-heap memory. These tests also run in local mode, which is not how users will run the product.
I would like us to have an integration test suite that can run against a Spark cluster. This could be a standalone cluster with one executor in CI, but it would also allow us to run the suite against distributed clusters, which can sometimes exhibit different behavior than a single executor.
We can configure the cluster to use off-heap memory.
This integration suite could be implemented in PySpark or Scala.
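As a rough illustration of the PySpark option, a suite like this might centralize the off-heap and Comet settings in one place and build a session against the cluster's master URL. This is only a sketch: the master URL, off-heap size, and helper names are placeholders, and the Comet config keys (`spark.plugins`, `spark.comet.enabled`, `spark.comet.exec.enabled`) are the ones documented for enabling the plugin, not a prescribed test design.

```python
# Hypothetical configuration for an integration test run against a
# standalone cluster with off-heap memory enabled. Master URL and
# off-heap size below are placeholders, not recommended values.
COMET_CONFS = {
    # Standard Spark off-heap memory settings
    "spark.memory.offHeap.enabled": "true",
    "spark.memory.offHeap.size": "2g",
    # Comet plugin settings (per the Comet documentation)
    "spark.plugins": "org.apache.spark.CometPlugin",
    "spark.comet.enabled": "true",
    "spark.comet.exec.enabled": "true",
}

def build_session(master="spark://host:7077"):
    """Build a SparkSession pointed at a cluster, with Comet and
    off-heap memory enabled via COMET_CONFS."""
    from pyspark.sql import SparkSession  # imported lazily; pyspark assumed installed
    builder = SparkSession.builder.master(master).appName("comet-integration")
    for key, value in COMET_CONFS.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()
```

In CI this could point at a one-executor standalone cluster, while the same `COMET_CONFS` dict would be reused unchanged when running the suite against a larger distributed cluster.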
Describe the potential solution
No response
Additional context
No response