
Build: Fix & Run spark integration tests on CI

Open · nastra opened this issue 3 years ago · 1 comment

While looking at https://github.com/apache/iceberg/issues/5791 I noticed that the Spark integration tests (the integrationTest task) were never part of CI. On CI we're running the check task (which basically runs test), so this PR makes sure that integrationTest is part of the check task.
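For context, the wiring described above can be sketched in a Gradle (Groovy) build script. This is a hypothetical illustration, not the actual Iceberg build code: the `integration` source set name and task registration details are assumptions, while the `integrationTest` and `check` task names come from the PR description.

```groovy
// Hypothetical sketch: register an integrationTest task and hook it
// into the check lifecycle task. The source set name 'integration'
// is an assumption; the real Iceberg build may differ.
tasks.register('integrationTest', Test) {
    description = 'Runs the Spark integration tests.'
    testClassesDirs = sourceSets.integration.output.classesDirs
    classpath = sourceSets.integration.runtimeClasspath
}

// Because CI runs `./gradlew check`, making check depend on
// integrationTest ensures the integration tests run on CI too.
tasks.named('check') {
    dependsOn tasks.named('integrationTest')
}
```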

fixes #5791

nastra avatar Sep 21 '22 16:09 nastra

CI should currently fail with the failures below, since the issue described in https://github.com/apache/iceberg/issues/5791 is still present:

org.apache.iceberg.spark.SmokeTest > testAlterTable[catalogName = testhive, implementation = org.apache.iceberg.spark.SparkCatalog, config = {type=hive, default-namespace=default}] FAILED
    java.lang.NoClassDefFoundError at SmokeTest.java:93
        Caused by: java.lang.ClassNotFoundException at SmokeTest.java:93

org.apache.iceberg.spark.SmokeTest > testGettingStarted[catalogName = testhive, implementation = org.apache.iceberg.spark.SparkCatalog, config = {type=hive, default-namespace=default}] FAILED
    java.lang.NoClassDefFoundError at SmokeTest.java:67
        Caused by: java.lang.ClassNotFoundException at SmokeTest.java:67

org.apache.iceberg.spark.SmokeTest > testAlterTable[catalogName = testhadoop, implementation = org.apache.iceberg.spark.SparkCatalog, config = {type=hadoop}] FAILED
    java.lang.NoClassDefFoundError at SmokeTest.java:93

org.apache.iceberg.spark.SmokeTest > testGettingStarted[catalogName = testhadoop, implementation = org.apache.iceberg.spark.SparkCatalog, config = {type=hadoop}] FAILED
    java.lang.NoClassDefFoundError at SmokeTest.java:67

org.apache.iceberg.spark.SmokeTest > testAlterTable[catalogName = spark_catalog, implementation = org.apache.iceberg.spark.SparkSessionCatalog, config = {type=hive, default-namespace=default, parquet-enabled=true, cache-enabled=false}] FAILED
    java.lang.NoClassDefFoundError at SmokeTest.java:93

org.apache.iceberg.spark.SmokeTest > testGettingStarted[catalogName = spark_catalog, implementation = org.apache.iceberg.spark.SparkSessionCatalog, config = {type=hive, default-namespace=default, parquet-enabled=true, cache-enabled=false}] FAILED
    java.lang.NoClassDefFoundError at SmokeTest.java:67
[shutdown-hook-0] INFO org.apache.spark.util.ShutdownHookManager - Shutdown hook called
[shutdown-hook-0] INFO org.apache.spark.util.ShutdownHookManager - Deleting directory /tmp/spark-8734a907-a0d1-4022-88d4-454d931f55d8

9 tests completed, 6 failed

nastra avatar Sep 21 '22 16:09 nastra

Rather than running these along with unit tests, what about adding a configuration to run them separately? Would that be annoying to have so many checks? It would hopefully make CI faster by running in parallel.

rdblue avatar Sep 26 '22 16:09 rdblue

> Rather than running these along with unit tests, what about adding a configuration to run them separately? Would that be annoying to have so many checks? It would hopefully make CI faster by running in parallel.

That might make sense as well, and I would defer to @aokolnychyi / @RussellSpitzer here. However, the current integration tests that we have are fairly lightweight and take about 10 seconds to run.

nastra avatar Sep 27 '22 07:09 nastra

10 seconds sounds reasonable to me. No need to move them out, then.

rdblue avatar Sep 29 '22 20:09 rdblue