spark-rapids

Fix test failures in csv_test.py

Open razajafri opened this issue 1 year ago • 3 comments

FAILED ../../../../integration_tests/src/main/python/csv_test.py::test_basic_csv_read
FAILED ../../../../integration_tests/src/main/python/csv_test.py::test_csv_fallback
FAILED ../../../../integration_tests/src/main/python/csv_test.py::test_csv_read_count
FAILED ../../../../integration_tests/src/main/python/csv_test.py::test_csv_read_small_floats
FAILED ../../../../integration_tests/src/main/python/csv_test.py::test_date_formats_round_trip
FAILED ../../../../integration_tests/src/main/python/csv_test.py::test_input_meta
FAILED ../../../../integration_tests/src/main/python/csv_test.py::test_read_case_col_name
FAILED ../../../../integration_tests/src/main/python/csv_test.py::test_read_valid_and_invalid_dates
FAILED ../../../../integration_tests/src/main/python/csv_test.py::test_round_trip
FAILED ../../../../integration_tests/src/main/python/csv_test.py::test_round_trip_for_interval
FAILED ../../../../integration_tests/src/main/python/csv_test.py::test_ts_formats_round_trip

razajafri Jun 08 '24 05:06

The following tests fail with a missing read metric:

Caused by: java.util.NoSuchElementException: key not found: bufferTime
        at scala.collection.immutable.Map$EmptyMap$.apply(Map.scala:243)
        at scala.collection.immutable.Map$EmptyMap$.apply(Map.scala:239)
        at com.nvidia.spark.rapids.GpuTextBasedPartitionReader.readToTable(GpuTextBasedPartitionReader.scala:466)

  1. test_basic_csv_read
  2. test_csv_read_small_floats
  3. test_date_formats_round_trip
  4. test_input_meta
  5. test_read_case_col_name
  6. test_read_valid_and_invalid_dates
  7. test_round_trip
  8. test_round_trip_for_interval
  9. test_ts_formats_round_trip
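
The trace above shows `Map$EmptyMap$.apply` throwing, which suggests the `bufferTime` metric was looked up in an empty metrics map. As a rough Python analogue only (the plugin code is Scala, and every name below is hypothetical rather than taken from spark-rapids), a strict lookup fails on a missing key while a lenient one falls back to a no-op metric:

```python
class TimerMetric:
    """Accumulates elapsed nanoseconds for one named metric."""
    def __init__(self):
        self.total = 0

    def add(self, nanos):
        self.total += nanos


class NoopMetric:
    """Stand-in metric that accepts updates and discards them."""
    def add(self, nanos):
        pass


def record_strict(metrics, name, nanos):
    # Analogue of Scala's Map.apply: raises if the key is absent
    # (KeyError here, NoSuchElementException in Scala).
    metrics[name].add(nanos)


def record_lenient(metrics, name, nanos):
    # Analogue of a lookup with a default: a missing metric is ignored.
    metrics.get(name, NoopMetric()).add(nanos)
```

Whether the right fix is to register `bufferTime` for this reader or to tolerate its absence is not settled in this thread; the sketch only illustrates the failure mode.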

At least one test failed with an unexpected Spark exception message:

E       AssertionError: Expected error 'DateTimeException' did not appear in 'pyspark.errors.exceptions.captured.SparkUpgradeException: [INCONSISTENT_BEHAVIOR_CROSS_VERSION.PARSE_DATETIME_BY_NEW_PARSER] You may get a different result due to the upgrading to Spark >= 3.0:
E       Fail to parse '2020-50-16' in the new parser.
E       You can set "spark.sql.legacy.timeParserPolicy" to "LEGACY" to restore the behavior before Spark 3.0, or set to "CORRECTED" and treat it as an invalid datetime string. SQLSTATE: 42K0B'

  1. test_read_valid_and_invalid_dates
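
The assertion above is a substring check: the test expects `DateTimeException` somewhere in the error text, but the captured error is a `SparkUpgradeException`. A minimal sketch of that check, assuming a hypothetical helper `error_matches` (this is not the integration tests' real helper) that can accept more than one fragment:

```python
def error_matches(expected_fragments, actual_message):
    """Return True if any expected fragment occurs in the error message."""
    return any(fragment in actual_message for fragment in expected_fragments)


# Message text abbreviated from the failure quoted above.
actual = (
    "pyspark.errors.exceptions.captured.SparkUpgradeException: "
    "[INCONSISTENT_BEHAVIOR_CROSS_VERSION.PARSE_DATETIME_BY_NEW_PARSER] "
    "Fail to parse '2020-50-16' in the new parser."
)

# The single-fragment check fails, matching the AssertionError above.
only_old = error_matches(["DateTimeException"], actual)

# Accepting either exception name would pass on this message.
either = error_matches(["DateTimeException", "SparkUpgradeException"], actual)
```

Whether the test should accept `SparkUpgradeException` for this Spark version, or whether the parser policy should change so that `DateTimeException` is raised, is a separate decision that this sketch does not make.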

mythrocks Jun 13 '24 00:06

I've taken myself off as the assignee. Beyond the triage, I'm not sure I'll be working on this in the short term.

mythrocks Jul 18 '24 22:07

After setting ANSI mode to false, the following tests fail:

test_basic_csv_read (key not found: bufferTime)
test_csv_read_small_floats (key not found: bufferTime)
test_date_formats_round_trip (key not found: bufferTime)
test_input_meta (key not found: bufferTime)
test_read_case_col_name (key not found: bufferTime)
test_read_valid_and_invalid_dates (Correct Exception not thrown)
test_round_trip (key not found: bufferTime)
test_round_trip_for_interval (key not found: bufferTime)
test_ts_formats_round_trip (key not found: bufferTime)
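
For anyone reproducing this, ANSI mode is controlled by the Spark SQL setting `spark.sql.ansi.enabled`. A small sketch, assuming the tests take their Spark configuration as a plain dict (the helper name `with_ansi_disabled` is hypothetical):

```python
ANSI_KEY = "spark.sql.ansi.enabled"  # Spark SQL configuration key


def with_ansi_disabled(conf=None):
    """Return a copy of conf with ANSI mode turned off."""
    merged = dict(conf or {})  # copy so the caller's dict is not mutated
    merged[ANSI_KEY] = "false"
    return merged


base_conf = {"spark.sql.legacy.timeParserPolicy": "CORRECTED"}
test_conf = with_ansi_disabled(base_conf)
```

The `timeParserPolicy` entry is only there to show that existing settings are preserved; it is not a recommended value for these tests.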

razajafri Aug 14 '24 22:08