delta icon indicating copy to clipboard operation
delta copied to clipboard

[BUG] ActionSerializerSuite has 3 failing unit tests in the master branch

Open ganeshchand opened this issue 3 years ago • 5 comments

Bug

Describe the problem

Following unit tests fail:

  • Metadata (with all defaults)
  • Metadata - json serialization/deserialization
  • Metadata with empty createdTime- json serialization/deserialization

Steps to reproduce

git checkout master
build/sbt "testOnly org.apache.spark.sql.delta.ActionSerializerSuite"
 Metadata (with all defaults) - json serialization/deserialization *** FAILED *** (623 milliseconds)
[info]   "{"metaData":{"id":"[10daabbc-d843-4fe4-abe5-0ccb83ba681e]","format":{"provide..." did not equal "{"metaData":{"id":"[testId]","format":{"provide..." (ActionSerializerSuite.scala:344)
[info]   Analysis:
[info]   "{"metaData":{"id":"[10daabbc-d843-4fe4-abe5-0ccb83ba681e]","format":{"provide..." -> "{"metaData":{"id":"[testId]","format":{"provide..."

============================================================
[info] Tests: succeeded 25, failed 3, canceled 0, ignored 0, pending 0
[info] *** 3 TESTS FAILED ***
[error] Failed tests:
[error]         org.apache.spark.sql.delta.ActionSerializerSuite
[error] (core / Test / testOnly) sbt.TestsFailedException: Tests unsuccessful
[error] Total time: 24 s, completed Jul 21, 2022 9:56:50 AM
➜  delta git:(master)$ build/sbt "testOnly org.apache.spark.sql.delta.ActionSerializerSuite"

Observed results

Expected results

Further details

Environment information

  • Delta Lake version:
  • Spark version:
  • Scala version:

Willingness to contribute

The Delta Lake Community encourages bug fix contributions. Would you or another member of your organization be willing to contribute a fix for this bug to the Delta Lake code base?

  • [ ] Yes. I can contribute a fix for this bug independently.
  • [ ] Yes. I would be willing to contribute a fix for this bug with guidance from the Delta Lake community.
  • [x] No. I cannot contribute a bug fix at this time.

ganeshchand avatar Jul 21 '22 17:07 ganeshchand

Hmm . is there any way we can set the value for commit id before commit ?. From what i am able to gather it's failing because id is not matching but i think id is random value (most probably SHA-1) So, if we can forcefully set it's value only then we can write a test for it .

sherlockbeard avatar Jul 21 '22 21:07 sherlockbeard

The id for metadata is build taking Utils.isTesting into account. Check this: https://github.com/delta-io/delta/blob/2871e550eabd9e4956aca1b02baeb20635a66931/core/src/main/scala/org/apache/spark/sql/delta/actions/actions.scala#L439

The isTesting is basically this: System.getenv("SPARK_TESTING") != null || System.getProperty(IS_TESTING.key) != null

With no extra environmental variables, the 3 tests are failing exactly as you described. If I execute export SPARK_TESTING=true and then I run the tests, they all pass.

@ganeshchand should we close this?

bogdanmrusu avatar Jul 26 '22 15:07 bogdanmrusu

@bogdanmrusu do we know why running on GitHub Actions doesn't hit the issue?

zsxwing avatar Jul 26 '22 16:07 zsxwing

Hey @zsxwing, I checked the scripts that are executed when we run the delta scripts and I couldn't see any env variables being set up. I am still ramping up and not sure where else to look. Would you advise trying to look somewhere else?

bogdanmrusu avatar Jul 28 '22 11:07 bogdanmrusu

Just took a look. The testing flag is set in beforeAll: https://github.com/apache/spark/blob/v3.3.0/core/src/test/scala/org/apache/spark/SparkFunSuite.scala#L86 However, the Metadata object is created before beforeAll here: https://github.com/delta-io/delta/blob/e65496665fd1b187b488ab77f79986a8b381ee16/core/src/test/scala/org/apache/spark/sql/delta/ActionSerializerSuite.scala#L218

That's why it fails.

It doesn't fail on GitHub Actions because GitHub Actions run all of test suites. If there is any test suite running before ActionSerializerSuite, the testing flag will be set before running ActionSerializerSuite. That's why we didn't notice this issue.

The fix would be just making Metadata creation lazy so that it will be created after beforeAll. Is Anyone interested in trying to fix it?

zsxwing avatar Jul 29 '22 21:07 zsxwing

@zsxwing I can try to fix it. Please assign it to me.

bogdanmrusu avatar Aug 01 '22 06:08 bogdanmrusu

@bogdanmrusu Cool. Assigned it to you.

zsxwing avatar Aug 02 '22 17:08 zsxwing

Running into this problem too. Using lazy val did not work for me.

scottsand-db avatar Aug 04 '22 22:08 scottsand-db

@scottsand-db could you review all of Metadata creation and make sure they are created in test body rather than outside?

zsxwing avatar Aug 04 '22 22:08 zsxwing