hive icon indicating copy to clipboard operation
hive copied to clipboard

fix: Ensure consistent test behavior with NonDex by resolving nondeterministic field order

Open RITIKHARIANI opened this issue 11 months ago • 5 comments

What changes were proposed in this pull request?

This pull request addresses non-deterministic behavior in the TestLazyBinaryColumnarSerDe#testLazyBinaryColumnarSerDeWithEmptyBinary test and other related tests. The root cause of the intermittent failures was identified in ObjectInspectorUtils.java, where the method getDeclaredFields() was used. The getDeclaredFields() method returns fields in an unordered manner, which can cause inconsistent results across different JVM versions and vendors. This PR resolves the issue by sorting the fields retrieved by getDeclaredFields() to ensure consistent behavior across JVMs.

Why are the changes needed?

The getDeclaredFields() method does not guarantee a specific order in the returned fields, which can vary depending on the JVM version and vendor. This non-deterministic behavior caused test cases, such as TestLazyBinaryColumnarSerDe#testLazyBinaryColumnarSerDeWithEmptyBinary, to fail intermittently. By sorting the fields, we ensure deterministic behavior, thus resolving the root cause of these test failures.

Does this PR introduce any user-facing change?

No, this PR does not introduce any user-facing changes. The changes are internal, aimed at stabilizing the behavior of the code and ensuring that tests pass consistently across environments.

Is the change a dependency upgrade?

No, this PR does not involve any dependency upgrades. It focuses solely on addressing the non-deterministic behavior in the current codebase.

How was this patch tested?

The patch was tested by running the affected test cases, particularly TestLazyBinaryColumnarSerDe#testLazyBinaryColumnarSerDeWithEmptyBinary, using the NonDex tool. This tool helps detect and debug issues with under-determined Java APIs. The sorting of fields was confirmed to prevent non-deterministic behavior, and the tests were verified to pass consistently.

To replicate the testing steps:

  1. Clone the repository and run mvn install.
  2. Navigate to the hive/serde directory.
  3. Run the following command to test the patch:
cd hive/serde
mvn edu.illinois:nondex-maven-plugin:2.1.7:nondex -Dtest=TestLazyBinaryColumnarSerDe#testLazyBinaryColumnarSerDeWithEmptyBinary -DnondexMode=FULL -DnondexRuns=10

RITIKHARIANI avatar Apr 11 '25 18:04 RITIKHARIANI

I would consider running a flaky test Jenkins job for determining whether the fix will resolve the issue. Please consider using this Jenkins job - https://ci.hive.apache.org/blue/organizations/jenkins/hive-flaky-check/activity

@ayushtkn @zabetak @deniskuzZ I dont seem to have access to retrigger / run this flaky test job as well. Is there a way I can get access to it.

SourabhBadhya avatar Apr 12 '25 04:04 SourabhBadhya

Hey @SourabhBadhya , I just gave you access to trigger jobs in hiveci. Currently, permissions are granted per request. For more info please see https://issues.apache.org/jira/browse/HIVE-28641

zabetak avatar Apr 14 '25 05:04 zabetak

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Feel free to reach out on the [email protected] list if the patch is in need of reviews.

github-actions[bot] avatar Jun 14 '25 00:06 github-actions[bot]

re-triggered the build

deniskuzZ avatar Jun 14 '25 14:06 deniskuzZ

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Feel free to reach out on the [email protected] list if the patch is in need of reviews.

github-actions[bot] avatar Aug 14 '25 00:08 github-actions[bot]