hive icon indicating copy to clipboard operation
hive copied to clipboard

HIVE-27791 Eliminate totalSize check from test

Open zratkai opened this issue 1 year ago • 12 comments

Change-Id: I92bf8cf7cbfe2515e2b62a35b6551c25852a8369

What changes were proposed in this pull request?

The totalSize checks are eliminated, since every ORC upgrade causes many tests to fail.

Why are the changes needed?

To avoid modification of tests in case of ORC upgrade.

Does this PR introduce any user-facing change?

Is the change a dependency upgrade?

How was this patch tested?

With unit and qtests.

zratkai avatar Dec 01 '23 12:12 zratkai

do we need to do this replace in all q files? Can't we add it in https://github.com/apache/hive/blob/master/itests/util/src/main/java/org/apache/hadoop/hive/ql/QOutProcessor.java

Agreed, we can refer to HIVE-19612 to mask totalSize for qtests. It is more simple and also can mask the new added qtests.

zhangbutao avatar Dec 04 '23 15:12 zhangbutao

@ayushtkn, @zhangbutao thanks for the review! Modified according to your suggestions. Please have a look on it!

zratkai avatar Dec 05 '23 16:12 zratkai

@ayushtkn, @zhangbutao thanks for the review! Modified according to your suggestions. Please have a look on it!

@zratkai Good! But http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-4913/2/tests shows that there are still several qtests that need to be fixed

zhangbutao avatar Dec 06 '23 02:12 zhangbutao

@ayushtkn Any other thoughts about this change?

zhangbutao avatar Dec 11 '23 13:12 zhangbutao

We are explicitily adding MASK_TOTAL_SIZE in every file, can we have it enabled always

  public static final String HDFS_MASK = "### HDFS PATH ###";
  public static final String HDFS_DATE_MASK = "### HDFS DATE ###";
  public static final String HDFS_USER_MASK = "### USER ###";
  public static final String HDFS_GROUP_MASK = "### GROUP ###";

The HDFS Path and all, get masked always, we need not specify for all qFiles, can't we do something similar.

Thanks @ayushtkn for suggestion, yes, this is the most direct way to mask totalSize. @zratkai I'm sorry I gave you an wrong reference(HIVE-19612) which is not a complete solution. Would you mind to refine this PR again? Thanks.

zhangbutao avatar Dec 13 '23 14:12 zhangbutao

@ayushtkn @zhangbutao I rewrote it as you recommended. 99% of file changes are q and q.out file changes, and only 3 java files . Could you please review it?

zratkai avatar Jan 10 '24 15:01 zratkai

I re-run the CI, find this test always failed http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-4913/14/tests Could you please take a look? @zratkai

zhangbutao avatar Jan 11 '24 04:01 zhangbutao

@zhangbutao thanks for taking care of my ticket! I checked and fixed failing test. I hope there won't be any new.

zratkai avatar Jan 11 '24 10:01 zratkai

Quality Gate Passed Quality Gate passed

Kudos, no new issues were introduced!

0 New issues
0 Security Hotspots
No data about Coverage
No data about Duplication

See analysis details on SonarCloud

sonarqubecloud[bot] avatar Feb 07 '24 17:02 sonarqubecloud[bot]

@abstractdog could you please review?

zratkai avatar Feb 08 '24 10:02 zratkai

We are explicitily adding MASK_TOTAL_SIZE in every file, can we have it enabled always

  public static final String HDFS_MASK = "### HDFS PATH ###";
  public static final String HDFS_DATE_MASK = "### HDFS DATE ###";
  public static final String HDFS_USER_MASK = "### USER ###";
  public static final String HDFS_GROUP_MASK = "### GROUP ###";

The HDFS Path and all, get masked always, we need not specify for all qFiles, can't we do something similar.

as far as I can see the masking directive has been removed from q files, so I'm suspecting this comment can be resolved, please confirm @ayushtkn

abstractdog avatar Feb 12 '24 08:02 abstractdog

we're supposed to eliminate the one-by-one masking directives and handle this from the QOutProcessor, so this is supposed to return empty right?

grep -iRH "qt:replace.*totalSize" --include="*.q"
iceberg/iceberg-handler/src/test/queries/positive/row_count.q:--! qt:replace:/(\s+totalSize\s+)\S+(\s*)/$1#Masked#/
ql/src/test/queries/clientpositive/compaction_query_based_insert_only_clustered.q:--! qt:replace:/totalSize=(\d+)/#Masked#/
ql/src/test/queries/clientpositive/materialized_view_create_rewrite_dummy.q:--! qt:replace:/(totalSize\s+)(\S+|\s+|.+)/$1#Masked#/
ql/src/test/queries/clientpositive/materialized_view_create_rewrite_time_window.q:--! qt:replace:/(totalSize\s+)(\S+|\s+|.+)/$1#Masked#/
ql/src/test/queries/clientpositive/compaction_query_based_partitioned_minor.q:--! qt:replace:/totalSize=(\d+)/#Masked#/
ql/src/test/queries/clientpositive/compaction_query_based_insert_only_partitioned_clustered.q:--! qt:replace:/totalSize=(\d+)/#Masked#/
ql/src/test/queries/clientpositive/compaction_query_based_clustered_minor.q:--! qt:replace:/totalSize=(\d+)/#Masked#/
ql/src/test/queries/clientpositive/compaction_query_based_insert_only.q:--! qt:replace:/totalSize=(\d+)/#Masked#/
ql/src/test/queries/clientpositive/materialized_view_create_rewrite_4.q:--! qt:replace:/(totalSize\s+)(\S+|\s+|.+)/$1#Masked#/
ql/src/test/queries/clientpositive/compaction_query_based_minor.q:--! qt:replace:/totalSize=(\d+)/#Masked#/
ql/src/test/queries/clientpositive/compaction_query_based.q:--! qt:replace:/totalSize=(\d+)/#Masked#/
ql/src/test/queries/clientpositive/compaction_query_based_insert_only_partitioned_clustered_minor.q:--! qt:replace:/totalSize=(\d+)/#Masked#/
ql/src/test/queries/clientpositive/acid_no_buckets.q:--! qt:replace:/(totalSize\s+)(\S+|\s+|.+)/$1#Masked#/
ql/src/test/queries/clientpositive/compaction_query_based_partitioned.q:--! qt:replace:/totalSize=(\d+)/#Masked#/
ql/src/test/queries/clientpositive/materialized_view_create_rewrite_multi_db.q:--! qt:replace:/(totalSize\s+)(\S+|\s+|.+)/$1#Masked#/
ql/src/test/queries/clientpositive/compaction_query_based_clustered.q:--! qt:replace:/totalSize=(\d+)/#Masked#/
ql/src/test/queries/clientpositive/materialized_view_create_rewrite_time_window_2.q:--! qt:replace:/(totalSize\s+)(\S+|\s+|.+)/$1#Masked#/
ql/src/test/queries/clientpositive/default_constraint.q:--! qt:replace:/(totalSize\s+)(\S+|\s+|.+)/$1#Masked#/
ql/src/test/queries/clientpositive/materialized_view_create_rewrite.q:--! qt:replace:/(totalSize\s+)(\S+|\s+|.+)/$1#Masked#/
ql/src/test/queries/clientpositive/smb_mapjoin_1.q:--! qt:replace:/(totalSize\s+)(\S+|\s+|.+)/$1#Masked#/
ql/src/test/queries/clientpositive/compaction_query_based_insert_only_clustered_minor.q:--! qt:replace:/totalSize=(\d+)/#Masked#/
ql/src/test/queries/clientpositive/compaction_query_based_insert_only_partitioned_minor.q:--! qt:replace:/totalSize=(\d+)/#Masked#/
ql/src/test/queries/clientpositive/materialized_view_create_rewrite_by_text_multi_db.q:--! qt:replace:/(totalSize\s+)(\S+|\s+|.+)/$1#Masked#/
ql/src/test/queries/clientpositive/compaction_query_based_insert_only_partitioned.q:--! qt:replace:/totalSize=(\d+)/#Masked#/
ql/src/test/queries/clientpositive/deleteAnalyze.q:--! qt:replace:/(totalSize\s+)(\S+|\s+|.+)/$1#Masked#/
ql/src/test/queries/clientpositive/compaction_query_based_insert_only_minor.q:--! qt:replace:/totalSize=(\d+)/#Masked#/

abstractdog avatar Feb 12 '24 10:02 abstractdog

Quality Gate Passed Quality Gate passed

Issues
2 New issues
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
No data about Duplication

See analysis details on SonarCloud

sonarqubecloud[bot] avatar Apr 22 '24 10:04 sonarqubecloud[bot]

Any further fixs & coments here? https://github.com/apache/hive/pull/5218 wants to upgrade ORC version, but hits the totalSize issue again. So i think we can continue this PR.

zhangbutao avatar Apr 29 '24 12:04 zhangbutao

@zhangbutao I am on it, the test env were instable, and spent a lot of time on it, to rerun, and fix some issue, but still not green.

zratkai avatar Apr 29 '24 12:04 zratkai