[CORE] Change domain name to org.apache.gluten

Open yma11 opened this issue 10 months ago • 16 comments

What changes were proposed in this pull request?

Change the Gluten package (domain) name to org.apache.gluten. Users should note that the plugin configuration is affected and must reference the new class name:

--conf spark.plugins=org.apache.gluten.GlutenPlugin
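For anyone updating existing job scripts, here is a minimal sketch of rewriting `--conf` arguments to the new package name. It assumes the previous prefix was `io.glutenproject.` (this thread does not state the old name), and the helper names `migrate_conf_value` and `migrate_spark_args` are hypothetical, not part of Gluten or Spark.

```python
# Assumption: the pre-rename package prefix; it is not stated in this thread.
OLD_PREFIX = "io.glutenproject."
# New prefix introduced by this pull request.
NEW_PREFIX = "org.apache.gluten."


def migrate_conf_value(value: str) -> str:
    """Rewrite a comma-separated list of class names to the new prefix."""
    classes = [c.strip() for c in value.split(",") if c.strip()]
    return ",".join(
        NEW_PREFIX + c[len(OLD_PREFIX):] if c.startswith(OLD_PREFIX) else c
        for c in classes
    )


def migrate_spark_args(args):
    """Rewrite only the `key=value` arguments that directly follow `--conf`."""
    out = []
    prev = None
    for arg in args:
        if prev == "--conf" and "=" in arg:
            key, value = arg.split("=", 1)
            arg = f"{key}={migrate_conf_value(value)}"
        out.append(arg)
        prev = arg
    return out


if __name__ == "__main__":
    old_args = ["--conf", "spark.plugins=io.glutenproject.GlutenPlugin"]
    print(" ".join(migrate_spark_args(old_args)))
```

Class names that do not start with the assumed old prefix are passed through unchanged, so other plugins listed in `spark.plugins` are not touched.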

How was this patch tested?

CI

yma11 avatar Mar 28 '24 11:03 yma11

Thanks for opening a pull request!

Could you open an issue for this pull request on GitHub Issues?

https://github.com/apache/incubator-gluten/issues

Then could you also rename commit message and pull request title in the following format?

[GLUTEN-${ISSUES_ID}][COMPONENT]feat/fix: ${detailed message}

See also:

github-actions[bot] avatar Mar 28 '24 11:03 github-actions[bot]

Run Gluten Clickhouse CI

github-actions[bot] avatar Mar 28 '24 11:03 github-actions[bot]

Run Gluten Clickhouse CI

github-actions[bot] avatar Mar 28 '24 11:03 github-actions[bot]

Run Gluten Clickhouse CI

github-actions[bot] avatar Mar 28 '24 13:03 github-actions[bot]

Run Gluten Clickhouse CI

github-actions[bot] avatar Mar 28 '24 14:03 github-actions[bot]

Run Gluten Clickhouse CI

github-actions[bot] avatar Mar 29 '24 01:03 github-actions[bot]

Run Gluten Clickhouse CI

github-actions[bot] avatar Mar 29 '24 05:03 github-actions[bot]

Run Gluten Clickhouse CI

github-actions[bot] avatar Mar 29 '24 05:03 github-actions[bot]

Run Gluten Clickhouse CI

github-actions[bot] avatar Mar 29 '24 06:03 github-actions[bot]

Run Gluten Clickhouse CI

github-actions[bot] avatar Mar 29 '24 06:03 github-actions[bot]

Run Gluten Clickhouse CI

github-actions[bot] avatar Mar 29 '24 07:03 github-actions[bot]

Run Gluten Clickhouse CI

github-actions[bot] avatar Mar 29 '24 08:03 github-actions[bot]

Run Gluten Clickhouse CI

github-actions[bot] avatar Mar 29 '24 10:03 github-actions[bot]

Hi @zzcclp, could you please help check why CH CI fails? Thanks!

PHILO-HE avatar Mar 29 '24 12:03 PHILO-HE

@zzcclp @baibaichen Here's the error log. Are there any hard-coded function signatures in use?

20:03:21  Caused by: org.apache.gluten.exception.GlutenException: Unknown storage policy `__hdfs_main`
20:08:35  Caused by: org.apache.gluten.exception.GlutenException: Unexpected value for start: while executing 'FUNCTION throwIf(equals(CAST(n_regionkey,Nullable(I_2),0_4) : 4, Unexpected value for start_5 :: 7) -> throwIf(equals(CAST(n_regionkey,Nullable(I_2),0_4),Unexpected value for start_5) Nullable(UInt8) : 5'
20:08:35  0. /home/jenkins/agent/workspace/gluten/gluten-ci/gluten/cpp-ch/build/../ClickHouse/src/Common/Exception.cpp:96: DB::Exception::Exception(DB::Exception::MessageMasked&&, int, bool) @ 0x000000000b1546bb
20:08:35  1. DB::Exception::createRuntime(int, String const&) @ 0x000000000a4cec2d
20:08:35  2. DB::(anonymous namespace)::FunctionThrowIf::executeImpl(std::vector<DB::ColumnWithTypeAndName, std::allocator<DB::ColumnWithTypeAndName>> const&, std::shared_ptr<DB::IDataType const> const&, unsigned long) const @ 0x000000000a4cda59
20:08:35  3. DB::FunctionToExecutableFunctionAdaptor::executeImpl(std::vector<DB::ColumnWithTypeAndName, std::allocator<DB::ColumnWithTypeAndName>> const&, std::shared_ptr<DB::IDataType const> const&, unsigned long) const @ 0x0000000005d2cdee
20:08:35  4. /home/jenkins/agent/workspace/gluten/gluten-ci/gluten/cpp-ch/build/../ClickHouse/src/Functions/IFunction.cpp:0: DB::IExecutableFunction::executeWithoutLowCardinalityColumns(std::vector<DB::ColumnWithTypeAndName, std::allocator<DB::ColumnWithTypeAndName>> const&, std::shared_ptr<DB::IDataType const> const&, unsigned long, bool) const @ 0x000000000d96c468
20:08:35  5. /home/jenkins/agent/workspace/gluten/gluten-ci/gluten/cpp-ch/build/../ClickHouse/src/Functions/IFunction.cpp:0: DB::IExecutableFunction::executeWithoutLowCardinalityColumns(std::vector<DB::ColumnWithTypeAndName, std::allocator<DB::ColumnWithTypeAndName>> const&, std::shared_ptr<DB::IDataType const> const&, unsigned long, bool) const @ 0x000000000d96c6cd
20:08:35  6. ../ClickHouse/contrib/boost/boost/smart_ptr/intrusive_ptr.hpp:117: DB::IExecutableFunction::executeWithoutSparseColumns(std::vector<DB::ColumnWithTypeAndName, std::allocator<DB::ColumnWithTypeAndName>> const&, std::shared_ptr<DB::IDataType const> const&, unsigned long, bool) const @ 0x000000000d96d122
20:08:35  7. ../ClickHouse/contrib/llvm-project/libcxx/include/vector:434: DB::IExecutableFunction::execute(std::vector<DB::ColumnWithTypeAndName, std::allocator<DB::ColumnWithTypeAndName>> const&, std::shared_ptr<DB::IDataType const> const&, unsigned long, bool) const @ 0x000000000d96e6fb
20:08:35  8. ../ClickHouse/contrib/boost/boost/smart_ptr/intrusive_ptr.hpp:117: DB::ExpressionActions::execute(DB::Block&, unsigned long&, bool, bool) const @ 0x000000000e380828
20:08:35  9. ../ClickHouse/contrib/llvm-project/libcxx/include/vector:537: DB::ExpressionTransform::transform(DB::Chunk&) @ 0x000000000f63e896
20:08:35  10. ../ClickHouse/contrib/llvm-project/libcxx/include/__utility/swap.h:36: DB::ISimpleTransform::work() @ 0x000000000f5530ee
20:08:35  11. /home/jenkins/agent/workspace/gluten/gluten-ci/gluten/cpp-ch/build/../ClickHouse/src/Processors/Executors/ExecutionThreadContext.cpp:0: DB::ExecutionThreadContext::executeTask() @ 0x000000000f570bc4
20:08:35  12. /home/jenkins/agent/workspace/gluten/gluten-ci/gluten/cpp-ch/build/../ClickHouse/src/Processors/Executors/PipelineExecutor.cpp:273: DB::PipelineExecutor::executeStepImpl(unsigned long, std::atomic<bool>*) @ 0x000000000f565930
20:08:35  13. ../ClickHouse/contrib/llvm-project/libcxx/include/atomic:958: DB::PipelineExecutor::executeStep(std::atomic<bool>*) @ 0x000000000f565348
20:08:35  14. /home/jenkins/agent/workspace/gluten/gluten-ci/gluten/cpp-ch/build/../ClickHouse/src/Processors/Executors/PullingPipelineExecutor.cpp:54: DB::PullingPipelineExecutor::pull(DB::Chunk&) @ 0x000000000f575397
20:08:35  15. /home/jenkins/agent/workspace/gluten/gluten-ci/gluten/cpp-ch/build/../ClickHouse/src/Processors/Executors/PullingPipelineExecutor.cpp:65: DB::PullingPipelineExecutor::pull(DB::Block&) @ 0x000000000f575553
20:08:35  16. /home/jenkins/agent/workspace/gluten/gluten-ci/gluten/cpp-ch/build/../local-engine/Parser/SerializedPlanParser.cpp:0: local_engine::LocalExecutor::hasNext() @ 0x000000000b4bbe94
20:08:35  17. /home/jenkins/agent/workspace/gluten/gluten-ci/gluten/cpp-ch/build/../local-engine/local_engine_jni.cpp:333: Java_org_apache_gluten_vectorized_BatchIterator_nativeHasNext @ 0x0000000005be47f7
20:08:35  
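As background for the question about hard-coded signatures: the last frame above, `Java_org_apache_gluten_vectorized_BatchIterator_nativeHasNext`, follows the standard JNI naming rule, in which the exported native symbol embeds the Java package name. The sketch below reproduces that rule to show why a package rename changes every such symbol. It is illustrative only and does not claim to explain this CI failure; `jni_mangle` and `jni_symbol` are hypothetical helper names, and overloaded-method signatures are not handled.

```python
def jni_mangle(name: str) -> str:
    """Apply the JNI escape rules to a class or method name."""
    out = []
    for ch in name:
        if ch in "./":
            out.append("_")      # package separator
        elif ch == "_":
            out.append("_1")     # literal underscore
        elif ch == ";":
            out.append("_2")
        elif ch == "[":
            out.append("_3")
        elif ch.isascii() and ch.isalnum():
            out.append(ch)
        else:
            # Other characters become _0xxxx (simplified: BMP code points only).
            out.append("_0%04x" % ord(ch))
    return "".join(out)


def jni_symbol(class_name: str, method_name: str) -> str:
    """Build the short (non-overloaded) JNI symbol for a native method."""
    return "Java_" + jni_mangle(class_name) + "_" + jni_mangle(method_name)


if __name__ == "__main__":
    print(jni_symbol("org.apache.gluten.vectorized.BatchIterator", "nativeHasNext"))
```

Under this rule, any native function still exported under a previous package name would no longer match the renamed Java class.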

zhouyuan avatar Mar 29 '24 13:03 zhouyuan

The following error messages are also shown:

[2024-03-29T12:02:49.115Z] Caused by: org.apache.gluten.exception.GlutenException: Unknown storage policy `__s3_main`
[2024-03-29T12:02:49.115Z] 0. /home/jenkins/agent/workspace/gluten/gluten-ci/gluten/cpp-ch/build/../ClickHouse/src/Common/Exception.cpp:96: DB::Exception::Exception(DB::Exception::MessageMasked&&, int, bool) @ 0x000000000b1546bb
[2024-03-29T12:02:49.115Z] 1. ../ClickHouse/contrib/llvm-project/libcxx/include/string:1499: DB::Exception::Exception<String>(int, FormatStringHelperImpl<std::type_identity<String>::type>, String&&) @ 0x000000000584c9a3
[2024-03-29T12:02:49.115Z] 2. /home/jenkins/agent/workspace/gluten/gluten-ci/gluten/cpp-ch/build/../ClickHouse/src/Disks/StoragePolicy.cpp:0: DB::StoragePolicySelector::get(String const&) const @ 0x000000000e2ecb2b
[2024-03-29T12:02:49.115Z] 3. /home/jenkins/agent/workspace/gluten/gluten-ci/gluten/cpp-ch/build/../ClickHouse/src/Interpreters/Context.cpp:0: DB::Context::getStoragePolicy(String const&) const @ 0x000000000e2bf582
[2024-03-29T12:02:49.115Z] 4. /home/jenkins/agent/workspace/gluten/gluten-ci/gluten/cpp-ch/build/../ClickHouse/src/Storages/MergeTree/MergeTreeData.cpp:0: DB::MergeTreeData::getStoragePolicy() const @ 0x000000000f2ffe8b
[2024-03-29T12:02:49.115Z] 5. /home/jenkins/agent/workspace/gluten/gluten-ci/gluten/cpp-ch/build/../local-engine/Storages/Mergetree/SparkMergeTreeWriter.cpp:183: local_engine::SparkMergeTreeWriter::writeTempPart(DB::BlockWithPartition&, std::shared_ptr<DB::StorageInMemoryMetadata const> const&) @ 0x000000000b51d4df
[2024-03-29T12:02:49.115Z] 6. /home/jenkins/agent/workspace/gluten/gluten-ci/gluten/cpp-ch/build/../local-engine/Storages/Mergetree/SparkMergeTreeWriter.cpp:79: local_engine::SparkMergeTreeWriter::finalize() @ 0x000000000b51c720
[2024-03-29T12:02:49.115Z] 7. /home/jenkins/agent/workspace/gluten/gluten-ci/gluten/cpp-ch/build/../local-engine/local_engine_jni.cpp:0: Java_org_apache_spark_sql_execution_datasources_CHDatasourceJniWrapper_closeMergeTreeWriter @ 0x0000000005bf235f
[2024-03-29T12:02:49.115Z] 
[2024-03-29T12:02:49.115Z] 	at org.apache.spark.sql.execution.datasources.CHDatasourceJniWrapper.closeMergeTreeWriter(Native Method)
[2024-03-29T12:02:49.115Z] 	at org.apache.spark.sql.execution.datasources.v1.clickhouse.MergeTreeOutputWriter.close(MergeTreeOutputWriter.scala:50)
[2024-03-29T12:02:49.115Z] 	at org.apache.spark.sql.execution.datasources.v1.clickhouse.MergeTreeFileFormatDataWriter.releaseCurrentWriter(MergeTreeFileFormatDataWriter.scala:72)
[2024-03-29T12:02:49.115Z] 	at org.apache.spark.sql.execution.datasources.v1.clickhouse.MergeTreeBaseDynamicPartitionDataWriter.renewCurrentWriter(MergeTreeFileFormatDataWriter.scala:304)
[2024-03-29T12:02:49.115Z] 	at org.apache.spark.sql.execution.datasources.v1.clickhouse.MergeTreeDynamicPartitionDataSingleWriter.beforeWrite(MergeTreeFileFormatDataWriter.scala:429)
[2024-03-29T12:02:49.115Z] 	at org.apache.spark.sql.execution.datasources.v1.clickhouse.MergeTreeDynamicPartitionDataSingleWriter.write(MergeTreeFileFormatDataWriter.scala:449)
[2024-03-29T12:02:49.115Z] 	at org.apache.spark.sql.execution.datasources.v1.clickhouse.MergeTreeFileFormatDataWriter.writeWithMetrics(MergeTreeFileFormatDataWriter.scala:97)
[2024-03-29T12:02:49.115Z] 	at org.apache.spark.sql.execution.datasources.v1.clickhouse.MergeTreeFileFormatDataWriter.writeWithIterator(MergeTreeFileFormatDataWriter.scala:104)

yma11 avatar Mar 29 '24 13:03 yma11

Can we merge this PR first? It causes a lot of conflicts with ongoing development.

ulysses-you avatar Apr 01 '24 01:04 ulysses-you

I did an offline check with @zzcclp. It's an odd issue, so I will try to disable the two tests on the ClickHouse backend to make CI pass first. He will help follow up on this. Thanks a lot, @zzcclp!

thanks, -yuan

zhouyuan avatar Apr 01 '24 02:04 zhouyuan

Run Gluten Clickhouse CI

github-actions[bot] avatar Apr 01 '24 02:04 github-actions[bot]

Run Gluten Clickhouse CI

github-actions[bot] avatar Apr 01 '24 02:04 github-actions[bot]

@yma11 please ignore the unit tests in these two suites for now: GlutenClickHouseMergeTreeWriteOnHDFSSuite and GlutenClickHouseMergeTreeWriteOnS3Suite, and raise an issue for us. Thanks.

zzcclp avatar Apr 01 '24 04:04 zzcclp

Run Gluten Clickhouse CI

github-actions[bot] avatar Apr 01 '24 05:04 github-actions[bot]

GlutenClickHouseMergeTreeWriteOnHDFSSuite

Thanks! By the way, is the following the correct way to disable these suites?

image

yma11 avatar Apr 01 '24 05:04 yma11

@zzcclp An issue has been created for tracking. Thanks!

yma11 avatar Apr 01 '24 05:04 yma11

GlutenClickHouseMergeTreeWriteOnHDFSSuite

Thanks! By the way, is the following the correct way to disable these suites?

image

Not this way. I will raise a hotfix PR to fix it.

zzcclp avatar Apr 01 '24 05:04 zzcclp

Raised PR #5232 to ignore some failing unit tests; we will fix them later.

zzcclp avatar Apr 01 '24 06:04 zzcclp