Liu Zhao
## Proposed changes Describe the big picture of your changes here to communicate to the maintainers why we should accept this pull request. If it fixes a bug or resolves...
## Proposed changes log print format ## Branch to commit - [ ] Branch **kylin3** for v2.x to v3.x - [x] Branch **kylin4** for v4.x - [ ] Branch **kylin5**...
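## Proposed changes Add cleanup of unreferenced cube_statistics data to the kylin4 StorageCleanupJob. I think this is worthwhile: 1. It reduces the storage space occupied by useless data and avoids the pressure that too many useless small files put on the NameNode (nn). 2. Unreferenced cube_statistics data is cleaned up by default, but this can be disabled with -cleanupCubeStatistics false ## Branch to commit - [ ] Branch **kylin3** for v2.x to...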
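## Proposed changes 1. Initial finding: a production alert reported a large number of CLOSE_WAIT connections on one node. netstat -anp showed that they came from the Kylin4 JobServer process, with more than 9,000 connections in CLOSE_WAIT. The foreign addresses of these connections all used port 50010, which Hadoop DataNode uses for data transfer, so we suspected that the JobServer opens a stream with fileSystem.open() on every build job and never closes it. 2. Reproduction: we submitted cube build jobs in a test environment and watched the CLOSE_WAIT count and its growth. The count increased by 1 after each cube build finished, which confirmed that an unclosed stream in the JobServer code was the cause. 3. Locating the code: debugging the kylin4 build code finally led to line 94 of org.apache.kylin.engine.spark.utils.UpdateMetadataUtil#syncLocalMetadataToRemote (Apache Kylin main branch) String...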
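The leak described in this entry, a stream that is opened and never closed, can be illustrated with a minimal sketch. It uses only `java.io`: the `open` helper and the `OPEN_STREAMS` counter are hypothetical stand-ins for `fileSystem.open()` and the CLOSE_WAIT count, not Kylin or Hadoop code, and the actual change made in the PR is not shown in the excerpt above.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.atomic.AtomicInteger;

public class StreamCloseSketch {
    // Number of streams opened but not yet closed (stand-in for leaked connections).
    static final AtomicInteger OPEN_STREAMS = new AtomicInteger();

    // Hypothetical stand-in for fileSystem.open(path): the stream tracks its own lifetime.
    static InputStream open(String content) {
        OPEN_STREAMS.incrementAndGet();
        return new ByteArrayInputStream(content.getBytes(StandardCharsets.UTF_8)) {
            private boolean closed;

            @Override
            public void close() throws IOException {
                if (!closed) { // count each stream only once, even if closed twice
                    closed = true;
                    OPEN_STREAMS.decrementAndGet();
                }
                super.close();
            }
        };
    }

    // Leaky variant: reads the first line but never closes the stream.
    static String readLeaky(String content) {
        try {
            BufferedReader reader = new BufferedReader(
                    new InputStreamReader(open(content), StandardCharsets.UTF_8));
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Safe variant: try-with-resources closes the reader and the underlying stream.
    static String readSafely(String content) {
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(open(content), StandardCharsets.UTF_8))) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        readLeaky("metadata");
        System.out.println("open after leaky read: " + OPEN_STREAMS.get());
        readSafely("metadata");
        System.out.println("open after safe read: " + OPEN_STREAMS.get());
    }
}
```

In this sketch each call to `readLeaky` leaves the counter one higher, mirroring the "+1 per build" growth observed above, while `readSafely` leaves it unchanged. Hadoop's `FSDataInputStream` is `Closeable`, so the same try-with-resources form should apply to a stream returned by `fileSystem.open()`.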
## Proposed changes We found that kylin4 creates a new Broadcaster after every build job. The Broadcaster constructor creates a thread pool and submits a task that keeps taking broadcast events. As a result, new threads are created after every job build, and each of them blocks at takeFirst; over time the service becomes unavailable. [5 screenshots] ## Branch to commit - [ ] Branch **kylin3** for v2.x to v3.x - [x] Branch **kylin4** for v4.x...
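The thread growth described in this entry can be reproduced in miniature. The sketch below is an assumption-laden illustration, not Kylin's Broadcaster: `EventBroadcaster`, `getShared`, and `workersCreated` are hypothetical names, and reusing one shared instance is only one possible remedy (the excerpt does not say which fix the PR chose).

```java
import java.util.concurrent.BlockingDeque;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingDeque;
import java.util.concurrent.atomic.AtomicInteger;

public class BroadcasterLeakSketch {
    // Counts how many consumer pools (one blocked worker thread each) were created.
    static final AtomicInteger WORKERS_STARTED = new AtomicInteger();

    // Hypothetical, simplified stand-in for a Broadcaster-like class.
    static class EventBroadcaster implements AutoCloseable {
        private final BlockingDeque<String> events = new LinkedBlockingDeque<>();
        private final ExecutorService pool = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "broadcast-consumer");
            t.setDaemon(true); // daemon so this demo JVM can still exit
            return t;
        });

        EventBroadcaster() {
            WORKERS_STARTED.incrementAndGet();
            pool.execute(() -> {
                try {
                    while (true) {
                        events.takeFirst(); // blocks until an event arrives
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // exit when shut down
                }
            });
        }

        @Override
        public void close() {
            pool.shutdownNow(); // interrupts the blocked worker
        }
    }

    private static EventBroadcaster shared;

    // Lazily creates a single instance and reuses it afterwards.
    static synchronized EventBroadcaster getShared() {
        if (shared == null) {
            shared = new EventBroadcaster();
        }
        return shared;
    }

    // Simulates `jobs` build jobs and returns how many worker pools they created.
    static int workersCreated(int jobs, boolean reuse) {
        int before = WORKERS_STARTED.get();
        for (int i = 0; i < jobs; i++) {
            if (reuse) {
                getShared();
            } else {
                new EventBroadcaster(); // leaked: never closed, its thread stays blocked
            }
        }
        return WORKERS_STARTED.get() - before;
    }

    public static void main(String[] args) {
        System.out.println("new instance per job: " + workersCreated(3, false));
        System.out.println("shared instance:      " + workersCreated(3, true));
    }
}
```

With a new instance per job, the number of blocked worker threads grows linearly with the number of builds; with a shared instance it stays at one. Closing each instance after use (via `close()`) would be an alternative way to bound the thread count.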
## Proposed changes When creating a model, a day partition and an hour partition were specified in the partition section, with the formats yyyy-MM-dd and HH respectively. Queries filtered by the day partition value returned unexpected results. The cause is a bug in segment pruning in this scenario. -- Result without the day-partition filter (3 rows on the 18th, 5 rows on the 19th) [screenshot] -- Results with the day-partition filter before the fix -- the 18th actually has 3 rows [screenshot] -- the 19th actually has 5 rows [screenshot] -- the 18th and 19th have 8 rows in total [screenshot] -- Results with the day-partition filter after the fix [3 screenshots] ## Branch to commit - [ ] Branch **kylin3** for v2.x to v3.x...
## Proposed changes Use a more reasonable implementation for a bug that has already been fixed. ## Branch to commit - [ ] Branch **kylin3** for v2.x to v3.x - [x] Branch **kylin4** for v4.x - [ ] Branch **kylin5** for v5.x...
## Proposed changes The statement "select name, replace(name, substring(name, 1, 1), '--') as new_name from LZ_TEST_YUFA" fails with errorMsg: java.lang.ClassCastException: org.apache.spark.sql.Column cannot be cast to java.lang.String while executing SQL: "select *...