Liu Zhao

Results 13 issues of Liu Zhao

## Proposed changes Describe the big picture of your changes here to communicate to the maintainers why we should accept this pull request. If it fixes a bug or resolves...

## Proposed changes log print format ## Branch to commit - [ ] Branch **kylin3** for v2.x to v3.x - [x] Branch **kylin4** for v4.x - [ ] Branch **kylin5**...

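## Proposed changes Add cleanup of unreferenced cube_statistics data to the StorageCleanupJob in kylin4. I think this is worthwhile: 1. It reduces the storage occupied by useless data and avoids the pressure that too many useless small files put on the NameNode. 2. Unreferenced cube_statistics data is cleaned up by default, but this can be disabled with -cleanupCubeStatistics false. ## Branch to commit - [ ] Branch **kylin3** for v2.x to...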

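## Proposed changes 1. Initial finding: a production alert reported a large number of CLOSE_WAIT connections on one node. netstat -anp showed that they came from the Kylin4 JobServer process, with the CLOSE_WAIT count above 9,000. The remote port of every CLOSE_WAIT connection was 50010, which Hadoop DataNode uses for data transfer, so the suspicion was that JobServer opens a stream with fileSystem.open() during each build job and never closes it. 2. Reproduction: cube build jobs were submitted in the test environment while the CLOSE_WAIT count and its growth were observed. The count rose by 1 after each cube build finished, which confirmed that an unclosed stream in the JobServer code is the cause. 3. Locating the code: debugging the kylin4 build code traced the problem to line 94 of org.apache.kylin.engine.spark.utils.UpdateMetadataUtil#syncLocalMetadataToRemote (Apache Kylin main branch) String...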
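The leak pattern described above (a stream opened once per job and never closed) can be illustrated with a minimal sketch. It uses plain `java.io` rather than the Hadoop `FileSystem` API, and `StreamLeakDemo`, `open`, `readLeaky`, `readSafely`, and `OPEN_STREAMS` are hypothetical names invented for the illustration; it is not the code from `UpdateMetadataUtil`, only the general open-without-close shape and the usual try-with-resources remedy.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.concurrent.atomic.AtomicInteger;

class StreamLeakDemo {
    // Streams opened but not yet closed; a stand-in for the CLOSE_WAIT count.
    static final AtomicInteger OPEN_STREAMS = new AtomicInteger();

    // Hypothetical stand-in for fileSystem.open(path): returns a tracked stream.
    static InputStream open(byte[] data) {
        OPEN_STREAMS.incrementAndGet();
        return new ByteArrayInputStream(data) {
            private boolean closed;

            @Override
            public void close() throws IOException {
                if (!closed) {
                    closed = true;
                    OPEN_STREAMS.decrementAndGet();
                }
                super.close();
            }
        };
    }

    // Leaky variant: reads one byte and never closes the stream.
    static int readLeaky(byte[] data) {
        try {
            InputStream in = open(data);
            return in.read();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Safe variant: try-with-resources closes the stream on every exit path.
    static int readSafely(byte[] data) {
        try (InputStream in = open(data)) {
            return in.read();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = {42};
        readLeaky(data);
        System.out.println("open after leaky read: " + OPEN_STREAMS.get());
        readSafely(data);
        System.out.println("open after safe read: " + OPEN_STREAMS.get());
    }
}
```

In this sketch the counter is 1 after the leaky read and still 1 after the safe read, because only the leaked stream remains open. Repeating the leaky call once per job would make the counter grow by one each time, which mirrors the growth observed in the reproduction step.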

## Proposed changes kylin4 creates a new Broadcaster after every build job. The Broadcaster constructor creates a thread pool and then runs a task that continuously takes broadcast events. As a result, every finished job build creates new threads, and those threads stay blocked at takeFirst; over time the service becomes unavailable. ![image](https://user-images.githubusercontent.com/49258176/235129598-dbc24825-f31b-4721-9fbb-77c469e4c878.png) ![image](https://user-images.githubusercontent.com/49258176/235129623-fabbb864-5bb0-4cde-8642-c9885fda25a8.png) ![image](https://user-images.githubusercontent.com/49258176/235129647-f379ab02-8e33-40aa-bdcd-b8ef1794dbf9.png) ![image](https://user-images.githubusercontent.com/49258176/235129674-a68b43b8-a58e-4f53-8544-cd005a557b4b.png) ![image](https://user-images.githubusercontent.com/49258176/235129691-d29c6cf1-96b2-4b01-bdb3-4d2eda30d967.png) ## Branch to commit - [ ] Branch **kylin3** for v2.x to v3.x - [x] Branch **kylin4** for v4.x...
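The thread growth described above can be reproduced in miniature. The sketch below is an assumption-laden illustration, not Kylin's `Broadcaster`: `BroadcasterSketch`, `WORKERS_STARTED`, and `getInstance` are hypothetical names, and reusing one shared instance is only one possible remedy, since the snippet does not say how the pull request resolves the problem. Worker threads are marked as daemon threads here purely so the demo can exit.

```java
import java.util.concurrent.BlockingDeque;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingDeque;
import java.util.concurrent.atomic.AtomicInteger;

class BroadcasterSketch {
    // Number of worker threads started so far (one per constructed instance).
    static final AtomicInteger WORKERS_STARTED = new AtomicInteger();

    private static BroadcasterSketch instance;

    private final BlockingDeque<String> events = new LinkedBlockingDeque<>();
    private final ExecutorService pool;

    // Like the constructor described above: creates a pool and starts a
    // task that blocks in takeFirst() waiting for broadcast events.
    BroadcasterSketch() {
        pool = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "broadcaster-sketch");
            t.setDaemon(true); // assumption for the demo only, so the JVM can exit
            return t;
        });
        WORKERS_STARTED.incrementAndGet();
        pool.execute(() -> {
            try {
                while (true) {
                    events.takeFirst(); // blocks until an event arrives
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
    }

    // One possible remedy: hand out a single shared instance.
    static synchronized BroadcasterSketch getInstance() {
        if (instance == null) {
            instance = new BroadcasterSketch();
        }
        return instance;
    }

    public static void main(String[] args) {
        for (int job = 0; job < 3; job++) {
            new BroadcasterSketch(); // one blocked thread per finished job
        }
        System.out.println("workers after 3 jobs (new each time): " + WORKERS_STARTED.get());
        for (int job = 0; job < 3; job++) {
            BroadcasterSketch.getInstance(); // at most one additional thread
        }
        System.out.println("workers after 3 more jobs (shared): " + WORKERS_STARTED.get());
    }
}
```

With a fresh instance per job the worker count grows linearly with the number of jobs; with the shared instance it grows by one in total regardless of how many jobs run.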

## Proposed changes When the model was created, a day partition and an hour partition were specified in the partition section, with the formats yyyy-MM-dd and HH respectively. Queries that filter on the day partition value return unexpected results. The cause is a bug in segment pruning in this scenario. --Result without a day-partition filter (3 rows on the 18th, 5 rows on the 19th) ![1e9077f923e57f65e3712b75d639a31](https://user-images.githubusercontent.com/49258176/208371800-d8d86f55-d342-47d1-962a-f668d759dd65.png) --Query results with a day-partition filter, before the fix --The 18th actually has 3 rows ![6bbf3bf08dea6b5a269d57c65b4a1b7](https://user-images.githubusercontent.com/49258176/208371938-f4977bbf-678b-460e-9b7b-ff6443110350.png) --The 19th actually has 5 rows ![e4046323d0a7a8521dae904c5030f26](https://user-images.githubusercontent.com/49258176/208371954-2f5a0a03-b181-4ded-a2c1-4a1b05b58f82.png) --The 18th and 19th together have 8 rows ![6858ba031d6d44b050edc76a6e96251](https://user-images.githubusercontent.com/49258176/208371975-6e64407d-0b40-49f5-bc48-d16234e45647.png) --Query results with a day-partition filter, after the fix ![7f372221624b83ff4ad29023c86d373](https://user-images.githubusercontent.com/49258176/208372022-d51d7b49-1348-40ca-a5cc-8ead7b8ab068.png) ![38abe82bacc6c48a8b2ee8d1b31a180](https://user-images.githubusercontent.com/49258176/208372047-9d3b3364-9b99-47f5-9b46-060eace9177c.png) ![559a5f4b672c8d2521cb7f68fd88f72](https://user-images.githubusercontent.com/49258176/208372063-91cac6fd-2da2-4fc8-969f-3e1314164b4d.png) ## Branch to commit - [ ] Branch **kylin3** for v2.x to v3.x...

## Proposed changes Use a more reasonable implementation for a bug that has already been fixed. ## Branch to commit - [ ] Branch **kylin3** for v2.x to v3.x - [x] Branch **kylin4** for v4.x - [ ] Branch **kylin5** for v5.x...

## Proposed changes The statement "select name, replace(name, substring(name, 1, 1), '--') as new_name from LZ_TEST_YUFA" fails with errorMsg: java.lang.ClassCastException: org.apache.spark.sql.Column cannot be cast to java.lang.String while executing SQL: "select *...