pdlovedy comments

Documentation Enhancement

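In the Waterdrop documentation for reading from ES (ES as an input plugin) [https://interestinglab.github.io/waterdrop-docs/#/zh-cn/v1/configuration/input-plugins/Elasticsearch], please add a description of the ES configuration parameters [https://www.elastic.co/guide/en/elasticsearch/hadoop/6.2/configuration.html] so that users can maximize Waterdrop's ES read throughput. The relevant parameter is es.input.max.docs.per.partition, which determines the partition count as follows: number of partitions = total number of documents / es.input.max.docs.per.partition. This lets users pick a suitable number of partitions and addresses the case where a large data volume causes a long shuffle, which slows down transfers between ES and other components. Controlling the number of partitions used to read from ES is also what speeds up the shuffle, so documenting it would make things easier and more efficient for users. Note that this differs somewhat from the official ES recommendation on the number of CPU cores (threads) to use when reading ES shards, so a suitable value has to be found by experiment. In my tests, setting this parameter sensibly improved ES read efficiency by 3-10x.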
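As a rough illustration of the formula in this comment, the sketch below estimates the partition count for a given document total and es.input.max.docs.per.partition value. The function name `estimate_partitions` and the sample numbers are hypothetical and are not part of Waterdrop or elasticsearch-hadoop. It simply applies the division stated above, rounded up; the connector itself splits per shard based on sampled document counts, so the real partition count may differ.

```python
def estimate_partitions(total_docs: int, max_docs_per_partition: int) -> int:
    """Estimate read partitions as total_docs / max_docs_per_partition, rounded up.

    Hypothetical helper: mirrors the formula from the comment, not the
    connector's actual per-shard slicing logic.
    """
    if total_docs < 0:
        raise ValueError("total_docs must be non-negative")
    if max_docs_per_partition <= 0:
        raise ValueError("max_docs_per_partition must be positive")
    # Integer ceiling division; always report at least one partition.
    return max(1, -(-total_docs // max_docs_per_partition))


if __name__ == "__main__":
    # Assumed example: 10 million documents, 100,000 documents per partition.
    print(estimate_partitions(10_000_000, 100_000))
```

Working backwards from a target partition count (for example, one matched to the available executor cores) gives a starting value for the parameter, which can then be tuned by measurement as the comment suggests.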

[Bug] [spark.source.FakeStream] runtime error->ConfigObject is immutable, you can't call Map.put

> @tmljob I have the same problem, did you solve it?

Me too, on st2.1.3 with spark2.4.8.

could not determine kind of name for C.taos_schemaless_insert_raw

I checked against the demo and this is a driver-go version issue. The image is the latest, 3.0.1.6, but the driver has to be this version: github.com/taosdata/driver-go/v3 v3.0.0