xiaofan2012
So how can this be optimized? This speed is too slow
hdfs dfs -cat hdfs://nameservice1/apps/spark/warehouse/test.db/file_test/.hoodie/.bucket_index/consistent_hashing_metadata/00000000000000.hashing_meta | grep "value" | wc -l result=>>>256
Yes, I set 'hoodie.bucket.index.max.num.buckets' = '32' and 'hoodie.bucket.index.min.num.buckets' = '4', but found that there are still 256...
I first created the table through Spark and imported the full data, then Flink applies incremental updates in real time; but the default bucket number in Spark is 4
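One possible explanation, assuming the min/max options were only passed at write time: if the table was created without an explicit `hoodie.bucket.index.num.buckets`, the consistent-hashing metadata is initialized with the engine's default initial bucket count, which would account for the 256 entries the grep found. A minimal sketch of pinning these options at table-creation time (the table name and columns are hypothetical; verify the option defaults against your Hudi version):

```sql
-- Sketch only, not the poster's actual DDL.
CREATE TABLE file_test (
  id BIGINT,
  value STRING,
  ts BIGINT
) USING hudi
TBLPROPERTIES (
  type = 'mor',                                 -- consistent hashing requires Merge-on-Read
  primaryKey = 'id',
  'hoodie.index.type' = 'BUCKET',
  'hoodie.index.bucket.engine' = 'CONSISTENT_HASHING',
  'hoodie.bucket.index.num.buckets' = '4',      -- initial bucket count; without this the default applies
  'hoodie.bucket.index.min.num.buckets' = '4',  -- lower bound for bucket merging
  'hoodie.bucket.index.max.num.buckets' = '32'  -- upper bound for bucket splitting
);
```

Note that min/max only bound future resizing (split/merge via clustering); they do not shrink an existing hashing metadata file that was already initialized with more buckets.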