PLASH SPEED

Results 74 comments of PLASH SPEED

@szehon-ho Hi. I uploaded the image (COW running time). When I ran the SQL this time, some third-party tasks in the cluster were competing with me for resources,...

@szehon-ho Yes, the amount of data in the SOURCE table is not the same between the two executions, because some incremental data is coming in every day. I'll adjust it...

@szehon-ho Hello. I have prepared two datasets with the same amount of data and run the experiments in the same environment. Here are the results I collected. (I uploaded...

@szehon-ho The file I provided you with contains the execution details of the COW table. Do you need any additional information from me?

@szehon-ho Sounds good, I'll test it once v1.3 is released.

@szehon-ho I re-tested Iceberg 1.3.0 on Spark 3.3, but the problem is still not solved. Is it solved in Spark 3.4?

@chenwyi2 The bloom filter may not be as useful as it seems. When the underlying dataset is very large, the bloom filter has a problem with false positives. A proven...
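To illustrate the false-positive point above, here is a minimal sketch using the standard approximation for a bloom filter with m bits, k hash functions, and n inserted values: p ≈ (1 - e^(-k·n/m))^k. The filter size and hash count below are hypothetical values chosen for illustration only; they are not Iceberg or Parquet defaults.

```python
import math


def bloom_false_positive_rate(num_items: int, num_bits: int, num_hashes: int) -> float:
    """Approximate false-positive probability: (1 - e^(-k*n/m))^k."""
    return (1.0 - math.exp(-num_hashes * num_items / num_bits)) ** num_hashes


# Hypothetical sizing (an assumption for this sketch): a 1 MiB filter, 7 hashes.
NUM_BITS = 8 * 1024 * 1024
NUM_HASHES = 7

# With the filter size fixed, the rate climbs as more distinct values are added.
for n in (100_000, 1_000_000, 10_000_000):
    rate = bloom_false_positive_rate(n, NUM_BITS, NUM_HASHES)
    print(f"{n:>10,} distinct values -> false-positive rate ~ {rate:.6f}")
```

When the filter size is capped, the rate approaches 1 as the number of distinct values grows, at which point the filter rejects almost nothing.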

@zhongyujiang In fact, I don't think it's necessary to manage multiple Locations if the goal is just to tier cold data and hot data.

If you are using HDFS tiered storage, you can first mount the object storage as a virtual disk, and then configure the storage policy for this...

If you don't want to do this, you can also mount the object store to HDFS and read and write to the object store through a specific path. But if...