Raunaq Morarka
``` # ClickHouse, Druid, MariaDB, MySQL, Oracle, PostgreSQL, Redshift, SingleStore, SQL Server, Phoenix * Improved performance for queries with selective joins through pushdown of dynamic filters to the data... ```
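For illustration only (this is not the connectors' actual code), a minimal Java sketch of the idea behind dynamic filter pushdown: join key values collected while building the hash table for a selective join are turned into a predicate on the remote query, so the data source skips non-matching rows before they cross the wire. All names here (`applyDynamicFilter`, `baseQuery`) are hypothetical.

```java
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical illustration of dynamic filter pushdown to a JDBC data source:
// build-side join key values become an IN clause on the probe-side scan query.
public class DynamicFilterPushdownSketch
{
    // Rewrites the probe-side scan query with the collected filter values.
    static String applyDynamicFilter(String baseQuery, String column, Set<Long> collectedKeys)
    {
        if (collectedKeys.isEmpty()) {
            return baseQuery;
        }
        String inList = collectedKeys.stream()
                .map(String::valueOf)
                .collect(Collectors.joining(", "));
        return baseQuery + " WHERE " + column + " IN (" + inList + ")";
    }

    public static void main(String[] args)
    {
        // Values collected from the build side of a selective join
        Set<Long> buildSideKeys = Set.of(42L, 7L, 99L);
        System.out.println(applyDynamicFilter("SELECT * FROM orders", "custkey", buildSideKeys));
        // => SELECT * FROM orders WHERE custkey IN (...)
    }
}
```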
``` # Hive * Support Hive bucket filtering on bucketed columns of float, double, date, list, map, and bounded varchar data types. ({issue}`13553`) ``` #13553 #13472
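To make the bucket filtering idea concrete: Hive assigns each row to bucket `hash(value) % bucketCount`, so an equality predicate on the bucketing column pins the read to a subset of bucket files. The sketch below is a simplified stand-in; real Hive uses type-specific hash functions (which is why support had to be added per type: float, double, date, etc.), not Java's `hashCode`.

```java
import java.util.Set;
import java.util.stream.Collectors;

// Sketch of how a filter on a Hive bucketed column prunes bucket files.
public class BucketFilterSketch
{
    // Simplified stand-in for Hive's type-specific bucket hashing.
    static int bucketFor(Object value, int bucketCount)
    {
        return Math.floorMod(value.hashCode(), bucketCount);
    }

    // Keep only the buckets that can contain the requested values.
    static Set<Integer> bucketsToRead(Set<?> filterValues, int bucketCount)
    {
        return filterValues.stream()
                .map(value -> bucketFor(value, bucketCount))
                .collect(Collectors.toSet());
    }

    public static void main(String[] args)
    {
        // Table bucketed into 32 buckets on a double column;
        // the predicate col IN (1.5, 2.5) touches at most 2 of them.
        System.out.println(bucketsToRead(Set.of(1.5, 2.5), 32));
    }
}
```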
``` # Hive * Upgrade Alluxio to 2.8.1 to fix security vulnerabilities. ({issue}`13609`) ``` #13609
@sopel39 PTAL. It LGTM % comments about docs.
> Test failures are unrelated

Please rebase to latest master, the CI issues should be resolved now.
> 1. there seems to be a lot of code copied between ORC, RCFile, and Parquet write validation. It would be a lot cleaner to have it extracted to common...
[Optimized parquet writer verification inserts benchmark.pdf](https://github.com/trinodb/trino/files/9549824/Optimized.parquet.writer.verification.inserts.benchmark.pdf)

Perf impact with 5% verification (the current default) is around 2-3%. Perf impact with 100% verification would be around 45%.
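For illustration of why the 5% default stays cheap: validation can be sampled per written file, so with a 5% rate roughly 1 in 20 files is re-read and checked, keeping overhead at a few percent instead of the ~45% a full re-read costs. This is a hedged sketch of percentage-based sampling, not the writer's actual code; `shouldValidate` is a hypothetical name.

```java
import java.util.concurrent.ThreadLocalRandom;

// Sketch of percentage-based write verification sampling.
public class ValidationSamplingSketch
{
    // Decide per file whether to validate, given a rate in [0, 100].
    static boolean shouldValidate(double validationPercentage)
    {
        return ThreadLocalRandom.current().nextDouble(100) < validationPercentage;
    }

    public static void main(String[] args)
    {
        int validated = 0;
        int files = 100_000;
        for (int i = 0; i < files; i++) {
            if (shouldValidate(5.0)) {
                validated++;
            }
        }
        System.out.printf("validated %d of %d files (~5%%)%n", validated, files);
    }
}
```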
> Per [#14047 (comment)](https://github.com/trinodb/trino/issues/14047#issuecomment-1244866545)
> is this enabled in Hive connector only, and Iceberg/Delta (which also use the optimized writer), do not run the verification?

Right, this PR implements parquet...
> > The bloom_filter_offset in thrift specified the "offset" of the bloomfilter header, but it does not specify the "length" of the
> > Since you don't know length, how...
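One common way to read a structure whose offset is known but whose length is not: speculatively fetch a buffer large enough to usually cover the header, decode the header to learn the payload size, and issue a second read only if the guess fell short. The sketch below illustrates that pattern under stated assumptions; the real Parquet bloom filter header is a compact-thrift struct, and `RangeReader`, `HeaderDecoder`, and the 4 KB guess here are all hypothetical stand-ins.

```java
import java.io.IOException;
import java.nio.ByteBuffer;

// Sketch of an offset-known, length-unknown read via a speculative first fetch.
public class UnknownLengthReadSketch
{
    // Hypothetical decoded header carrying its own size and the payload size.
    record Header(int headerBytes, int payloadBytes) {}

    interface RangeReader
    {
        ByteBuffer read(long offset, int length) throws IOException;
    }

    // Hypothetical header decoder standing in for thrift deserialization.
    interface HeaderDecoder
    {
        Header decode(ByteBuffer buffer);
    }

    static ByteBuffer readPayload(RangeReader reader, HeaderDecoder decoder, long offset)
            throws IOException
    {
        int guess = 4096; // speculative read size expected to cover the header
        ByteBuffer first = reader.read(offset, guess);
        Header header = decoder.decode(first.duplicate());
        int total = header.headerBytes() + header.payloadBytes();
        if (total <= guess) {
            // Common case: the single speculative read already covers everything
            return first.position(header.headerBytes()).slice();
        }
        // Rare case: issue a second read for the remainder of the payload
        return reader.read(offset + header.headerBytes(), header.payloadBytes());
    }
}
```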
> @raunaqmorarka is there some performance penalty when doing streaming read APIs?

There is a description of the problems encountered with streaming reads in https://trino.io/blog/2019/05/06/faster-s3-reads.html