Tom Ebergen issues

Results: 22 issues of Tom Ebergen

Swap build side and probe side based on cardinality AND width of build side.

If you have a wide build table and skinny probe table, it's possible it's cheaper to build on the skinny probe. I have a micro benchmark to test this. There...
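The idea in this snippet can be sketched as a toy cost comparison. This is only an illustration, not DuckDB's actual optimizer logic: the `SideEstimate` type, the `build_cost` formula (rows times bytes per row), and the numbers are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class SideEstimate:
    """Hypothetical per-side estimates for a hash join input."""
    name: str
    cardinality: int       # estimated number of rows
    row_width_bytes: int   # estimated bytes per row


def build_cost(side: SideEstimate) -> int:
    # Assumed cost model: memory needed to materialize the hash table.
    return side.cardinality * side.row_width_bytes


def choose_build_side(left: SideEstimate, right: SideEstimate):
    """Return (build, probe), building on the side with the lower assumed cost."""
    if build_cost(left) <= build_cost(right):
        return left, right
    return right, left


# A wide table with few rows versus a skinny table with more rows.
wide = SideEstimate("wide", cardinality=1_000, row_width_bytes=400)
skinny = SideEstimate("skinny", cardinality=5_000, row_width_bytes=8)

build, probe = choose_build_side(wide, skinny)
```

A cardinality-only rule would build on `wide` because it has fewer rows. Under the assumed cost model, the width term tips the choice toward `skinny`.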

Update reduce_sql script to have a max time

Fixes https://github.com/duckdblabs/duckdb-internal/issues/2434

No mark to semi join if mark index is in projection

closes https://github.com/duckdb/duckdb/issues/11042 The optimization to convert mark joins to semi joins should not always happen. If a mark join index is in a projection operator from before the mark join,...

Fix polars join script

For some reason, writing the IPC format for the queries causes the 5GB+ join benchmark to hang. Removing this should get the results back. The lines in question: https://github.com/duckdblabs/db-benchmark/blob/269a9a77b950b36c7c812c57889aa41f23c0b98a/polars/join-polars.py#L56

Add estimated cardinality to logical plans

Also helps to verify a fix for https://github.com/duckdblabs/duckdb-internal/issues/2723 Before: ``` ┌─────────────┴─────────────┐ │ COMPARISON_JOIN │ │ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ │ │ INNER ├──────────────┐...

Implement PullUp Empty Results optimizer

Currently this logic is sprinkled within filter pull up and filter push down. The problem is it's not at every operator, and adding empty results across multiple files within the...

feature

power operation on aggregate result uses dplyr fallback

There are aggregation functions that are available in DuckDB, but duckplyr still falls back to dplyr. Discovered when benchmarking duckplyr with the db-benchmark. This example comes from group by query...

Add duckplyr

This PR is a WIP PR to add duckplyr. Things that need to be worked out, group by queries: Q8: Query = `ans% select(id6, largest2_v3=v3) %>% filter(!is.na(largest2_v3)) %>% arrange(desc(largest2_v3)) %>%...

update julia versions

Sampling respects seed from random number generator if no seed is given.

fixes https://github.com/duckdblabs/duckdb-internal/issues/3268

CI Failure