db-benchmark

large data joins using collapse?

Open SebKrantz opened this issue 10 months ago • 4 comments

Hi, I just wanted to ask why you took collapse out of the 50 GB join benchmark, given that it had initially completed it successfully. collapse has a single-threaded join implementation that is very memory efficient, so it should be able to complete the task.

SebKrantz avatar Mar 02 '25 17:03 SebKrantz

Oops, sorry about that. I will test it again.

Tmonster avatar Jun 16 '25 15:06 Tmonster

Please use v2.1.2 - there was some missing garbage collection in earlier versions which may have led to memory overconsumption.

SebKrantz avatar Jun 16 '25 18:06 SebKrantz

The latest results will use v2.1.2. You can see the change here: https://github.com/duckdblabs/db-benchmark/pull/126/files

I'm also running the large join on collapse and hope to have the results up later today.

Tmonster avatar Jun 18 '25 08:06 Tmonster

Hi @SebKrantz,

So I've run the benchmark for collapse a couple of times on the large machine with the large dataset. collapse v2.1.2 seems to fail each time during ingestion. I know the benchmark page says "not tested", but I will update that soon. I have other priorities right now, so I can't spend too much time debugging the issue, but feel free to request another run or open a PR if you have more information about what the problem might be.

Tmonster avatar Jun 20 '25 11:06 Tmonster