Alex Baden

Results: 70 comments by Alex Baden

Support for multi-fragment joins has always been poor. There is multi-fragment join hash table construction, but the references from the hash table to the actual data are 0-indexed and therefore...
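The fragment-local (0-indexed) references mentioned above are the crux of the problem: a hash table built across fragments needs some way to map a local row reference back to a global position. A minimal sketch of that mapping, using illustrative names (`Fragment`, `global_row_index`) that are not HDK's actual API:

```python
# Hypothetical sketch: the hash table stores fragment-local (0-indexed) row
# references, so resolving them across fragments needs per-fragment offsets.
from dataclasses import dataclass
from typing import List

@dataclass
class Fragment:
    num_rows: int

def fragment_offsets(fragments: List[Fragment]) -> List[int]:
    """Prefix sums: offset of each fragment's first row in the whole table."""
    offsets, total = [], 0
    for frag in fragments:
        offsets.append(total)
        total += frag.num_rows
    return offsets

def global_row_index(frag_id: int, local_row: int, offsets: List[int]) -> int:
    """Translate a 0-indexed, fragment-local reference to a global row id."""
    return offsets[frag_id] + local_row

frags = [Fragment(100), Fragment(100), Fragment(50)]
offs = fragment_offsets(frags)        # [0, 100, 200]
print(global_row_index(2, 10, offs))  # 210
```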

A few thoughts... On GPU execution, we already support multi-fragment kernels. Therefore, we could have arbitrarily sized fragments (yes, there is some outer-loop overhead, but that seems negligible, and...
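The outer-loop overhead mentioned above amounts to one kernel invocation iterating over several fragments instead of launching once per fragment. A hedged sketch of that shape; `process_fragment` and the list-based "fragments" are stand-ins, not HDK code:

```python
# Illustrative sketch of a multi-fragment kernel's outer loop: one launch
# walks all fragments, accumulating results, rather than one launch each.
def run_multi_fragment_kernel(fragments, process_fragment):
    results = []
    for frag in fragments:  # the outer-loop overhead in question
        results.extend(process_fragment(frag))
    return results

# Example: filter rows > 10 across two fragments of a column.
out = run_multi_fragment_kernel(
    [[1, 12, 5], [20, 3]],
    lambda frag: [v for v in frag if v > 10],
)
print(out)  # [12, 20]
```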

> > In relation to the idea: if it has to be sent to GPU anyway and we know the total size of what we are sending, we can assemble...

> Yes, this is the goal. An additional challenge here is to use zero-copy fetch as much as possible, i.e., follow Arrow chunks for imported tables, or ResultSets when...
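The assembly idea in the quoted exchange boils down to: if the total size of all chunks is known up front, one destination buffer can be allocated and each chunk copied at a precomputed offset. A small sketch under that assumption, with plain `bytes` standing in for Arrow chunks or GPU buffers:

```python
# Sketch: plan one destination buffer for several chunks of known size.
def plan_assembly(chunk_sizes):
    """Return (total_size, per-chunk destination offsets)."""
    offsets, total = [], 0
    for size in chunk_sizes:
        offsets.append(total)
        total += size
    return total, offsets

def assemble(chunks):
    total, offsets = plan_assembly([len(c) for c in chunks])
    dest = bytearray(total)
    for chunk, off in zip(chunks, offsets):
        dest[off:off + len(chunk)] = chunk  # one staged copy per chunk
    return bytes(dest)

print(assemble([b"ab", b"cde", b"f"]))  # b'abcdef'
```

Zero-copy fetch would go one step further and avoid even these staged copies where the source layout already matches, but the offset plan is the same.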

I don't know why `launchGpuCode` would be slower with more fragments than with fewer on the same data. The join column iterator is all pre-processing, so that should not affect...

I can confirm updating to openjdk 17 fixes this issue. So, we need to modify the conda-forge recipe to pin openjdk >= 17 instead of 11. I will open the conda-forge PR...
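For reference, the feedstock change would be a one-line pin bump. A sketch only; the exact section and file layout in the hdk-feedstock recipe may differ:

```yaml
# meta.yaml (sketch): replace the openjdk 11 pin with a lower bound of 17
requirements:
  host:
    - openjdk >=17
```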

https://github.com/conda-forge/hdk-feedstock/pull/54 @Garra1980 @leshikus please take a look.

I switched to manually restoring/saving the cache instead of letting the action handle it. So far it seems to be working. I did notice the cache-save step prints this warning...

With https://github.com/intel-ai/hdk/pull/416 I have included a change to manually restore and save the cache prior to installing pyhdk libraries, only for the PyHDK pytest job. I have run several times...
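The manual restore/save described above can be done with the split `actions/cache/restore` and `actions/cache/save` actions, placing the install step between them. A sketch with illustrative step names, paths, and keys, not the exact hdk workflow:

```yaml
# Workflow sketch: restore before installing pyhdk libraries, save after.
- name: Restore conda cache
  uses: actions/cache/restore@v3
  with:
    path: ~/conda_pkgs_dir
    key: conda-${{ runner.os }}-${{ hashFiles('environment.yml') }}

# ... install pyhdk libraries here ...

- name: Save conda cache
  uses: actions/cache/save@v3
  if: always()
  with:
    path: ~/conda_pkgs_dir
    key: conda-${{ runner.os }}-${{ hashFiles('environment.yml') }}
```

Splitting the steps this way means the save happens at a point we choose (and `if: always()` saves even on a failed job), rather than whenever the combined action's post-job hook runs.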

Is there a specific failure that has been hard to reproduce? Other than the conda-forge build problems or differences in packages across the CI environments we currently test in, I...