BInwei Yang comments

Results: 167 comments of BInwei Yang

fix: Start prefetch from the first split in table scan

> > If a partition only has 2 splits > > This sounds like a bigger issue, the workload is doomed to be skewed with such a setup No, it's...

fix: Start prefetch from the first split in table scan

> @FelixYBW A split is usually a stripe (row group) if the number of files is not large enough. With one file per split, the split level preloading is not...

fix: Start prefetch from the first split in table scan

> It's usually either 1 split per row group or 1 split per file, how did you generate your data? Hi @Yuhta we can't make any assumption on this in...

fix: Start prefetch from the first split in table scan

> The change itself should be ok; it's just if one split contains many row groups, row group level prefetch would benefit much more than split level prefetch (i.e. the...

fix: Fix smj result mismatch issue in semi, anti and full outer join

@pedroerp can you find someone to review the PR? This PR is essential to the Gluten project. It fixes a bug that a Gluten customer observed.

[GLUTEN-7548][VL] Optimize BHJ in velox backend

In the long term, we need to implement this the Spark way: broadcast the hash table instead of the raw table data.

[GLUTEN-7548][VL] Optimize BHJ in velox backend

@zhztheplayer Is there a memory management issue in this solution? Is the memory allocated in storage memory? @JkSelf will this solution be helpful to the final solution?