Robert (Bobby) Evans
This appears to be coming from https://github.com/apache/spark/blob/fd86f85e181fc2dc0f50a096855acf83a6cc5d9c/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TableOutputResolver.scala#L381-L421 It appears that https://issues.apache.org/jira/browse/SPARK-42151 https://github.com/apache/spark/pull/40308 So technically this is a regression, more accurately a performance regression, in that we could run the query...
@sameerz and @mattahrens we now know why the regression happened, and we need to decide what the next steps are. Implementing this is not too difficult. We mainly need...
I spoke with @jlowe and I think we really want to understand this better. https://github.com/NVIDIA/spark-rapids/issues/11649 The problem is that if a retry happens and it is not in a checkpoint/restore,...
@res-life are you still planning on working on this? The failures are happening in two places. If you don't provide a schema, then schema discovery returns with an empty schema....
The plan is explicitly not to support collated strings and to fall back to the CPU if we see them. This is a very large amount of work to try and...
On the GPU the problems typically show up around thread divergence and non-coalesced memory access patterns. I am not 100% sure about this, so we should run some experiments and...
Would probably need a new kernel for this, but it is just taking a long and outputting the binary representation of it, which should be dead simple to do.
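As a rough CPU-side reference for what such a kernel would have to produce, here is a minimal sketch in Python. It assumes the expected output is the 64-bit two's-complement form with no leading zeros (the behavior of Java's `Long.toBinaryString`); the comment above does not state this, and the function name `long_to_binary_string` is hypothetical.

```python
# Sketch only: a reference for the expected output, not the GPU kernel itself.
# Assumption: output matches 64-bit two's complement with leading zeros dropped.

LONG_MIN = -(1 << 63)
LONG_MAX = (1 << 63) - 1
MASK_64 = (1 << 64) - 1


def long_to_binary_string(value: int) -> str:
    """Return the binary representation of a signed 64-bit integer."""
    if not LONG_MIN <= value <= LONG_MAX:
        raise ValueError("value does not fit in a signed 64-bit long")
    # Masking reinterprets a negative value as its unsigned 64-bit bit pattern.
    unsigned = value & MASK_64
    # format(..., "b") emits the bits with no prefix and no leading zeros.
    return format(unsigned, "b")
```

Under that assumption, non-negative inputs give a variable-length string and negative inputs always give 64 characters, so the output column's per-row string length would depend on the value.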
@LIN-Yu-Ting Generally we treat GPU OutOfMemory errors as bugs that need to be fixed. There are a few cases where an algorithm cannot be split up into smaller pieces and...
Thanks for the updated information. We will try and reproduce this locally and see what we can come up with. For now I think I will just move this over...
If I try to run `get_json_object_multiple_paths2` with just `$` as the path, which is the same as an empty parsed path vector, I get what looks like the...