spark-rapids
[FEA] look at ways to use replace_nans and replace_nulls from cudf
As part of the discussion at https://github.com/NVIDIA/spark-rapids/issues/6164#issuecomment-1210095670 we saw that cudf has a replace_nans function, along with a replace_nulls one. It might be good for us to experiment with using these for things like the coalesce expression, or for pattern matching on if(isNull(a), b, a) or something similar. I have not seen many of these types of operations show up in any traces, but it would be an interesting first step toward exploring a framework for doing other types of replacements in the future.
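To make the pattern-matching idea concrete, here is a minimal sketch of the rewrite on a toy expression tree. Everything in it is hypothetical: the `Col`, `IsNull`, `If`, `Coalesce`, and `ReplaceNulls` classes and the `rewrite` function are illustrative stand-ins, not Spark, spark-rapids, or cudf classes, and it is written in Python only to keep the example self-contained. It assumes that a two-argument coalesce and if(isNull(a), b, a) can both be served by a single replace_nulls-style call.

```python
from dataclasses import dataclass


class Expr:
    """Base class for the toy expression tree (hypothetical, not a Spark class)."""


@dataclass(frozen=True)
class Col(Expr):
    name: str


@dataclass(frozen=True)
class IsNull(Expr):
    child: Expr


@dataclass(frozen=True)
class If(Expr):
    predicate: Expr
    if_true: Expr
    if_false: Expr


@dataclass(frozen=True)
class Coalesce(Expr):
    children: tuple


@dataclass(frozen=True)
class ReplaceNulls(Expr):
    # Hypothetical node standing in for one replace_nulls-style call.
    source: Expr
    replacement: Expr


def rewrite(expr):
    """Return a ReplaceNulls node when a known pattern matches, else expr unchanged."""
    # if(isNull(a), b, a)  ->  replace_nulls(a, b)
    if (isinstance(expr, If)
            and isinstance(expr.predicate, IsNull)
            and expr.predicate.child == expr.if_false):
        return ReplaceNulls(expr.if_false, expr.if_true)
    # coalesce(a, b)  ->  replace_nulls(a, b); longer argument lists are left alone
    if isinstance(expr, Coalesce) and len(expr.children) == 2:
        a, b = expr.children
        return ReplaceNulls(a, b)
    return expr
```

For example, `rewrite(If(IsNull(Col("a")), Col("b"), Col("a")))` yields `ReplaceNulls(Col("a"), Col("b"))`, while an expression that does not match either pattern is returned as-is. A real implementation would live in the plugin's expression replacement rules and would also need to check types and nested expressions, which this sketch ignores.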
If we want to add JNIs for them, should we put them in cudf or spark-rapids-jni?
> If we want to add JNIs for them, should we put them in cudf or spark-rapids-jni?

cudf. These are cudf APIs, and we are not adding anything that is especially Spark specific.