[FEA] Support parse_url

Open viadea opened this issue 3 years ago • 4 comments

I wish we could support the parse_url function.

e.g.:

spark-sql> select parse_url(c_customer_id,'HOST') from tpcds.customer limit 10;

      ! <ParseUrl> parse_url(c_customer_id#1, HOST, false) cannot run on GPU because GPU does not currently support the operator class org.apache.spark.sql.catalyst.expressions.ParseUrl

viadea avatar Nov 01 '22 21:11 viadea

Discuss with the Python cuDF team whether they could benefit from a kernel for this functionality in libcudf.

sameerz avatar Nov 02 '22 21:11 sameerz

Is it sufficient for the initial implementation that the 2nd and 3rd parameters are literals?

mattahrens avatar Mar 21 '23 22:03 mattahrens

From the logs, I think that is good enough. I am also double-checking with the user.

viadea avatar Apr 20 '23 18:04 viadea

The requirement is to support all three input parameters of this function, including the KEY.

We need to support PATH, QUERY and HOST.
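
For reference, here is a rough sketch of what those three parts (plus the optional KEY) should return. It is plain Python on top of `urllib.parse`, not the Spark or GPU implementation, and `parse_url_sketch` is a made-up name for illustration only. Edge cases such as invalid URLs or host casing may well differ from what Spark does on the CPU.

```python
from urllib.parse import urlsplit


def parse_url_sketch(url, part, key=None):
    """Illustrative stand-in for parse_url(url, part[, key]); hypothetical helper."""
    pieces = urlsplit(url)
    if part == "HOST":
        # Note: urllib lowercases the host; Spark may preserve the original case.
        return pieces.hostname
    if part == "PATH":
        return pieces.path
    if part == "QUERY":
        if key is None:
            # Whole query string, or None when the URL has no query.
            return pieces.query or None
        # With a KEY, return the raw (undecoded) value of the first matching pair.
        for pair in pieces.query.split("&"):
            name, sep, value = pair.partition("=")
            if sep == "=" and name == key:
                return value
        return None
    raise ValueError(f"part not covered by this sketch: {part}")


url = "http://spark.apache.org/path?query=1"
print(parse_url_sketch(url, "HOST"))
print(parse_url_sketch(url, "PATH"))
print(parse_url_sketch(url, "QUERY"))
print(parse_url_sketch(url, "QUERY", "query"))
```

The KEY only applies when the part is QUERY, which is why it is handled inside that branch.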

viadea avatar Nov 01 '23 00:11 viadea

parse_url is now supported for HOST with this PR.

hyperbolic2346 avatar May 23 '24 19:05 hyperbolic2346