Xiangrui Meng

Results: 22 comments by Xiangrui Meng

@jhseu If we do not plan to make a new release that is 2.4 compatible, shall we review and merge this PR?

@jhseu Could you take a look? Thanks!

@WeichenXu123 Could you make a pass on the implementation?

Instead of checking whether the code is running on Databricks, we should just test whether this method exists.
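A minimal sketch of that kind of capability check, in plain Python. The names `supports`, `NewerContext`, `OlderContext`, `fast_path`, and `run` are hypothetical and only stand in for the real object and method discussed in the PR; this is an illustration of the idea, not the code under review.

~~~python
def supports(obj, method_name):
    """Return True if obj exposes a callable attribute named method_name."""
    return callable(getattr(obj, method_name, None))


class NewerContext:
    """Hypothetical stand-in for a runtime that provides the method."""

    def fast_path(self):
        return "fast"


class OlderContext:
    """Hypothetical stand-in for a runtime that lacks the method."""


def run(ctx):
    # Branch on the capability itself, not on which platform we think we are on.
    if supports(ctx, "fast_path"):
        return ctx.fast_path()
    return "fallback"
~~~

With this shape, any environment that ships the method picks up the new path automatically, and no platform-specific detection has to be maintained.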

Efficient conversion requires Scala UDFs. Maybe we should add utility methods to Spark so that in petastorm we can do the following:

~~~python
from pyspark.sql.functions import col
# Proposed utility method; the name is the one suggested in this comment.
from pyspark.ml.functions import vector_to_dense_array

df.select(vector_to_dense_array(col("features")).alias("features"))
~~~

This approach...

FYI. The UDF was merged into Spark master: https://github.com/apache/spark/pull/26910

@simonson-jack Could you provide an example where the PySpark API is cumbersome to use but English could be much simpler?