
Alternative solution proposal

Open syr opened this issue 3 years ago • 0 comments

Instead of the 'parse the DF multiple times, then union the small DFs' approach at https://github.com/rockthejvm/spark-essentials/blob/c8ee4b251129dec82bdcd37bbfe62516ad0481f7/src/main/scala/part3typesdatasets/ComplexTypes.scala#L36

I think it is more elegant, and probably more efficient since the DF is handled in a single select, to use coalesce over one to_date call per pattern, like:

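import org.apache.spark.sql.functions.{coalesce, col, to_date}

// Patterns are tried in list order; the first non-null parse wins
val datePatterns = List("dd-MMM-yy", "yyyy-MM-dd")

moviesDF
  .select(
    col("Title"),
    col("Release_Date"),
    // One to_date column per pattern; this relies on to_date giving null when a pattern does not match
    coalesce(
      datePatterns.map(to_date(col("Release_Date"), _)): _*
    ).as("Actual_Release")
  )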

Maybe it could be helpful for someone learning Spark.
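
To try the idea without the movies dataset, here is a minimal sketch on a few made-up rows. The object name `CoalesceDatesSketch`, the helper `withActualRelease`, and the sample rows are hypothetical and not part of the course code. The `spark.sql.legacy.timeParserPolicy` setting is an assumption on my side: on Spark 3.x the default parser policy may raise an error for strings that only the old parser accepts, so I set it to `LEGACY` here; drop or change it as needed for your version.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{coalesce, col, to_date}

object CoalesceDatesSketch {
  // Patterns are tried in list order
  val datePatterns: List[String] = List("dd-MMM-yy", "yyyy-MM-dd")

  // Hypothetical helper: adds Actual_Release as the first non-null parse of Release_Date
  def withActualRelease(df: DataFrame): DataFrame =
    df.select(
      col("Title"),
      col("Release_Date"),
      coalesce(
        datePatterns.map(pattern => to_date(col("Release_Date"), pattern)): _*
      ).as("Actual_Release")
    )

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("CoalesceDatesSketch")
      .master("local[*]")
      // Assumption: needed on some Spark 3.x setups for patterns like dd-MMM-yy
      .config("spark.sql.legacy.timeParserPolicy", "LEGACY")
      .getOrCreate()

    import spark.implicits._

    // Made-up sample rows covering both formats plus one unparseable value
    val sampleDF = Seq(
      ("Movie A", "12-Jun-98"),
      ("Movie B", "1998-08-07"),
      ("Movie C", "not a date")
    ).toDF("Title", "Release_Date")

    // Inspect which rows were parsed and which stayed null
    withActualRelease(sampleDF).show()

    spark.stop()
  }
}
```

Rows that match none of the patterns should end up with a null Actual_Release, so a follow-up `.where(col("Actual_Release").isNull)` is a quick way to see whether another pattern has to be added to the list.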

syr, Sep 28 '21 14:09