spark-essentials
Alternative solution proposal
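Instead of the suggested 'parse the DF multiple times, then union the small DFs' (https://github.com/rockthejvm/spark-essentials/blob/c8ee4b251129dec82bdcd37bbfe62516ad0481f7/src/main/scala/part3typesdatasets/ComplexTypes.scala#L36),
I think it is more elegant and efficient to apply to_date once per pattern and wrap the results in coalesce, so each row gets the first pattern that parses successfully in a single pass over the DF: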
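import org.apache.spark.sql.functions.{coalesce, col, to_date}

// Patterns are tried in list order; coalesce keeps the first non-null result
val datePatterns = List("dd-MMM-yy", "yyyy-MM-dd")

moviesDF
  .select(
    col("Title"),
    col("Release_Date"),
    coalesce(
      datePatterns.map(to_date(col("Release_Date"), _)): _*
    ).as("Actual_Release")
  )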
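For anyone who wants to try this outside the course project, here is a minimal self-contained sketch of the same idea. The object name `CoalesceDatesSketch`, the helper `withActualRelease`, and the sample rows are made up for illustration and are not part of the repo. It assumes ANSI mode is off, since `to_date` is expected to return null for values that do not match a pattern only in that mode; the config line sets this explicitly.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{coalesce, col, to_date}

object CoalesceDatesSketch {
  // Patterns are tried in list order; the first one that parses wins.
  val datePatterns: List[String] = List("dd-MMM-yy", "yyyy-MM-dd")

  // Hypothetical helper: adds Actual_Release by trying every pattern on Release_Date.
  def withActualRelease(df: DataFrame): DataFrame =
    df.select(
      col("Title"),
      col("Release_Date"),
      coalesce(
        datePatterns.map(pattern => to_date(col("Release_Date"), pattern)): _*
      ).as("Actual_Release")
    )

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("coalesce-dates-sketch")
      .master("local[*]")
      // Assumption: with ANSI mode on, to_date may throw on bad input instead of returning null.
      .config("spark.sql.ansi.enabled", "false")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample rows, one per format plus one that matches neither pattern.
    val sampleDF = Seq(
      ("Movie A", "12-Jun-98"),
      ("Movie B", "2001-07-15"),
      ("Movie C", "not a date")
    ).toDF("Title", "Release_Date")

    withActualRelease(sampleDF).show()

    spark.stop()
  }
}
```

Rows that match none of the patterns end up with a null Actual_Release, so they can still be filtered out afterwards with `.where(col("Actual_Release").isNull)` if needed. How the two-digit year in "dd-MMM-yy" is resolved depends on the Spark version and the `spark.sql.legacy.timeParserPolicy` setting, so check the century on the parsed dates.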
Maybe this will be helpful for someone learning Spark.