spark-citus
How to use it with a DataFrame
Hi,
Do you have any sample code for the case where the data is a DataFrame with rows in it?
I don't have an example. Have you tried converting to / from an RDD, e.g. by calling .rdd on the DataFrame? See https://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.Dataset or https://spark.apache.org/docs/latest/sql-getting-started.html#interoperating-with-rdds
Hi Koeninger,
Our data is extracted from Postgres, transformed, and written back to it. When writing back, we want to write directly to the worker nodes. So the data is already in the form of a DataFrame.
- I tried converting the DataFrame to an IndexedSeq as in your example code, but it fails with java.lang.ArrayStoreException: org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema
- Is that the right way, or do we need a different case for data whose type is DataFrame?
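For reference, here is a minimal sketch of the conversion suggested above: going through .rdd and unpacking each Row into its column values. The exception names GenericRowWithSchema, which suggests whole Row objects are ending up where plain column values are expected, though that is a guess without the failing code. The object name, the helper toIndexedSeqs, and the sample columns are made up for illustration. The sketch assumes the example code wants one IndexedSeq[Any] per row, and it does not show any spark-citus call.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, Row, SparkSession}

object DataFrameToSeqSketch {
  // Hypothetical helper: unpack each Row into an IndexedSeq of its column
  // values, so no Row (GenericRowWithSchema) objects are passed downstream.
  def toIndexedSeqs(df: DataFrame): RDD[IndexedSeq[Any]] =
    df.rdd.map((row: Row) => row.toSeq.toIndexedSeq)

  def main(args: Array[String]): Unit = {
    // Local session for illustration only; a real job would use its own.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("df-to-seq-sketch")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data standing in for the rows read from Postgres.
    val df = Seq((1, "a"), (2, "b")).toDF("id", "name")

    val rows: RDD[IndexedSeq[Any]] = toIndexedSeqs(df)
    rows.collect().foreach(println)

    spark.stop()
  }
}
```

Whether this shape matches what the example code expects depends on how that code consumes its input, so treat it as a starting point rather than a confirmed fix.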