spark-bigquery
Different number of rows between read and write
Hi there,
I am getting different row counts when reading from BigQuery and writing to SQL Server. I tried it locally: if I set
.master("local"), the dataframe I read has 160k rows, but after the dataframe write I only get 80k rows in the SQL Server table.
If I run the same job with .master("local[*]"), the read and write counts match.
But when I run the code on the cluster with --master "yarn" --deploy-mode "cluster", I am still getting differences.
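
For reference, the job is essentially a BigQuery read followed by a JDBC write to SQL Server. A minimal sketch of what I mean (table names, connection string, and credentials below are placeholders, and the exact BigQuery read options may differ between connector versions):

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("bq-to-sqlserver")
  // .master("local") / .master("local[*]") when testing locally
  .getOrCreate()

// Read from BigQuery (placeholder table name; option names depend on the connector version).
val df = spark.read
  .format("bigquery")
  .option("table", "my_dataset.my_table")
  .load()

// Write to SQL Server over JDBC (placeholder host, database, and credentials).
df.write
  .format("jdbc")
  .option("url", "jdbc:sqlserver://my-host:1433;databaseName=my_db")
  .option("dbtable", "dbo.my_table")
  .option("user", "my_user")
  .option("password", "my_password")
  .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")
  .mode("append")
  .save()
```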
Do you have any idea what is happening? It looks like some partitions are not being written.
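
One check I could run to narrow this down (just a suggestion, using the same placeholder names as the sketch above): cache the dataframe and log the row count per partition before the write, then compare the total against the row count in SQL Server afterwards.

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("bq-to-sqlserver-debug")
  .getOrCreate()

// Cache so the counts below and the later write see the same data.
val df = spark.read
  .format("bigquery")
  .option("table", "my_dataset.my_table")
  .load()
  .cache()

println(s"Total rows read: ${df.count()}")
println(s"Partitions: ${df.rdd.getNumPartitions}")

// Row count per partition: if some write tasks fail or are skipped, the rows
// of those partitions would be the ones missing from the SQL Server table.
df.rdd
  .mapPartitionsWithIndex((idx, rows) => Iterator((idx, rows.size)))
  .collect()
  .foreach { case (idx, n) => println(s"partition $idx -> $n rows") }
```

After the write, a SELECT COUNT(*) on the SQL Server side (or reading the table back over JDBC and calling .count()) would show whether the missing rows line up with whole partitions.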
Best regards.