
Different read and write number of rows


Hi there,

I am getting different row counts when reading from BQ and writing to SQL Server. Running locally with .master("local"), the dataframe I read has 160k rows, but after the write I only get 80k rows in the SQL Server table.

If I run the same code with .master("local[*]"), I get the same number of rows in the read and the write.

But when I run the code on the cluster with --master yarn --deploy-mode cluster, I still get different counts.
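
Roughly the shape of my job, as a minimal sketch (the BigQuery table name, JDBC URL, and credentials below are placeholders, not my real values):

```python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .master("local")  # with "local" only ~80k rows arrive; "local[*]" matches locally
    .appName("bq-to-sqlserver")
    .getOrCreate()
)

# Read from BigQuery via the spark-bigquery connector
df = (
    spark.read.format("bigquery")
    .option("table", "project.dataset.source_table")  # placeholder
    .load()
)
print(df.count())  # ~160k rows here

# Write to SQL Server via Spark's built-in JDBC source
(
    df.write.format("jdbc")
    .option("url", "jdbc:sqlserver://host:1433;databaseName=db")  # placeholder
    .option("dbtable", "dbo.target_table")  # placeholder
    .option("user", "user")
    .option("password", "password")
    .mode("append")
    .save()
)
```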

Do you have any idea what is happening? It looks like some partitions are not being written.
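
For what it's worth, this is how I can check how the rows are spread across partitions just before the write (spark_partition_id is PySpark's built-in function for this):

```python
from pyspark.sql.functions import spark_partition_id

# Count rows per Spark partition before the write, to compare against
# what actually lands in the SQL Server table.
df.groupBy(spark_partition_id().alias("partition_id")) \
  .count() \
  .orderBy("partition_id") \
  .show()
```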

Best regards.

celoibarros, Jun 05 '20