
Unable to insert data into table with field partition

Open darshanmehta10 opened this issue 5 years ago • 2 comments

There are 2 parts of this issue:

1. When a BigQuery insert fails, the TableWriter class always assumes that the error is related to batch size and does not log it. E.g. this line needs to be present in the release candidates (currently it is only in master). While streaming data, we got this exception from BigQuery:

Caused by: com.google.api.client.googleapis.json.GoogleJsonResponseException: 400 Bad Request
{
  "code" : 400,
  "errors" : [ {
    "domain" : "global",
    "message" : "Streaming to metadata partition of column based partitioning table <project>:<dataset>.<table>$20191031 is disallowed.",
    "reason" : "invalid"
  } ],
  "message" : "Streaming to metadata partition of column based partitioning table <project>:<dataset>.<table>$20191031 is disallowed.",
  "status" : "INVALID_ARGUMENT"
}

However, the component never logged that exception. Instead, it logged the following:

Attempted to reduce batch size below 1

once it could not split the batches any further.
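To illustrate the behaviour being asked for, here is a minimal sketch that splits a batch only when the failure looks size-related, and otherwise logs and rethrows the original error. It is not the connector's TableWriter code: `BatchSplitSketch`, `InsertException`, `isSizeRelated` and the choice of HTTP 413 as the only size-related status are all assumptions made for the example.

```java
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch, not the connector's actual TableWriter.
final class BatchSplitSketch {

    /** Hypothetical stand-in for a failed insert; statusCode mirrors the HTTP status. */
    static final class InsertException extends RuntimeException {
        final int statusCode;

        InsertException(int statusCode, String message) {
            super(message);
            this.statusCode = statusCode;
        }
    }

    // Assumption: only 413 (Request Entity Too Large) is treated as size-related.
    static boolean isSizeRelated(InsertException e) {
        return e.statusCode == 413;
    }

    /** Tries to insert the rows; halves the batch only for size-related failures. */
    static <T> void write(List<T> rows, Consumer<List<T>> insert, List<String> log) {
        try {
            insert.accept(rows);
        } catch (InsertException e) {
            if (!isSizeRelated(e) || rows.size() <= 1) {
                // Surface the real cause instead of hiding it behind a batch-size message.
                log.add("insert failed: " + e.getMessage());
                throw e;
            }
            int mid = rows.size() / 2;
            write(rows.subList(0, mid), insert, log);
            write(rows.subList(mid, rows.size()), insert, log);
        }
    }
}
```

With this shape, a 400 response such as the one above would be logged once and rethrown immediately, rather than triggering repeated batch splits.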

2. Inserts into BigQuery fail when the table is partitioned by a field (rather than using legacy partitioning). This happens because of this implementation: it always appends the partition decorator, which eventually results in the error below:

Caused by: com.google.api.client.googleapis.json.GoogleJsonResponseException: 400 Bad Request
{
  "code" : 400,
  "errors" : [ {
    "domain" : "global",
    "message" : "Streaming to metadata partition of column based partitioning table <project>:<dataset>.<table>$20191031 is disallowed.",
    "reason" : "invalid"
  } ],
  "message" : "Streaming to metadata partition of column based partitioning table <project>:<dataset>.<table>$20191031 is disallowed.",
  "status" : "INVALID_ARGUMENT"
}

That logic needs to be conditional and controlled by a configuration parameter.
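A minimal sketch of what such conditional logic could look like, assuming a boolean setting decides whether the `$yyyyMMdd` decorator is appended. The class `PartitionNaming`, the method `targetTable` and the `useDecorator` flag are hypothetical names for illustration; they are not taken from the connector or from the PR.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not the connector's actual code.
final class PartitionNaming {

    // BASIC_ISO_DATE renders a LocalDate as yyyyMMdd, e.g. 20191031.
    private static final DateTimeFormatter DAY = DateTimeFormatter.BASIC_ISO_DATE;

    /**
     * Returns the table name to stream into.
     * useDecorator is an assumed configuration flag: true for legacy partitioned tables,
     * false for tables partitioned by a field, which reject decorated names.
     */
    static String targetTable(String baseTable, boolean useDecorator, LocalDate day) {
        if (!useDecorator) {
            return baseTable;
        }
        return baseTable + "$" + day.format(DAY);
    }
}
```

For a field-partitioned table the flag would be set to false, so rows are streamed to the bare table name and BigQuery routes them by the partitioning column.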

darshanmehta10, Oct 31 '19 16:10

+1 Experiencing both these issues too.

archy-bold, Nov 29 '19 17:11

@archy-bold This is fixed in PR https://github.com/wepay/kafka-connect-bigquery/pull/229

darshanmehta10, Dec 19 '19 11:12