gcp-ingestion
Recover original PubsubMessage when BigQuery inserts fail
As discussed in #215, the current implementation of BigQuery output leaves some things to be desired:
- we lose attributes for failedInserts, because we get back only the TableRow and not the original PubsubMessage
- we rely on the less performant BigQueryIO.writeTableRows method rather than passing BigQueryIO a function that converts PubsubMessage to TableRow, since the conversion function offers no hook for error handling
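
To make the second point concrete, here is a minimal sketch of the kind of error hook that is missing: a conversion step that catches its own failures and keeps the original message, attributes included, next to the error. It is plain Java with no Beam dependency. `Message`, `Result`, `convert`, the toy `key=value` payload format, and the `source` attribute are all hypothetical stand-ins for illustration, not code from this repository or from the Beam API.

```java
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class ConversionSketch {

  /** Hypothetical stand-in for PubsubMessage: a payload plus string attributes. */
  static final class Message {
    final byte[] payload;
    final Map<String, String> attributes;

    Message(byte[] payload, Map<String, String> attributes) {
      this.payload = payload.clone();
      this.attributes = Collections.unmodifiableMap(new HashMap<>(attributes));
    }
  }

  /** Outcome of converting one message; the original message is always retained. */
  static final class Result {
    final Map<String, Object> row; // stand-in for TableRow; null on failure
    final Message original;
    final String error; // null on success

    Result(Map<String, Object> row, Message original, String error) {
      this.row = row;
      this.original = original;
      this.error = error;
    }
  }

  /** Converts a toy "key=value" payload to a row, capturing failures instead of throwing. */
  static Result convert(Message message) {
    try {
      String text = new String(message.payload, StandardCharsets.UTF_8);
      int idx = text.indexOf('=');
      if (idx < 1) {
        throw new IllegalArgumentException("expected key=value payload");
      }
      Map<String, Object> row = new HashMap<>();
      row.put(text.substring(0, idx), text.substring(idx + 1));
      return new Result(row, message, null);
    } catch (RuntimeException e) {
      // The failed message travels onward with its attributes intact.
      return new Result(null, message, e.getMessage());
    }
  }

  public static void main(String[] args) {
    Map<String, String> attrs = Collections.singletonMap("source", "example");
    Result bad = convert(new Message("garbage".getBytes(StandardCharsets.UTF_8), attrs));
    System.out.println("error: " + bad.error);
    System.out.println("attributes kept: " + bad.original.attributes);
  }
}
```

In a real pipeline the failure branch would feed an error output rather than a return value; the sketch only shows that catching the exception at conversion time is what lets the attributes survive.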
Worth noting: by the step where we expect this to occur, the attributes we lose have already been copied into the message itself. So this is inconvenient, but it doesn't actually result in data loss for now.