Update BigQuerySinkMetrics for StreamingInserts.

Open JayajP opened this issue 1 year ago • 3 comments

Update the following metrics after we insert a batch of rows using BigQuery's InsertAll RPC.

1. RowsAppendedCount Counter

Tracks the status of BigQuery Rows after the batch of InsertAll RPCs is completed.

The metric has the following labels:

  • RowStatus: Status of the BigQuery rows after the InsertAll RPCs complete; one of SUCCESSFUL, RETRIED, or FAILED.
  • RpcStatus: RPC status.
  • TableId: 'datasets/{ }/tables/{ }' that the rows are sent to.

Row statuses are set according to the following criteria:

  • Rows that are successfully inserted will have their RowStatus set to SUCCESSFUL.
  • Rows that have to be retried because the RPC failed will have their RowStatus set to RETRIED and RpcStatus set to the failed RPC's status.
  • Rows might fail to be inserted even when the InsertAll RPC succeeds. These rows will have their RowStatus set to RETRIED and RpcStatus set to INTERNAL.
  • Rows that fail to be inserted will have their RowStatus set to FAILED and RpcStatus set to INTERNAL.
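
The criteria above can be sketched in plain Java. This is a minimal illustration, not the code from this change: `RowStatusCounters`, its methods, and the key format are hypothetical, and an in-memory map stands in for Beam's metrics. It assumes that successfully inserted rows carry an RpcStatus of OK, and that rows which are rejected and not retried are counted as FAILED with RpcStatus INTERNAL.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the RowsAppendedCount counter; not Beam's API. */
class RowStatusCounters {
  enum RowStatus { SUCCESSFUL, RETRIED, FAILED }

  // Counter values keyed by "rowStatus;rpcStatus;tableId" (illustrative key format).
  private final Map<String, Long> counts = new HashMap<>();

  private void inc(RowStatus rowStatus, String rpcStatus, String tableId, long n) {
    if (n > 0) {
      counts.merge(rowStatus + ";" + rpcStatus + ";" + tableId, n, Long::sum);
    }
  }

  /** The InsertAll RPC itself failed: every row in the batch is retried. */
  void recordFailedRpc(String tableId, String rpcStatus, long rows) {
    inc(RowStatus.RETRIED, rpcStatus, tableId, rows);
  }

  /** The InsertAll RPC succeeded, but some rows may still have been rejected. */
  void recordSuccessfulRpc(String tableId, long totalRows, long retriedRows, long failedRows) {
    // Assumption: successful rows are labeled with an RpcStatus of "OK".
    inc(RowStatus.SUCCESSFUL, "OK", tableId, totalRows - retriedRows - failedRows);
    inc(RowStatus.RETRIED, "INTERNAL", tableId, retriedRows);
    // Assumption: rows that are rejected and not retried are counted as FAILED.
    inc(RowStatus.FAILED, "INTERNAL", tableId, failedRows);
  }

  long get(RowStatus rowStatus, String rpcStatus, String tableId) {
    return counts.getOrDefault(rowStatus + ";" + rpcStatus + ";" + tableId, 0L);
  }
}
```

For example, a successful RPC carrying 10 rows, of which 2 are retried and 1 fails, would add 7 to the SUCCESSFUL/OK counter, 2 to RETRIED/INTERNAL, and 1 to FAILED/INTERNAL for that table.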

2. RpcRequests Counter

Tracks the RPC status of the InsertAll RPC.

The metric has the following labels:

  • Method: BigQuery sink method. Will be STREAMING_INSERTS for these metrics.
  • RpcStatus: Status of the InsertAll RPC.
  • TableId: 'datasets/{ }/tables/{ }' that the rows are sent to.
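
As a rough sketch of how a counter with these three labels could be updated around each call, the following plain-Java example wraps an RPC and counts it under the resulting status. Everything here is an assumption for illustration: `RpcRequestCounter`, `RpcException`, the "OK" status for successful calls, and the key format are hypothetical and are not Beam's API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Supplier;

/** Hypothetical RpcRequests counter keyed by its three labels; not Beam's API. */
class RpcRequestCounter {
  /** Hypothetical exception carrying a status string such as "UNAVAILABLE". */
  static class RpcException extends RuntimeException {
    final String status;

    RpcException(String status) {
      super(status);
      this.status = status;
    }
  }

  private static final String METHOD = "STREAMING_INSERTS";
  private final Map<String, LongAdder> counts = new ConcurrentHashMap<>();

  // Illustrative key format that concatenates the label values.
  private static String key(String rpcStatus, String tableId) {
    return "method:" + METHOD + ";rpc_status:" + rpcStatus + ";table_id:" + tableId;
  }

  /** Runs one InsertAll-style call and counts it under the resulting status. */
  <T> T call(String tableId, Supplier<T> rpc) {
    try {
      T result = rpc.get();
      counts.computeIfAbsent(key("OK", tableId), k -> new LongAdder()).increment();
      return result;
    } catch (RpcException e) {
      counts.computeIfAbsent(key(e.status, tableId), k -> new LongAdder()).increment();
      throw e; // the caller still decides whether to retry
    }
  }

  long get(String rpcStatus, String tableId) {
    LongAdder adder = counts.get(key(rpcStatus, tableId));
    return adder == null ? 0 : adder.sum();
  }
}
```

Counting in the `catch` block before rethrowing keeps the metric update independent of the caller's retry policy.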

3. RpcLatency Histogram

Tracks the latency of the InsertAll RPC.

The metric has the following labels:

  • Method: BigQuery sink method. Will be STREAMING_INSERTS for these metrics.
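
To illustrate what recording RPC latency in a histogram involves, here is a small self-contained sketch. `RpcLatencyHistogram`, its bucket bounds, and the millisecond unit are assumptions made for this example; it is not Beam's histogram class and does not reflect the bucket layout used by this change.

```java
import java.util.function.Supplier;

/** Hypothetical fixed-bucket latency histogram; not Beam's Histogram class. */
class RpcLatencyHistogram {
  private final long[] upperBoundsMs; // inclusive upper bounds, assumed ascending
  private final long[] bucketCounts;  // one extra slot at the end for overflow

  RpcLatencyHistogram(long... upperBoundsMs) {
    this.upperBoundsMs = upperBoundsMs.clone();
    this.bucketCounts = new long[upperBoundsMs.length + 1];
  }

  /** Adds one sample to the first bucket whose upper bound is >= the sample. */
  void update(long latencyMs) {
    int i = 0;
    while (i < upperBoundsMs.length && latencyMs > upperBoundsMs[i]) {
      i++;
    }
    bucketCounts[i]++;
  }

  /** Times one call with System.nanoTime and records its latency, even if it throws. */
  <T> T time(Supplier<T> rpc) {
    long start = System.nanoTime();
    try {
      return rpc.get();
    } finally {
      update((System.nanoTime() - start) / 1_000_000);
    }
  }

  long count(int bucket) {
    return bucketCounts[bucket];
  }

  long totalCount() {
    long total = 0;
    for (long c : bucketCounts) {
      total += c;
    }
    return total;
  }
}
```

Recording in a `finally` block means failed calls contribute latency samples too; whether that is desirable depends on what the histogram is meant to show.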

JayajP, Feb 14 '24 21:02

R: @m-trieu

JayajP, Feb 15 '24 18:02

Stopping reviewer notifications for this pull request: review requested by someone other than the bot, ceding control

github-actions[bot], Feb 15 '24 18:02

Thanks for the review, Martin. I switched StreamingInsertsResults to an AutoValue class, made the instance variables private, and added update methods for them.

JayajP, Feb 16 '24 22:02