
BigQuery join fails with more than 10GB of data

Open rculbertson opened this issue 4 months ago • 2 comments

Expected Behavior

Users should be able to join more than 10GB of data in BigQuery.

Current Behavior

If the joined data is more than 10GB, BigQuery fails with this error:

Response too large to return. Consider specifying a destination table in your job configuration. For more details, see https://cloud.google.com/bigquery/troubleshooting-errors at [139:1]

As stated in the error message and in the BigQuery docs here, you can avoid this error by specifying a destination table. Feast intentionally does not set one (see this line), as a workaround for another issue: BigQuery does not let you set a destination table when running a script. Instead, Feast writes the result to a temporary table and then, as a second step, copies the temporary table to a permanent table.
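For reference, here is a minimal sketch of what "specifying a destination table" looks like at the level of the BigQuery REST job configuration. It is not Feast's code. The helper name `query_job_body` and the project, dataset, and table names are hypothetical. In the Python client, the corresponding setting is `QueryJobConfig(destination=...)`. As noted above, this only applies to a single query statement, not to a script.

```python
import json


def query_job_body(sql: str, project: str, dataset: str, table: str) -> dict:
    """Build a jobs.insert request body that sends query results to a table.

    Hypothetical helper for illustration only; the field names follow the
    BigQuery REST API's JobConfigurationQuery resource.
    """
    return {
        "configuration": {
            "query": {
                "query": sql,
                "useLegacySql": False,
                # Results are written here instead of being returned inline.
                "destinationTable": {
                    "projectId": project,
                    "datasetId": dataset,
                    "tableId": table,
                },
                # Overwrite the destination table if it already exists.
                "writeDisposition": "WRITE_TRUNCATE",
            }
        }
    }


# Placeholder identifiers, not taken from the issue.
body = query_job_body(
    "SELECT 1 AS x", "my-project", "my_dataset", "joined_features"
)
print(json.dumps(body, indent=2))
```

Because the destination is part of the query job itself, this route does not help when the join is submitted as a multi-statement script, which is the constraint the temporary-table workaround was addressing.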

Steps to reproduce

Using BigQuery, call get_historical_features on any dataset where the resulting data is more than 10GB.

Specifications

  • Version: 0.36.0
  • Platform:
  • Subsystem:

Possible Solution

rculbertson avatar Mar 05 '24 14:03 rculbertson

Great, thanks for your feedback! We've been aware of this as well. Let me follow up with a PR.

sudohainguyen avatar Mar 06 '24 16:03 sudohainguyen

Thanks @sudohainguyen !

rculbertson avatar Mar 06 '24 16:03 rculbertson