ibis
feat(bigquery): support for `WITH AGGREGATION_THRESHOLD` in aggregations
Is your feature request related to a problem?
BigQuery customers can set aggregation threshold analysis rules to protect privacy-sensitive data. Once such rules are set up on a view, queries against it must include a `WITH AGGREGATION_THRESHOLD` clause:
SELECT WITH AGGREGATION_THRESHOLD
test_id, COUNT(DISTINCT last_name) AS student_count
FROM mydataset.ExamView
GROUP BY test_id;
(Example from https://cloud.google.com/bigquery/docs/analysis-rules#view_in_privacy_query)
Describe the solution you'd like
A new parameter on `Table.aggregate` and/or `Table.group_by` seems like the right place to add this.
Alternatively, there could be a new table expression type representing a thresholded table, created before the group-by is applied.
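Until something like that exists, one possible stopgap is to post-process the compiled SQL string. The following is only a rough sketch: `add_aggregation_threshold` is a hypothetical helper, not an ibis API, and it assumes the query text begins with the outermost `SELECT` (it does not handle queries that start with a CTE).

```python
import re

# Hypothetical helper, not part of ibis: it rewrites an already-compiled
# BigQuery SQL string so the outermost SELECT carries the clause.
_ALREADY_THRESHOLDED = re.compile(
    r"^\s*SELECT\s+WITH\s+AGGREGATION_THRESHOLD\b", re.IGNORECASE
)
_LEADING_SELECT = re.compile(r"^(\s*)SELECT\b", re.IGNORECASE)


def add_aggregation_threshold(sql: str) -> str:
    """Insert WITH AGGREGATION_THRESHOLD after the leading SELECT keyword."""
    if _ALREADY_THRESHOLDED.match(sql):
        # Nothing to do: the clause is already present.
        return sql
    new_sql, count = _LEADING_SELECT.subn(
        r"\1SELECT WITH AGGREGATION_THRESHOLD", sql, count=1
    )
    if count == 0:
        # Assumption: only queries that begin with SELECT are supported.
        raise ValueError("expected a query that starts with SELECT")
    return new_sql


query = (
    "SELECT\n"
    "  test_id, COUNT(DISTINCT last_name) AS student_count\n"
    "FROM mydataset.ExamView\n"
    "GROUP BY test_id"
)
print(add_aggregation_threshold(query))
```

The input string could come from something like `ibis.to_sql(expr, dialect="bigquery")`, but that pairing is untested here, and string rewriting is fragile compared with real support in the compiler.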
What version of ibis are you running?
N/A
What backend(s) are you using, if any?
BigQuery
Code of Conduct
- [X] I agree to follow this project's Code of Conduct
Can we think of it as an arbitrary query setting, similar to what ClickHouse has, for example?
I haven't used ClickHouse, but it looks pretty similar. ClickHouse appears to support general key/value settings, whereas BigQuery has an extra layer of syntax: each feature is enabled by its own clause, and each clause has its own sub-options.
There is a related privacy option scoped to the (sub)query, `SELECT [ WITH differential_privacy_clause ]`, which is documented as part of the general SELECT syntax.
I don't actually see `AGGREGATION_THRESHOLD` listed there, but from the examples, the `AGGREGATION_THRESHOLD` clause looks like it'd be parsed and scoped to the (sub)query in the same way.
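To make the difference in shape concrete, here is a small sketch of how the two styles might be modelled. Everything in it is hypothetical: `SelectModifier` and `clickhouse_settings` are illustrative names, not ibis APIs, and the option names (`threshold`, `privacy_unit_column`, `max_threads`) are my reading of the respective docs and should be checked there.

```python
from dataclasses import dataclass, field


@dataclass
class SelectModifier:
    """Hypothetical model of a BigQuery `SELECT WITH <feature>` clause."""

    feature: str
    options: dict = field(default_factory=dict)

    def render(self) -> str:
        # The feature name selects the clause; its sub-options are nested
        # inside OPTIONS(...) rather than being global key/value pairs.
        clause = f"WITH {self.feature}"
        if self.options:
            opts = ", ".join(f"{k}={v}" for k, v in self.options.items())
            clause += f" OPTIONS({opts})"
        return clause


def clickhouse_settings(settings: dict) -> str:
    """Render a flat, ClickHouse-style SETTINGS clause for comparison."""
    return "SETTINGS " + ", ".join(f"{k} = {v}" for k, v in settings.items())


threshold = SelectModifier(
    "AGGREGATION_THRESHOLD",
    {"threshold": 50, "privacy_unit_column": "student_id"},
)
print("SELECT " + threshold.render())
print(clickhouse_settings({"max_threads": 8}))
```

The point of the sketch is only that a flat settings dictionary is not quite enough for BigQuery: the modifier is attached to a specific (sub)query's `SELECT` and carries a feature name in addition to its options.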