malloy
malloy copied to clipboard
path-less aggregates on repeated records produce unexpected results
I don't yet have a reproducible test case, but here are some fragments from the Slack exchange. For some customers, the symmetric aggregate code is producing numbers which are clearly wrong. Not sure why
source: invoices is table('...') {
dimension: flexsave_amount is (ii.price * ii.quantity) / ii.usdXchange
}
source: flexsave is from (invoices -> {
group_by: customer_id, invoice_date_month
aggregate: flexsave_total is sum(flexsave_amount)
})
ii
is a repeated record, and the correct expression is aggregate: flexsave_total is ii.sum(flexsave_amount)
we probably should error on this instead of writing bad sql
@lloydtabb what would the test be to catch this and error on it ... is it the dimension:
which references repeated records
Look at all fields referenced in a SUM. Make sure the struct paths of all dimensional the entities referenced do not have "one to many joins" relative to the struct path of the sum.
I could also catch this in malloy_query...
it would extend to any any aggregate which would resort to symmetric magic to compute the aggregation ... not sure how this should be reflected into the future function magic that @christopherswenson is working on.
I think this is the same issue I ran into when trying to build a sample analysis on the GA4 public dataset. The queries in this file should provide an example where the issue is reproducible.