cubes icon indicating copy to clipboard operation
cubes copied to clipboard

how to do conditional aggregation with expression

Open ghost opened this issue 9 years ago • 9 comments

Hi,

I am trying to use the expression in aggregate to do some conditional aggregation.

For example: {"name": "daily_volume_1", "expression": "sum(case when flag == 1 then volume else 0 end)"}

Thanks in advance!

ghost avatar Nov 20 '15 17:11 ghost

We have a similar scenario. Roughly, our data is somewhat like this:

movement      amount
in               5
out              2

Therefore, when I aggregate to get final amount, the final amount should be (+5)+(-2). So we would like to have something like:

"expression": "sum(case when movement == 'in' then amount else (-1*amount) end)"

Is it achievable? What should we do if it doesn't exist and we want to contribute?

gigix avatar Dec 04 '15 09:12 gigix

+1

cuipengfei avatar Dec 07 '15 14:12 cuipengfei

+1

season-py avatar Dec 18 '15 02:12 season-py

@yinhao: In your example: {"name": "daily_volume_1", "expression": "sum(case when flag == 1 then volume else 0 end)"} You can use to function "sign" in expression:

in measure:  {"name": "daily_volume_1", "expression": "volume*sign(flag)"}
in aggregate: { "function": "sum", "measure":"daily_volume_1"  }

About "sign" function

sign(flag) = 1 if flag > 0, =-1 if flag < 0 and =0 if flag =0

namndev avatar Jan 12 '16 02:01 namndev

anybody found the solution?

russelmahmud avatar Apr 02 '17 18:04 russelmahmud

@Stiivi isn't it possible at all to add case condition in measure's expression? Most of the databases support case statement and from the code it seems that cubes supports it. For few days, i was trying to make it work but couldn't find any solution and there isn't any example in the web.

russelmahmud avatar Apr 02 '17 19:04 russelmahmud

@russelmahmud it is possible to add it. One intentional limitation of the expressions is that they should be plain arithmetic expressions with literals, variables, binary/unary operators and functions. All special SQL constructs have to be converted to one or combination of multiple of the mentioned elements.

For the problem of conditional aggregation of measure I thin we sohuld add an if function as it is common in spreadsheets: if(test, true_expression, false_expression).

The most correct place is to add it somewhere around here in the cubes.sql.expressions module.

The function should look like something like this:

def if_then_else(test: ColumnElement, true_value: ColumnElement, false_value: ColumnElement) → ColumnElement:
    return sa.case([(test, true_value)], else_:false_value)

See also the SQLAlchemy case() function

I will add it later, when I'm done with current 2.0 refactoring. Pull-requests are welcome as well.

Stiivi avatar Apr 02 '17 19:04 Stiivi

I have initiated this PR https://github.com/DataBrewery/cubes/pull/433

russelmahmud avatar Apr 03 '17 07:04 russelmahmud

Solved with #433 in 1.x. Needs porting to 2.0.

Stiivi avatar Apr 03 '17 17:04 Stiivi