systemds icon indicating copy to clipboard operation
systemds copied to clipboard

[SYSTEMDS-3413] Add support for row/col aggregation to countDistinct builtin function

Open BACtaki opened this issue 3 years ago • 3 comments

This patch converts countDistinct() from a non-parameterized builtin to a parameterized builtin function to allow for 1 new parameter: dir for direction. The value of dir can be r and c, denoting row-wise and column-wise aggregation respectively. This patch only implements CP and the SP case will throw a NotImplementedException()- the latter case will be addressed in a subsequent patch.

BACtaki avatar Aug 03 '22 20:08 BACtaki

@phaniarnab @Baunsgaard @j143, would love your feedback!

BACtaki avatar Aug 03 '22 20:08 BACtaki

I will look into this @BACtaki. Meanwhile, please check why the tests are failing.

phaniarnab avatar Aug 08 '22 11:08 phaniarnab

I will look into this @BACtaki. Meanwhile, please check why the tests are failing.

Thanks @phaniarnab ! Ack, let me look into the test failures.

BACtaki avatar Aug 09 '22 16:08 BACtaki

I like the PR, i do not see in the code why the tests would fail so i will force them to restart,

Thanks for reviewing @Baunsgaard and @phaniarnab ! Re the test failures, the following careless edit was the culprit:

@@ -2361,10 +2375,6 @@ public class DMLTranslator
 		case SUM:
 		case PROD:
 		case VAR:
-		case COUNT_DISTINCT:
-			currBuiltinOp = new AggUnaryOp(target.getName(), DataType.SCALAR, target.getValueType(),
-					AggOp.valueOf(source.getOpCode().name()), Direction.RowCol, expr);
-			break;

is the method supposed to be: countDistinctCol(X) or countDistinct(X, "c") Personally i would prefer if we had both, maybe @mboehm7 have an opinion?

That is a great idea! I have added countDistinctRow and countDistinctCol aliases for countDistinct(X, dir="r") and countDistinct(X, dir="c") respectively.

BACtaki avatar Oct 24 '22 02:10 BACtaki

Thanks for the PR, while merging i changed a wildcard to individual imports. we should avoid wildcard imports.

Baunsgaard avatar Oct 28 '22 12:10 Baunsgaard