plywood
unsupported aggregate action while running .collect()
Hi, I'm not able to run .collect(). I get the following error:
unsupported aggregate action
Digging through the source code, I found there is no handler for collect. There are corresponding countToAggregation, countDistinctToAggregation, etc., but no collectToAggregation. Is that the issue?
https://github.com/implydata/plywood/blob/master/src/external/utils/druidAggregationBuilder.ts#L190
@lastlegion Could you post your query (make it more generic if you need to)?
You should be running collect on a DATASET type object. It will hit https://github.com/implydata/plywood/blob/master/src/expressions/baseExpression.ts#L1431 to use a CollectExpression.
I may be able to help out more if I understand your exact query.
I've been trying different variants of the following query:
var ex4 = ply().apply("d", $("dataset").filter($("field2").in(0,1)))
.apply("dd2", $("d").collect("$field2"))
or
var ex4 = ply().apply("d", $("dataset").filter($("field2").in(0,1)))
.apply("dd", $("d").select("field1", "field2"))
.apply("dd2", $("dd").collect("$field2"))
and I get the following error:
(node:55886) UnhandledPromiseRejectionWarning: Unhandled promise rejection (rejection id: 1): Error: unsupported aggregate action $__SEGMENT__:DATASET.collect($HR:NUMBER) (as __VALUE__)
I tried replacing collect with other aggregation functions like count, etc., and was able to get the right results. Thanks @robertervin for your help!
@lastlegion You're completely correct on this. I also ran into this issue when I tried running it.
What you want to do instead is first split by field2. This will issue a groupBy query to Druid, which is much more efficient than pulling all field2 values into memory in JavaScript and then making them distinct.
So your first query would turn into
var ex4 = ply()
  .apply("d", $("dataset")
    .filter($("field2").in(0,1))
  )
  .apply("dd2", $("d")
    .split({id: "$field2"})
    .collect("$id")
  )
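To illustrate why splitting first helps, here is a minimal plain-JavaScript sketch of the idea. It does not call Plywood or Druid and is not how Plywood is implemented; the sample rows and the helper names splitBy and collect are hypothetical stand-ins for the filtered dataset and for the .split() and .collect() steps.

```javascript
// Plain-JavaScript illustration only; no Plywood or Druid calls are made here.
// `rows` is hypothetical sample data standing in for the filtered dataset "d".
const rows = [
  { field1: "a", field2: 0 },
  { field1: "b", field2: 1 },
  { field1: "c", field2: 0 },
  { field1: "d", field2: 1 },
];

// Group rows by one field, keeping a single output row per distinct value.
// This stands in for what .split({id: "$field2"}) asks the backend to do.
function splitBy(data, field) {
  const groups = new Map();
  for (const row of data) {
    if (!groups.has(row[field])) {
      groups.set(row[field], { id: row[field] });
    }
  }
  return Array.from(groups.values());
}

// Gather one field from every row into an array, standing in for .collect("$id").
function collect(data, field) {
  return data.map((row) => row[field]);
}

// Split first, then collect: only one row per distinct field2 value is collected.
const ids = collect(splitBy(rows, "field2"), "id");
console.log(ids); // [ 0, 1 ]
```

Collecting directly from rows would return all four field2 values, duplicates included. Collecting after the split touches only one row per distinct value, and that grouping is the part a groupBy query can do on the Druid side.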
Perhaps @vogievetsky may be able to shed some light on why Plywood doesn't perform this by default, or on whether my reasoning is correct.
Thanks for your help! Yes, I'm able to get the desired output by applying .split() before .collect(). If this is a bug, I'm willing to put in a pull request or help with documenting it.
@lastlegion You're welcome! I'm not actually a member of Imply, so I can't accept any PRs or anything. I do think the documentation is fairly good, but it could definitely be improved.
You can try submitting a PR for it in https://github.com/implydata/plywood/blob/master/docs/expressions.md and calling out @vogievetsky (who I believe is the sole owner of this codebase).
I'd appreciate it if you could close this issue as well, since it's fixed.
Yes, I agree the documentation is really good! I'm not sure whether this particular case is the desired behavior, so I'll keep it open for @vogievetsky to comment on.