rmongodb
rmongodb aggregate fails on group by time field
Hi,
I have an aggregation pipeline grouping document that is working fine in the mongo shell, but not in rmongodb.
A sample document (after the match and unwind stages) is pretty simple. It has a time (t), a value (v), and a time series identifier (ts):
{
"_id": {
"str": "55093a7ea99062278a877f9e"
},
"ts": {
"str": "55093a7ea99062278a877f9d"
},
"t": "2015-03-18T06:59:38.000Z",
"v": 4928737
}
The JSON for the group stage just computes the average value for each hour of the day, for each time series.
{
"$group": {
"_id": {
"ts":"$ts",
"hourofday":{"$hour":"$t"}
},
"v":{"$avg":"$v"}
}
}
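To make the intended result concrete, here is a minimal client-side sketch of the same grouping in base R. It does not use rmongodb at all, and the rows in `docs` are invented sample values (assumptions, not data from the real collection). The hour is taken in UTC, which is what `$hour` does by default.

```r
# Hypothetical sample rows shaped like the document above (values invented).
docs <- data.frame(
  ts = c("series_a", "series_a", "series_a", "series_b"),
  t  = as.POSIXct(c("2015-03-18 06:59:38", "2015-03-18 06:10:00",
                    "2015-03-18 07:05:00", "2015-03-18 06:30:00"),
                  tz = "UTC"),
  v  = c(10, 20, 30, 40),
  stringsAsFactors = FALSE
)

# Equivalent of {"$hour": "$t"}: hour of the day (0-23) in UTC.
docs$hourofday <- as.POSIXlt(docs$t, tz = "UTC")$hour

# Equivalent of grouping on {ts, hourofday} and averaging v.
hourly <- aggregate(v ~ ts + hourofday, data = docs, FUN = mean)
print(hourly)
```

This only works when the matched documents fit comfortably in memory; the point of the `$group` stage is to have the server do this work instead.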
When I try it with rmongodb, it fails with error code 10.
This doesn't work:
group.bson = mongo.bson.from.JSON('{"$group":{"_id":{"ts":"$ts","hourofday":{"$hour":"$t"}},"v":{"$avg":"$v"}}}')
But if I remove the "hourofday" part of the _id, it runs, although it no longer gives the result I need:
This works:
group.bson = mongo.bson.from.JSON('{"$group":{"_id":{"ts":"$ts"},"v":{"$avg":"$v"}}}')
I've tried constructing this with buffers, but that made no difference. The BSON that is output looks correct to me:
$group : 3
  _id : 3
    ts : 2 $ts
    hourofday : 3
      $hour : 2 $t
  v : 3
    $avg : 2 $v
@nathanwebb, can you post the unwind query you have working under 1.8.0 of rmongodb? I'm just getting started with this package and every unwind I try fails. I also get the group related errors whenever I'm using something that references a field via quoting, "$hour" for instance. I suspect some sort of error related to fields that need to be quoted.
Even using the example zips file fails (error code 10) when running the commands:
pipe_test <- mongo.bson.from.JSON('{ "$unwind": "$city"}')
cmd_list <- list(pipe_test)
ns <- "rmongodb.zips"
res <- mongo.aggregation(m, ns, cmd_list)
result <- mongo.bson.value(res, "result")
My unwind was almost identical to that and worked fine with 1.8.0. Just a guess, but I don't think the "city" field in the zips dataset is an array, so it might be failing because of that.
unwind_pipe <- mongo.bson.from.JSON('{"$unwind": "$point"}')
To get the $hour operator to work, I've decided to make a system call to a mongo shell script, which does the aggregation and prints the results as CSV to stdout. That was a bit of a pain, and I suspect it is slower, but it works well.
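For anyone wanting to try the same workaround, a rough sketch of the R side could look like the following. The script name `hourly_avg.js`, the database name, and both function names are hypothetical; the sketch assumes the legacy `mongo` shell is on the PATH and that the script prints a header row followed by CSV lines.

```r
# Parse CSV text printed by a mongo shell script into a data frame.
parse_shell_csv <- function(lines) {
  read.csv(text = paste(lines, collapse = "\n"), stringsAsFactors = FALSE)
}

# Run a (hypothetical) aggregation script with the mongo shell and
# capture its stdout. --quiet suppresses the shell's startup banner
# so that only the script's own output is returned.
run_shell_aggregation <- function(script = "hourly_avg.js", db = "rmongodb") {
  out <- system2("mongo", args = c("--quiet", db, script), stdout = TRUE)
  parse_shell_csv(out)
}
```

Keeping the parsing separate from the shell call makes it possible to check the CSV handling without a running MongoDB instance.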
(Sent by email in reply to David F. Severski's comment above, 30 March 2015 at 06:51.)
Interesting issue. The BSON looks correct to me as well. Can you please provide some data (a BSON dump) to play with?