rmongodb aggregate fails on group by time field

Open nathanwebb opened this issue 9 years ago • 3 comments

Hi,

I have an aggregation pipeline grouping document that is working fine in the mongo shell, but not in rmongodb.

A sample document (after running through the match and unwind stages) is pretty simple. There is a time (t), a value (v), and a time series identifier (ts):

    {
      "_id": {
        "str": "55093a7ea99062278a877f9e"
      },
      "ts": {
        "str": "55093a7ea99062278a877f9d"
      },
      "t": "2015-03-18T06:59:38.000Z",
      "v": 4928737
    }

The JSON for the group stage just computes the average value per hour of day for each time series.

    {
      "$group": {
        "_id": {
          "ts": "$ts",
          "hourofday": {"$hour": "$t"}
        },
        "v": {"$avg": "$v"}
      }
    }
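For cross-checking results, the same grouping can be sketched client-side in base R. This is only an illustration of what the stage computes, not a fix for the rmongodb error. It assumes the documents have already been pulled into a data frame; the `docs` data frame, its identifiers, and its values are made up for the example. The hour is taken in UTC, which is what `$hour` uses by default.

```r
# Hypothetical sample data shaped like the documents above (values are made up).
docs <- data.frame(
  ts = c("a", "a", "a", "b"),
  t  = c("2015-03-18T06:59:38.000Z", "2015-03-18T06:10:00.000Z",
         "2015-03-18T07:05:00.000Z", "2015-03-18T06:30:00.000Z"),
  v  = c(10, 20, 30, 40),
  stringsAsFactors = FALSE
)

# Parse the ISO 8601 timestamps as UTC and extract the hour (0-23).
# strptime ignores the trailing "Z" once the format has been matched.
docs$hourofday <- strptime(docs$t, "%Y-%m-%dT%H:%M:%OS", tz = "UTC")$hour

# Average v for each (ts, hourofday) pair, mirroring the $group stage.
hourly <- aggregate(v ~ ts + hourofday, data = docs, FUN = mean)
print(hourly)
```

Comparing a table like `hourly` against the mongo shell output is one way to confirm that a workaround returns the same averages.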

When I try it with rmongodb, it fails with error code 10.

This doesn't work:

group.bson = mongo.bson.from.JSON('{"$group":{"_id":{"ts":"$ts","hourofday":{"$hour":"$t"}},"v":{"$avg":"$v"}}}')

But if I remove the "hourofday" part of the _id, then it works, although it no longer gives the right result:

This works:

group.bson = mongo.bson.from.JSON('{"$group":{"_id":{"ts":"$ts"},"v":{"$avg":"$v"}}}')

I've tried constructing this with buffers, but that made no difference. The BSON that is output looks correct to me:

    $group : 3   
        _id : 3      
            ts : 2   $ts
            hourofday : 3    
                $hour : 2    $t


        v : 3    
            $avg : 2     $v

nathanwebb, Mar 24 '15 05:03

@nathanwebb, can you post the unwind query you have working under rmongodb 1.8.0? I'm just getting started with this package, and every unwind I try fails. I also get the group-related errors whenever I use something that references a field in quotes, "$hour" for instance. I suspect some sort of error related to fields that need to be quoted. Even the example zips dataset fails (error code 10) when running these commands:

pipe_test <- mongo.bson.from.JSON('{ "$unwind": "$city"}')
cmd_list <- list(pipe_test)
mongo.aggregation(m, "rmongodb.zips", cmd_list)
res <- mongo.aggregation(m, ns, cmd_list)
result <- mongo.bson.value(res, "result")

davidski, Mar 29 '15 19:03

My unwind was almost identical to that and worked fine with 1.8.0. Just a guess, but I don't think the "city" field in the zips dataset is an array, so it might be failing because of that.

unwind_pipe <- mongo.bson.from.JSON('{"$unwind": "$point"}')

To get the $hour operator to work, I've decided to make a system call to a mongo shell script, which runs the aggregation and prints the results as CSV to stdout. That was a bit of a pain, and I suspect it is slower, but it works well.
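A rough sketch of that kind of shell-out from R, using only base functions, might look like the following. The function names, the script path, the database name, and the CSV layout (a header line followed by one line per group) are all assumptions made for illustration; the actual script is not shown in this thread.

```r
# Turn captured stdout lines into a data frame.
parse_csv_lines <- function(lines) {
  read.csv(text = paste(lines, collapse = "\n"), stringsAsFactors = FALSE)
}

# Run a mongo shell script and return its CSV output as a data frame.
# `script_path` is a hypothetical .js file that runs the aggregation and
# print()s a CSV header followed by one CSV line per result.
run_mongo_csv <- function(script_path, db = "test") {
  # --quiet suppresses the shell's startup banner so only the CSV is captured.
  lines <- system2("mongo", c("--quiet", db, script_path), stdout = TRUE)
  parse_csv_lines(lines)
}

# Example with made-up lines standing in for the shell's stdout.
example <- parse_csv_lines(c("ts,hourofday,v",
                             "55093a7ea99062278a877f9d,6,4928737"))
print(example)
```

With a real script in place, the call would be something like `run_mongo_csv("hourly_avg.js", db = "mydb")`, where both names are placeholders.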

nathanwebb, Mar 30 '15 01:03

Interesting issue. The BSON looks correct to me as well. Can you please provide some data (a BSON dump) to play with?

dselivanov, Apr 02 '15 10:04