rmongodb
Use rmongodb data types to read BSON objects from stdin in R
Hey guys. I am trying to connect Hadoop Streaming with R, and I thought the data types from rmongodb might help me out.
This is the idea:
Hadoop Streaming[hadoop-mongo-connector] -> mapper.py -> reducer.R
The mapper is really straightforward, using the implementation from pymongo_hadoop; see https://github.com/mongodb/mongo-hadoop/tree/master/streaming/language_support/python
In the reducer I want something like iterating over stdin:
conn <- file("stdin", open="r")
buf <- mongo.bson.buffer.create()
# R does not allow this because conn is not the correct data type
mongo.bson.buffer.append.raw( conn )
# iterate over buf
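One direction I can think of, as a rough sketch only: every BSON document starts with a 4-byte little-endian int32 giving its total size, so the stream could be opened in binary mode and split into one raw vector per document with base R. This assumes the mapper writes plain concatenated BSON documents to stdout. `read_bson_raw` is a name I made up, not part of rmongodb, and turning each raw vector into a mongo.bson object is the part that is still missing.

```r
# Hypothetical helper (not an rmongodb function): read one BSON document
# from a binary connection and return it as a raw vector, or NULL at end of stream.
read_bson_raw <- function(con) {
  # The first 4 bytes are a little-endian int32 holding the total document size.
  header <- readBin(con, what = "raw", n = 4L)
  if (length(header) < 4L) return(NULL)
  total <- readBin(header, what = "integer", n = 1L, size = 4L, endian = "little")
  # The remaining (total - 4) bytes complete the document.
  body <- readBin(con, what = "raw", n = total - 4L)
  if (length(body) < total - 4L) stop("truncated BSON document")
  c(header, body)
}

# Open stdin in binary mode, since BSON is not line-oriented text.
conn <- file("stdin", open = "rb")
while (!is.null(doc <- read_bson_raw(conn))) {
  # doc is a raw vector holding exactly one BSON document
  message("read a document of ", length(doc), " bytes")
}
close(conn)
```

This only handles the framing; the raw bytes would still have to be decoded somehow.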
Does anyone out there have a smart idea?