elasticsearch-river-mongodb

River isn't indexing from mongoDB

Open akluffy opened this issue 9 years ago • 7 comments

Hi,

A month ago I got it working on a single AWS EC2 instance with MongoDB 3.0.4, Elasticsearch 1.6.0, and River 2.0.9 installed.

However, when I deployed MongoDB and Elasticsearch on different AWS EC2 instances, it stopped working. Versions: MongoDB 3.0.4, Elasticsearch 1.6.0, River 2.0.9, Mapper Attachments Type 2.7.0 and 2.6.0 (tried both).

Note: the connection itself is fine. The command "mongo 52.24.225.108@mydatabase" works.

The river config is below: [screenshot, 2015-07-11]

akluffy avatar Jul 11 '15 08:07 akluffy

To test and reproduce the bug, I ran another test, so please disregard the config screenshot above.

Case 1: MongoDB and Elasticsearch installed on the same machine, M1 (AWS EC2, Ubuntu). M1's IP is 52.27.8.35. In this case, it works fine.
Case 2: Elasticsearch installed on a different machine, M2, with a river created from M2 (Elasticsearch) to M1 (MongoDB). This case doesn't work.

Elasticsearch's configuration is the same in both cases:

curl -XPUT localhost:9200/_river/test/_meta -d '{
  "type": "mongodb",
  "mongodb": {
    "servers": [
      { "host": "52.27.8.35", "port": 27017 }
    ],
    "db": "test",
    "collection": "random",
    "options": { "secondary_read_preference": true },
    "gridfs": false
  },
  "index": {
    "name": "test",
    "type": "random"
  }
}'

[screenshot, 2015-07-11]
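For anyone who would rather build the river definition in code than hand-quote JSON inside a curl command, here is a minimal sketch. The function name `build_river_meta` and its parameters are hypothetical, made up for illustration; only the field names are taken from the config above. It just builds the body that would be PUT to `/_river/test/_meta` and does not send anything.

```python
import json


def build_river_meta(host, port, db, collection, index_name, index_type,
                     secondary_read_preference=None):
    """Build a river _meta body with the same fields as the curl example.

    Hypothetical helper for illustration; it is not part of the river plugin.
    """
    mongodb = {
        "servers": [{"host": host, "port": port}],
        "db": db,
        "collection": collection,
        "gridfs": False,
    }
    # Only emit "options" when a value was explicitly requested.
    if secondary_read_preference is not None:
        mongodb["options"] = {
            "secondary_read_preference": secondary_read_preference
        }
    return {
        "type": "mongodb",
        "mongodb": mongodb,
        "index": {"name": index_name, "type": index_type},
    }


def river_meta_json(*args, **kwargs):
    """Serialize the river definition to a JSON string for the request body."""
    return json.dumps(build_river_meta(*args, **kwargs), indent=2)
```

Calling `river_meta_json("52.27.8.35", 27017, "test", "random", "test", "random", secondary_read_preference=True)` should give a body equivalent to the one in the curl command.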

Let me show two different logs here.

First, this is the log for case 1:

[2015-07-12 05:26:34,448][INFO ][cluster.metadata ] [Man-Beast] [_river] creating index, cause [auto(index api)], templates [], shards [1]/[1], mappings [test]
[2015-07-12 05:26:34,490][INFO ][cluster.metadata ] [Man-Beast] [_river] update_mapping test
[2015-07-12 05:26:34,491][INFO ][river ] [Man-Beast] rivers have been deprecated. Read https://www.elastic.co/blog/deprecating_rivers
[2015-07-12 05:26:34,492][INFO ][org.elasticsearch.river.mongodb.MongoDBRiver] MongoDB River Plugin - version[2.0.9] - hash[73ddea5] - time[2015-04-06T21:16:46Z]
[2015-07-12 05:26:34,492][INFO ][river.mongodb.util ] setRiverStatus called with test - RUNNING
[2015-07-12 05:26:34,493][INFO ][org.elasticsearch.river.mongodb.MongoDBRiver] River test startup pending
[2015-07-12 05:26:34,495][INFO ][org.elasticsearch.river.mongodb.MongoDBRiver] Starting river test
[2015-07-12 05:26:34,496][INFO ][org.elasticsearch.river.mongodb.MongoDBRiver] MongoDB options: secondaryreadpreference [true], drop_collection [false], include_collection [], throttlesize [5000], gridfs [false], filter [null], db [test], collection [random], script [null], indexing to [test]/[random]
[2015-07-12 05:26:34,512][INFO ][cluster.metadata ] [Man-Beast] [test] creating index, cause [api], templates [], shards [5]/[1], mappings []
[2015-07-12 05:26:34,577][INFO ][org.elasticsearch.river.mongodb.MongoConfigProvider] MongoDB version - 3.0.4
[2015-07-12 05:26:34,600][INFO ][org.elasticsearch.river.mongodb.CollectionSlurper] MongoDBRiver is beginning initial import of test.random
[2015-07-12 05:26:34,601][INFO ][org.elasticsearch.river.mongodb.CollectionSlurper] Number of documents indexed in initial import of test.random: 55
[2015-07-12 05:26:34,643][INFO ][cluster.metadata ] [Man-Beast] [_river] update_mapping test
[2015-07-12 05:26:34,643][INFO ][cluster.metadata ] [Man-Beast] [test] update_mapping random
[2015-07-12 05:26:34,660][INFO ][cluster.metadata ] [Man-Beast] [test] update_mapping random
[2015-07-12 05:26:34,672][INFO ][cluster.metadata ] [Man-Beast] [test] update_mapping random
[2015-07-12 05:26:34,685][INFO ][cluster.metadata ] [Man-Beast] [_river] update_mapping test
[2015-07-12 05:26:35,101][INFO ][org.elasticsearch.river.mongodb.MongoDBRiver] Started river test

And this is the log for case 2:

[2015-07-12 05:40:45,126][INFO ][cluster.metadata ] [Drax the Destroyer] [_river] creating index, cause [auto(index api)], templates [], shards [1]/[1], mappings [test]
[2015-07-12 05:40:45,200][INFO ][cluster.metadata ] [Drax the Destroyer] [_river] update_mapping test
[2015-07-12 05:40:45,201][INFO ][river ] [Drax the Destroyer] rivers have been deprecated. Read https://www.elastic.co/blog/deprecating_rivers
[2015-07-12 05:40:45,202][INFO ][org.elasticsearch.river.mongodb.MongoDBRiver] MongoDB River Plugin - version[2.0.9] - hash[73ddea5] - time[2015-04-06T21:16:46Z]
[2015-07-12 05:40:45,202][INFO ][river.mongodb.util ] setRiverStatus called with test - RUNNING
[2015-07-12 05:40:45,208][INFO ][org.elasticsearch.river.mongodb.MongoDBRiver] River test startup pending
[2015-07-12 05:40:45,211][INFO ][cluster.metadata ] [Drax the Destroyer] [_river] update_mapping test
[2015-07-12 05:40:45,216][INFO ][org.elasticsearch.river.mongodb.MongoDBRiver] Starting river test
[2015-07-12 05:40:45,217][INFO ][org.elasticsearch.river.mongodb.MongoDBRiver] MongoDB options: secondaryreadpreference [true], drop_collection [false], include_collection [], throttlesize [5000], gridfs [false], filter [null], db [test], collection [random], script [null], indexing to [test]/[random]
[2015-07-12 05:40:45,237][INFO ][cluster.metadata ] [Drax the Destroyer] [test] creating index, cause [api], templates [], shards [5]/[1], mappings []
[2015-07-12 05:40:45,354][INFO ][cluster.metadata ] [Drax the Destroyer] [_river] update_mapping test

Note: Connection is perfect

You can try the command: mongo 52.27.8.35/test. What really confuses me is that it works on a single machine but not when the two are on separate machines. Why? Is it a version problem?

akluffy avatar Jul 11 '15 08:07 akluffy

see here https://github.com/richardwilly98/elasticsearch-river-mongodb/issues/548#issuecomment-122620187

twistedfategit avatar Jul 19 '15 03:07 twistedfategit

I just got it running on CentOS with ES version 1.6.0 and MongoDB version 3.0.2. Remove the "options": {...} line. Maybe the secondary node has no oplog.

hzm1029 avatar Jul 20 '15 09:07 hzm1029

@hzm1029 Still doesn't work after deleting "options"

akluffy avatar Jul 25 '15 18:07 akluffy

I ran into this problem too: Elasticsearch 1.4.2, MongoDB 3.0, river plugin 2.0.9.

I've tried downgrading the MongoDB and river plugin versions, but none of that works.

It seems to stop at connecting to MongoDB, even though connecting with mongo host:port/testmongo works. Here's my log:

[2015-08-12 03:40:43,067][INFO ][node                     ] [Doctor Leery] version[1.4.2], pid[1], build[927caff/2014-12-16T14:11:12Z]
[2015-08-12 03:40:43,068][INFO ][node                     ] [Doctor Leery] initializing ...
[2015-08-12 03:40:43,113][INFO ][plugins                  ] [Doctor Leery] loaded [mapper-attachments, mongodb-river], sites [river-mongodb]
[2015-08-12 03:40:45,388][INFO ][node                     ] [Doctor Leery] initialized
[2015-08-12 03:40:45,392][INFO ][node                     ] [Doctor Leery] starting ...
[2015-08-12 03:40:45,498][INFO ][transport                ] [Doctor Leery] bound_address {inet[/0:0:0:0:0:0:0:0:9300]}, publish_address {inet[/172.17.0.48:9300]}
[2015-08-12 03:40:45,524][INFO ][discovery                ] [Doctor Leery] elasticsearch/WU5DqE_HRqGwNI1mpUwtWQ
[2015-08-12 03:40:49,292][INFO ][cluster.service          ] [Doctor Leery] new_master [Doctor Leery][WU5DqE_HRqGwNI1mpUwtWQ][ca838da745f2][inet[/172.17.0.48:9300]], reason: zen-disco-join (elected_as_master)
[2015-08-12 03:40:49,314][INFO ][http                     ] [Doctor Leery] bound_address {inet[/0:0:0:0:0:0:0:0:9200]}, publish_address {inet[/172.17.0.48:9200]}
[2015-08-12 03:40:49,314][INFO ][node                     ] [Doctor Leery] started
[2015-08-12 03:40:49,318][INFO ][gateway                  ] [Doctor Leery] recovered [0] indices into cluster_state
[2015-08-12 03:41:25,830][INFO ][cluster.metadata         ] [Doctor Leery] [_river] creating index, cause [auto(index api)], shards [1]/[1], mappings [mongodb]
[2015-08-12 03:41:26,147][INFO ][cluster.metadata         ] [Doctor Leery] [_river] update_mapping [mongodb] (dynamic)
[2015-08-12 03:41:27,171][INFO ][org.elasticsearch.river.mongodb.MongoDBRiver] MongoDB River Plugin - version[2.0.9] - hash[73ddea5] - time[2015-04-06T21:16:46Z]
[2015-08-12 03:41:27,182][INFO ][river.mongodb.util       ] setRiverStatus called with mongodb - RUNNING
[2015-08-12 03:41:27,185][INFO ][cluster.metadata         ] [Doctor Leery] [_river] update_mapping [mongodb] (dynamic)
[2015-08-12 03:41:27,190][INFO ][org.elasticsearch.river.mongodb.MongoDBRiver] River mongodb startup pending
[2015-08-12 03:41:27,217][INFO ][cluster.metadata         ] [Doctor Leery] [_river] update_mapping [mongodb] (dynamic)
[2015-08-12 03:41:27,220][INFO ][org.elasticsearch.river.mongodb.MongoDBRiver] Starting river mongodb
[2015-08-12 03:41:27,220][INFO ][org.elasticsearch.river.mongodb.MongoDBRiver] MongoDB options: secondaryreadpreference [false], drop_collection [false], include_collection [], throttlesize [5000], gridfs [false], filter [null], db [testmongo], collection [person], script [null], indexing to [mongoindex]/[person]
[2015-08-12 03:41:27,252][INFO ][cluster.metadata         ] [Doctor Leery] [mongoindex] creating index, cause [api], shards [5]/[1], mappings []
[2015-08-12 03:41:27,377][INFO ][river.mongodb            ] [Doctor Leery] Creating MongoClient for [[123.59.43.117:27017]]


themez avatar Aug 12 '15 03:08 themez

I figured out the problem: the MongoClient was not connecting correctly, because my replica set config looks like this:

{
    "_id" : "rs0",
    "version" : 2,
    "members" : [
        {
            "_id" : 0,
            "host" : "dev:27017",
            "arbiterOnly" : false,
            "buildIndexes" : true,
            "hidden" : false,
            "priority" : 1,
            "tags" : {

            },
            "slaveDelay" : 0,
            "votes" : 1
        }
    ],
    "settings" : {
        "chainingAllowed" : true,
        "heartbeatTimeoutSecs" : 10,
        "getLastErrorModes" : {

        },
        "getLastErrorDefaults" : {
            "w" : 1,
            "wtimeout" : 0
        }
    }
}

The hostname dev cannot be resolved by the Elasticsearch machine. I reconfigured the replica member host and now it works fine.
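In case it helps others check for the same thing, here is a rough sketch to run on the Elasticsearch machine. It assumes you have copied the replica set config (the output of rs.conf(), as above) into a Python dict. The function name `unresolvable_members` and the `resolver` parameter are hypothetical, introduced only for this example. It only tests name resolution, not whether the port is reachable.

```python
import socket


def unresolvable_members(rs_config, resolver=socket.getaddrinfo):
    """Return the member "host" strings that this machine cannot resolve.

    rs_config is a dict shaped like the replica set config shown above.
    resolver defaults to socket.getaddrinfo and can be swapped for testing.
    """
    failures = []
    for member in rs_config.get("members", []):
        entry = member["host"]
        # Split "name:port"; fall back to MongoDB's default port 27017.
        host, sep, port = entry.rpartition(":")
        if not sep:
            host, port = entry, "27017"
        try:
            resolver(host, int(port))
        except socket.gaierror:
            # Name resolution failed on this machine.
            failures.append(entry)
    return failures


# Example (run on the Elasticsearch machine):
#   cfg = {"_id": "rs0", "members": [{"_id": 0, "host": "dev:27017"}]}
#   print(unresolvable_members(cfg))
```

If the list that comes back is non-empty, those member hosts are names the Elasticsearch machine cannot look up. In my case the fix was to reconfigure the member host to something resolvable from that machine.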

@akluffy in your case 2, you installed Elasticsearch on a different machine, so maybe you have the same problem as I did?

themez avatar Aug 12 '15 07:08 themez

Yeah it should be the same problem


akluffy avatar Aug 12 '15 08:08 akluffy