logstash-input-mongodb
Losing exactly one document when importing to Elasticsearch
My Logstash config:
input {
  mongodb {
    uri => 'mongodb://127.0.0.1:27017/wiki'
    collection => 'wiki'
    placeholder_db_dir => "E:/mongo2es/data"
    placeholder_db_name => "datalogstash_sqlite_wiki.db"
    batch_size => 1000
  }
}
filter {
  mutate {
    remove_field => [ "host", "@version", "@timestamp", "logdate", "log_entry" ]
  }
}
output {
  stdout { codec => rubydebug }
  file {
    path => "E:/mongo2es/logs/mongo2es-wiki.log"
  }
  elasticsearch {
    index => "wiki"
    document_type => "wiki"
    document_id => "%{mongo_id}"
    hosts => ["127.0.0.1:9200"]
  }
}
There are 2767278 documents in MongoDB, but only 2767277 in Elasticsearch.
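This is how I cross-check the two counts, a minimal Python sketch assuming the local wiki database, collection, and index from the config above (pymongo and the elasticsearch client are separate installs):

from pymongo import MongoClient
from elasticsearch import Elasticsearch

mongo = MongoClient("mongodb://127.0.0.1:27017")
es = Elasticsearch(["http://127.0.0.1:9200"])

# Documents in the source collection vs. documents that actually made it into the index.
mongo_count = mongo["wiki"]["wiki"].count_documents({})
es_count = es.count(index="wiki")["count"]

print("mongodb:", mongo_count, "elasticsearch:", es_count, "missing:", mongo_count - es_count)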
Any thoughts? Thanks in advance.
It could be that Elasticsearch cannot write one document because of a conflicting payload. For example, if the first document was stored with the payload {"body": null}, then when Logstash tries to index a document with {"body": {"text": "some text"}}, it will fail.
Elasticsearch usually writes errors like this to its logs.
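One rough way to check for that (not part of the plugin, and exact argument names vary between elasticsearch-py versions): look at how the field got mapped and try indexing a conflicting document by hand; a real mapping conflict comes back as a mapper_parsing_exception instead of being silently dropped.

from elasticsearch import Elasticsearch

es = Elasticsearch(["http://127.0.0.1:9200"])

# What type did Elasticsearch infer for the field from the first document that defined it?
print(es.indices.get_mapping(index="wiki"))

# Indexing a document whose "body" shape conflicts with that mapping is rejected,
# and the same error text shows up in the Elasticsearch logs.
try:
    es.index(index="wiki", id="mapping-test", body={"body": {"text": "some text"}})
except Exception as err:
    print(err)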
I exported the MongoDB data to a JSON file, read it line by line, and stored each JSON object into Elasticsearch. Done this way, there is no problem.
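Roughly like this, a minimal sketch assuming a JSON-lines export (e.g. the output of mongoexport) in a hypothetical wiki.json, writing to the wiki index from the config above:

import json
from elasticsearch import Elasticsearch, helpers

es = Elasticsearch(["http://127.0.0.1:9200"])

def actions(path):
    # One JSON document per line, as produced by mongoexport.
    with open(path, encoding="utf-8") as f:
        for line in f:
            doc = json.loads(line)
            doc_id = doc.pop("_id", None)
            if isinstance(doc_id, dict):   # mongoexport renders ObjectIds as {"$oid": "..."}
                doc_id = doc_id.get("$oid")
            yield {"_index": "wiki", "_id": doc_id, "_source": doc}

# raise_on_error=False keeps going and returns per-document errors instead of aborting.
ok, errors = helpers.bulk(es, actions("wiki.json"), raise_on_error=False)
print(ok, errors)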
So I tried to import 190968 documents and one was lost in Elasticsearch (190967).
Hi, I figured out why it loses one document; have a look at pull request #41: https://github.com/phutchins/logstash-input-mongodb/pull/41/commits/6998388caf20c53748dcdfc55e5798b8d90bc56e#diff-b50cbd06ed9aac325fc5552aa327afbbR138
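If I read that commit right, the plugin seeds its sqlite placeholder with the _id of a document that already exists and then polls MongoDB with a {"_id" => {"$gt" => <placeholder>}} filter (you can see that filter in the debug output further down), so the seed document itself never matches the query. A small pymongo sketch of the effect, assuming the wiki collection from the first config:

from pymongo import MongoClient

coll = MongoClient("mongodb://127.0.0.1:27017")["wiki"]["wiki"]

# Take the oldest document's _id as the resume point, the way the placeholder db does.
seed_id = coll.find_one(sort=[("_id", 1)])["_id"]

# A $gt filter on that _id fetches everything except the seed document itself.
fetched = coll.count_documents({"_id": {"$gt": seed_id}})
total = coll.count_documents({})
print(total - fetched)   # 1 -- the one document that never reaches Elasticsearch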
Hi, @bogdangi @shi-yuan
This is my configuration. I have only one document in the collection, but Logstash does not exit the loop once the documents are read. Any suggestions?
input {
  mongodb {
    uri => "mongodb://localhost:27017/logtry?ssl=false"
    placeholder_db_dir => "d:/elk"
    placeholder_db_name => "logstash_mo.db"
    collection => "sample"
    batch_size => 0
  }
}
output {
  stdout { codec => json }
}
D, [2016-07-03T17:50:46.408000 #8236] DEBUG -- : MONGODB | localhost:27017 | logtry.find | SUCCEEDED | 0.034s
D, [2016-07-03T17:50:46.487000 #8236] DEBUG -- : MONGODB | localhost:27017 | logtry.listCollections | STARTED | {"listCollections"=>1, "cursor"=>{}, "filter"=>{ :name=>{"$not"=>/system.|$/}}}
D, [2016-07-03T17:50:46.500000 #8236] DEBUG -- : MONGODB | localhost:27017 | logtry.listCollections | SUCCEEDED | 0.008s
D, [2016-07-03T17:50:47.810000 #8236] DEBUG -- : MONGODB | localhost:27017 | logtry.find | STARTED | {"find"=>"sample", "filter"=>{"_id"=>{"$gt"=>BSON::ObjectId('577902d7beb1f37c22e1f458')}}, "limit"=>0}
D, [2016-07-03T17:50:47.821000 #8236] DEBUG -- : MONGODB | localhost:27017 | logtry.find | SUCCEEDED | 0.004s
D, [2016-07-03T17:50:47.968000 #8236] DEBUG -- : MONGODB | localhost:27017 | logtry.listCollections | STARTED | {"listCollections"=>1, "cursor"=>{}, "filter"=>{ :name=>{"$not"=>/system.|$/}}}
D, [2016-07-03T17:50:47.983000 #8236] DEBUG -- : MONGODB | localhost:27017 | logtry.listCollections | SUCCEEDED | 0.009s
D, [2016-07-03T17:50:50.600000 #8236] DEBUG -- : MONGODB | localhost:27017 | logtry.find | STARTED | {"find"=>"sample", "filter"=>{"_id"=>{"$gt"=>BSON::ObjectId('577902d7beb1f37c22e1f458')}}, "limit"=>0}
D, [2016-07-03T17:50:50.612000 #8236] DEBUG -- : MONGODB | localhost:27017 | logtry.find | SUCCEEDED | 0.006s
I guess the answer is batch_size => 0.
I am having the same problem.
I have 4 documents inside the "merge" collection in my MongoDB. When I run Logstash, Elasticsearch ends up with 3 documents. I did some tests and noticed that the lost document is always the first document in my MongoDB collection. If I try with a collection that only has a single document, then nothing is loaded into Elasticsearch.
My Elasticsearch index is new and empty, and this is my configuration file:
input {
  mongodb {
    uri => 'mongodb://--------hidden------/ch-db'
    placeholder_db_dir => 'C:/temp/'
    placeholder_db_name => 'logstash_sqlite.db'
    collection => 'merge'
  }
}
output {
  elasticsearch {
    index => "poll"
    hosts => "localhost:9200"
  }
}
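To confirm it really is the first document that goes missing, here is a small check (a sketch with placeholder connection details, since the real URI is hidden above; judging from the document_id => "%{mongo_id}" setting in the first config, the plugin copies the Mongo _id into a mongo_id field on each event):

from pymongo import MongoClient
from elasticsearch import Elasticsearch

# Placeholder connection string -- substitute the real (hidden) MongoDB URI.
coll = MongoClient("mongodb://<your-mongo-host>:27017")["ch-db"]["merge"]
es = Elasticsearch(["http://localhost:9200"])

# _id of the first (oldest) document in the collection, as a hex string.
oldest_id = str(coll.find_one(sort=[("_id", 1)])["_id"])

# Look for that _id in the mongo_id field of the indexed documents.
hits = es.search(index="poll", body={"query": {"match": {"mongo_id": oldest_id}}})["hits"]["hits"]
print("first document present in elasticsearch:", len(hits) > 0)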