logstash-input-mongodb
The first document is lost
I have two documents in my MongoDB collection:
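| # | _id (ObjectId)           | time (String) | ip (String)  |
|---|--------------------------|---------------|--------------|
| 1 | 5c2492cb768d4d1d5c440e59 | "16:52"       | No field     |
| 2 | 5c2492d9768d4d1d5c440e5a | No field      | "100.99.9.1" |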
Only the second document is read by the plugin:
[2019-01-09T10:38:01,459][INFO ][logstash.inputs.mongodb ] Registering MongoDB input
{
       "logdate" => "2018-12-27T08:52:41+00:00",
     "log_entry" => "{"_id"=>BSON::ObjectId('5c2492d9768d4d1d5c440e5a'), "ip"=>"100.99.9.1"}",
      "@version" => "1",
    "@timestamp" => 2019-01-09T02:38:03.720Z,
          "host" => "greenvm-l14185v1",
            "ip" => "100.99.9.1",
      "mongo_id" => "5c2492d9768d4d1d5c440e5a",
           "uid" => "5c2492d9768d4d1d5c440e5a"
}
My logstash.conf:
input {
  mongodb {
    uri => 'mongodb://ip:27017/log'
    placeholder_db_dir => '/tmp/logstash-mongodb/'
    placeholder_db_name => 'logstash_sqlite.db'
    collection => 'audit_log'
  }
}
That is correct. The code sorts on the since column in ascending order and uses the first value as its initial placeholder. When it then fetches, it asks for everything strictly greater than the placeholder, so the document that supplied the placeholder (the first one) is never fetched.
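To make that concrete, here is a minimal plain-Ruby sketch of the behaviour, using the two _id values from the report. It is not the plugin's code: DOCS, initial_placeholder and fetch_after are hypothetical names, and comparing the hex strings stands in for comparing real ObjectIds.

```ruby
# Simulation only: no MongoDB driver involved, all names are hypothetical.
DOCS = [
  { _id: "5c2492cb768d4d1d5c440e59", time: "16:52" },
  { _id: "5c2492d9768d4d1d5c440e5a", ip: "100.99.9.1" }
].freeze

# The placeholder starts at the smallest value of the since column (_id here).
def initial_placeholder(docs)
  docs.map { |d| d[:_id] }.min
end

# Stand-in for find({:_id => {:$gt => placeholder}}): strictly greater than.
def fetch_after(docs, placeholder)
  docs.select { |d| d[:_id] > placeholder }.sort_by { |d| d[:_id] }
end

placeholder = initial_placeholder(DOCS)
fetched = fetch_after(DOCS, placeholder)

puts "placeholder: #{placeholder}"
puts "fetched:     #{fetched.map { |d| d[:_id] }.inspect}"
```

The first document becomes the placeholder and is excluded by the strict comparison, which matches the output in the report.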
See also this issue, which links to a PR that changes
collection.find({:_id => {:$gt => last_id_object}})
to use $gte. That could produce duplicates, but it would be in the spirit of Logstash's "at least once" delivery model.
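A rough sketch of that trade-off, again as a plain-Ruby simulation rather than the plugin's or the PR's actual code. It assumes the placeholder advances to the last _id of each batch; poll and run are hypothetical helpers.

```ruby
# Simulation only: compares exclusive ($gt) and inclusive ($gte) polling.
IDS = %w[5c2492cb768d4d1d5c440e59 5c2492d9768d4d1d5c440e5a].freeze

# One polling cycle: returns the matching ids and the next placeholder.
# Assumption: the placeholder moves to the last id seen in the batch.
def poll(ids, placeholder, inclusive:)
  batch = ids.select { |id| inclusive ? id >= placeholder : id > placeholder }.sort
  [batch, batch.last || placeholder]
end

# Run several cycles against an unchanged collection and collect every emitted id.
def run(ids, inclusive:, cycles: 2)
  placeholder = ids.min # initial placeholder: first value in ascending order
  emitted = []
  cycles.times do
    batch, placeholder = poll(ids, placeholder, inclusive: inclusive)
    emitted.concat(batch)
  end
  emitted
end

gt_events  = run(IDS, inclusive: false)
gte_events = run(IDS, inclusive: true)

puts "$gt  emitted: #{gt_events.inspect}"
puts "$gte emitted: #{gte_events.inspect}"
```

Under these assumptions, $gt never emits the first document, while $gte emits both but re-emits the most recent one on the next cycle. Downstream deduplication (for example, using the mongo_id field as the document id in the output) would be one way to absorb the repeats.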
I have the same problem