logstash-input-mongodb

The first data is lost

Open StuCurry opened this issue 6 years ago • 2 comments

I have two documents in my MongoDB collection:

     _id (ObjectId)              time (String)    ip (String)
1    5c2492cb768d4d1d5c440e59    "16:52"          No field
2    5c2492d9768d4d1d5c440e5a    No field         "100.99.9.1"

Only the second document is read by the plugin:

[2019-01-09T10:38:01,459][INFO ][logstash.inputs.mongodb ] Registering MongoDB input
{
    "logdate" => "2018-12-27T08:52:41+00:00",
    "log_entry" => "{"_id"=>BSON::ObjectId('5c2492d9768d4d1d5c440e5a'), "ip"=>"100.99.9.1"}",
    "@version" => "1",
    "@timestamp" => 2019-01-09T02:38:03.720Z,
    "host" => "greenvm-l14185v1",
    "ip" => "100.99.9.1",
    "mongo_id" => "5c2492d9768d4d1d5c440e5a",
    "uid" => "5c2492d9768d4d1d5c440e5a"
}

My logstash.conf

input {
  mongodb {
    uri => 'mongodb://ip:27017/log'
    placeholder_db_dir => '/tmp/logstash-mongodb/'
    placeholder_db_name => 'logstash_sqlite.db'
    collection => 'audit_log'
  }
}

StuCurry • Jan 09 '19 02:01

That is correct. The code sorts the since column in ascending order and uses the first value as its placeholder. When it then does a fetch, it fetches everything strictly greater than the placeholder, so the first document is never fetched.
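The effect can be reproduced without MongoDB. The following is a minimal sketch, not the plugin's actual code: the helper names `initial_placeholder` and `fetch_since` are hypothetical, the `_id` values are the two from the report above, and ObjectIds are modelled as equal-length hex strings (which sort in the same order).

```ruby
# Pure-Ruby simulation; no MongoDB connection or mongo gem is needed.
# The two _id values are the ones from the original report.
DOCS = [
  { "_id" => "5c2492cb768d4d1d5c440e59", "time" => "16:52" },
  { "_id" => "5c2492d9768d4d1d5c440e5a", "ip" => "100.99.9.1" }
].freeze

# Hypothetical helper: sort ascending and take the first _id as the placeholder.
def initial_placeholder(docs)
  docs.map { |d| d["_id"] }.min
end

# Hypothetical helper standing in for a find with $gt (default) or $gte.
def fetch_since(docs, placeholder, inclusive: false)
  docs.select do |d|
    inclusive ? d["_id"] >= placeholder : d["_id"] > placeholder
  end
end

placeholder = initial_placeholder(DOCS)
gt_ids  = fetch_since(DOCS, placeholder).map { |d| d["_id"] }
gte_ids = fetch_since(DOCS, placeholder, inclusive: true).map { |d| d["_id"] }

puts "$gt  returns: #{gt_ids.inspect}"
puts "$gte returns: #{gte_ids.inspect}"
```

With the strict comparison only the second document comes back, because the first document's `_id` is the placeholder itself; the inclusive comparison returns both.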

See also this issue, which links to a PR that changes collection.find({:_id => {:$gt => last_id_object}}) to use $gte instead. That could result in duplicates, but it would be in the spirit of logstash's "at least once" delivery model.
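To illustrate the trade-off, here is a rough sketch (again a simulation, not the plugin's code) of two polling rounds with an inclusive `$gte`-style comparison. The lambda `fetch_gte` and the `seen` set are hypothetical; deduplicating on `_id` downstream is one possible way to absorb the repeated boundary document, not something the plugin is known to do.

```ruby
require "set"

# The two documents from the original report, with ObjectIds as hex strings.
docs = [
  { "_id" => "5c2492cb768d4d1d5c440e59", "time" => "16:52" },
  { "_id" => "5c2492d9768d4d1d5c440e5a", "ip" => "100.99.9.1" }
]

# Hypothetical stand-in for a $gte query, returning documents in ascending _id order.
fetch_gte = lambda do |placeholder|
  docs.select { |d| d["_id"] >= placeholder }.sort_by { |d| d["_id"] }
end

placeholder = docs.map { |d| d["_id"] }.min
seen      = Set.new # ids already emitted (assumed downstream dedup state)
delivered = []      # every id the input would emit, duplicates included
unique    = []      # ids left after deduplicating on _id

2.times do
  batch = fetch_gte.call(placeholder)
  batch.each do |doc|
    delivered << doc["_id"]
    # Set#add? returns nil when the id was already present.
    unique << doc["_id"] if seen.add?(doc["_id"])
  end
  # Advance the placeholder to the last id fetched in this round.
  placeholder = batch.last["_id"] unless batch.empty?
end

puts "delivered: #{delivered.inspect}"
puts "unique:    #{unique.inspect}"
```

In this simulation the second round re-fetches the document whose `_id` equals the placeholder, so it is delivered twice, while no document is lost. Keying on `_id` removes the repeat.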

TheVastyDeep • Jun 03 '20 19:06

I have the same problem

binsonHao • Jan 27 '24 06:01