logstash-input-cloudwatch-logs

Clarify documentation on start_position / sincedb

Open danielmcquillen opened this issue 8 years ago • 7 comments

I'm a bit confused by the documentation on how to use start_position.

If I want to re-run Logstash and pick up every event from the beginning of my log group, it looks like I need to set start_position to "beginning".

However, that value is marked as the default, which is confusing to me. Wouldn't that mean every time I run Logstash (without defining any value for that field) it would start from the first entry in my log group?

Wouldn't it be better to say the default for start_position is blank/none, meaning it will just pick up on the last date saved in sincedb?

I've just started using this (awesome) plugin so I could be way off in my understanding. Thanks for clarifying.

danielmcquillen avatar Oct 24 '17 22:10 danielmcquillen

Hi @danielmcquillen - yes, perhaps clearer documentation is in order.

As it currently stands, start_position is only respected if there is no entry for a given log group in your sincedb.

This allows you to specify how "newly seen" log groups will begin ingesting (from the start, from the end, or via a step-back from the current time).

If you do not specify a start_position, the default for a newly seen log group is to start from the first entry in the log group and work forward. This updates your sincedb as it ingests, and subsequent start-ups of ingestion on this log group resume from the position stored in the sincedb, ignoring the start_position parameter.
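As a minimal sketch of the above, assuming a hypothetical log group named /example/app (the name is illustrative only), an input block relying on this behaviour might look like:

```
input {
  cloudwatch_logs {
    # Hypothetical log group name, for illustration only
    log_group => [ "/example/app" ]

    # Only consulted when the sincedb has no entry for this log group.
    # "beginning" is the default; "end" or an integer step-back from the
    # current time are the alternatives mentioned above.
    start_position => "beginning"
  }
}
```

Once the sincedb holds an entry for the log group, changing start_position here should have no effect on that group.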

Does this clear things up?

lukewaite avatar Oct 31 '17 01:10 lukewaite

Hi @lukewaite. Ok cool. So if at any time I want to completely reload everything, I need to remove the sincedb. Is that right?

danielmcquillen avatar Oct 31 '17 03:10 danielmcquillen

Yep. That’s correct.
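One way to make that reset easier, sketched here under assumptions: setting sincedb_path explicitly means you know which file to remove before restarting Logstash. The path and log group name below are hypothetical, not the plugin's defaults.

```
input {
  cloudwatch_logs {
    # Hypothetical log group name, for illustration only
    log_group => [ "/example/app" ]

    # Hypothetical location; remove this file (with Logstash stopped)
    # to make the plugin treat the log group as newly seen
    sincedb_path => "/var/lib/logstash/cloudwatch_logs.sincedb"

    # Applies again on the next start, once the sincedb entry is gone
    start_position => "beginning"
  }
}
```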

lukewaite avatar Oct 31 '17 10:10 lukewaite

Does the sincedb path need to be created before the first run? At the moment I'm using the default, but I'm not seeing any logs, and the .sincedb file isn't being created.

mikejr83 avatar May 30 '18 18:05 mikejr83

Hello @lukewaite

When I set start_position => 86400, it does not start from one day in the past.

How can I request only today's data and onwards from CloudWatch as input to Logstash?

Thanks in advance.

It is very important for me to set this up, as the old data is too big to move and we don't need it anyway.

kabinfh avatar Jul 24 '18 14:07 kabinfh

Is there any solution for this? I have the same question as @kabinfh: start_position does not seem to work.

william-amaral avatar May 26 '20 17:05 william-amaral

@lukewaite Thanks for this awesome plugin. I have the same question as @kabinfh.

The Logstash pipeline seems to be pulling much older logs even though start_position => 86400 is set.

Is there any workaround for this?

Thanks in advance!

jeyakumar8 avatar Nov 14 '22 12:11 jeyakumar8