fluent-plugin-s3
S3 cost optimisation: Remember the last index value that was used
The problem
Suppose you use the defaults:

- `s3_object_key_format` of `%{path}%{time_slice}_%{index}.%{file_extension}`
- `time_slice_format` of `%Y%m%d%H`
And suppose you flush every 30 seconds. So 120 files per hour.
The first flush will check whether foo-2016010109_1.gz exists via an S3 HEAD request, see that it doesn't, and upload to that filename.
The next flush will first check whether foo-2016010109_1.gz exists via an S3 HEAD request, see that it does, increment the index to foo-2016010109_2.gz, check whether that exists via another S3 HEAD request, see that it doesn't, and upload to that filename.
This continues. By the time we reach the final file of the hour (the 120th), we first make 119 HEAD requests against existing files!
That's 1 + 2 + ... + 119 = 7,140 S3 requests over the hour. And that's per input log file, per instance.
S3 HEAD requests cost "$0.004 per 10,000 requests". So the monthly cost of the above, for 5 log files on 100 instances, amounts to 7,140 × 5 × 100 × 24 × 30 × $0.004 / 10,000 ≈ $1,028.
More generally, 1 + 2 + ... + n is O(n^2), and we can reduce this to O(n).
Solutions
(a) The user can modify `time_slice_format` to include `%M`. Or the default could include `%M`.
(b) fluent-plugin-s3 could remember the last index it uploaded to, and so not have to check whether the n-1 earlier files already exist: fluentd would know they do.
If either solution were implemented, we'd reduce the number of HEAD requests from O(n^2) to O(n). (Technically (a) doesn't change the O(n^2) complexity; it just makes n tiny.)
So rather than 7,140 S3 requests per hour per log file per instance, we'd do only 120.
This reduces the monthly cost from ~$1,028 to ~$17.
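The arithmetic can be checked with a short script. All figures are the assumptions stated above: 120 flushes per hour, 5 log files, 100 instances, a 30-day month, and S3's $0.004 per 10,000 HEAD requests.

```ruby
# Sketch of the cost arithmetic above; all figures are the thread's assumptions.
HEAD_PRICE = 0.004 / 10_000          # dollars per HEAD request
FLUSHES_PER_HOUR = 120               # one flush every 30 seconds
FILES, INSTANCES = 5, 100
HOURS_PER_MONTH = 24 * 30

# Current behaviour: the k-th flush probes the earlier indexes first.
quadratic_heads = (1...FLUSHES_PER_HOUR).sum   # 1 + 2 + ... + 119 = 7140 per hour
# With solution (a) or (b): one existence check per flush.
linear_heads = FLUSHES_PER_HOUR                # 120 per hour

scale = FILES * INSTANCES * HOURS_PER_MONTH
puts format('O(n^2): $%.2f/month', quadratic_heads * scale * HEAD_PRICE)  # ~$1028
puts format('O(n):   $%.2f/month', linear_heads * scale * HEAD_PRICE)     # ~$17
```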
The issue with (b) is that if you have multiple instances writing to the same path, you'd still need the collision check.
Oops. I updated (b) to clarify that a HEAD request would still occur for the index we're trying; we just wouldn't need to recheck the n-1 earlier indexes.
If we're up to index n, we needn't check 1, 2, ..., n-1. But fluent-plugin-s3 seems to: see https://github.com/fluent/fluent-plugin-s3/blob/master/lib/fluent/plugin/out_s3.rb#L190 and line 219.
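To make (b) concrete, here is a hypothetical Ruby sketch, not the plugin's actual code: probing starts from a remembered last index instead of from 1, so normally only one existence check is needed per upload. `object_exists` stands in for the S3 HEAD request.

```ruby
# Hypothetical sketch of solution (b). `object_exists` is a stand-in for
# the S3 HEAD request the plugin issues per candidate key.
def next_key(prefix, last_index, object_exists)
  i = last_index + 1                                    # skip 1..last_index entirely
  i += 1 while object_exists.call("#{prefix}_#{i}.gz")  # usually false on first try
  ["#{prefix}_#{i}.gz", i]                              # remember i for the next flush
end
```

Note that with multiple instances writing to the same path, the loop still handles collisions; the cached index only changes where probing starts.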
I should also say it was @cnorthwood who spotted this issue. :)
For less sensationalism: a perhaps more common case of 20 instances uploading 5 files every 5 minutes costs $2.20/month with the O(n^2) behaviour and would cost $0.34 with the O(n).
A hard point of (b) is that the s3 plugin has `num_threads`.
If `num_threads` is larger than 1, the "last index" is ambiguous across flushing threads.
Hmm...
> (a) The user can modify `time_slice_format` to include `%M`. Or the default could include `%M`.
If the default included `%M`, the flush cycle would be shorter. Most users probably don't expect that, because it puts lots of small files on S3.
That is inefficient for several processing engines, e.g. Hadoop, or for importing files into a DWH.
Some users add `%{hex_random}` to `s3_object_key_format`. This reduces requests and improves performance.
For reducing HEAD cost, one way is adding a `skip_existence_check`-like option together with `${hex_random}` / `${uuid_flush}`.
But I'm not sure whether this is good or not...
We also encountered the same problem. The main reason for us to use the s3 plugin was to reduce costs for storing huge log files. We use fluentd+s3 alongside a regular ELK-like solution. But now we are actually paying more for S3 than we would pay to expand our main log storage.
We ran into this issue as well. There have been months where we received literally billions of HEAD requests. It's hard to say exactly how much it has cost us, but based on our historical usage I'd guess around 5,000 dollars†. That's right, Jeff Bezos has gotten so much money off this bug he could buy a used 2005 Mazda Mazda3:
(Just kidding, but the point is we should probably warn newcomers or fix the bug, unless we want to buy Bezos another car :P )
> Hard point of (b) is s3 plugin has `num_threads`.
You could create a variable to store the last log name and lock it behind a mutex so threads can set & access it safely. See #3 in https://blog.arkency.com/3-ways-to-make-your-ruby-object-thread-safe/
I'm tempted to try it out but I don't know ruby... hmm.
There's also a slight problem with the approach: at the start of the program you would still have to go through all the prefixed S3 items to find the last index. If td-agent is restarted frequently, or if there is a crazy amount of logs in the bucket, this could also lead to high costs.
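The mutex idea could be sketched like this (a hypothetical snippet, not the plugin's API): a per-path cache of the last index, where only the tiny read-and-increment step is locked, so uploads themselves are not serialized.

```ruby
# Hypothetical sketch: a thread-safe cache of the last index per path.
# Only the increment runs inside the lock, so a slow upload on one
# thread doesn't block other flush threads for its whole duration.
class LastIndexCache
  def initialize
    @mutex = Mutex.new
    @last = Hash.new(0)     # path prefix => last index handed out
  end

  # Atomically claim the next index for a prefix, so concurrent
  # flush threads never receive the same index twice.
  def claim(prefix)
    @mutex.synchronize { @last[prefix] += 1 }
  end
end
```

Usage would be e.g. `cache.claim("foo-2016010109")`, returning 1, then 2, and so on.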
I think it would be better to make `${hex_random}` and `check_object true` the default. That would require no complicated code changes and would avoid the problem entirely. The downside is that it becomes slightly harder for a user to find the latest log in S3: they would have to look through the S3 last-modified timestamps of all logs in the latest prefixed hour/hour/day/month. For most users this is probably not a big deal. The other downside is that this would require a new major release version. I still think it's worth it, assuming no other downsides. Any objections to this approach?
As a temporary workaround we have simplified our td-agent config and upgraded to td-agent v4. I'm not even sure how we ran into this problem - the config is set to flush only once every 24 hours, but somehow we flushed about once a minute. The really odd thing is the rapid flushing happened in a periodic cycle, where it would continue for weeks and then pause for long time before starting back up again. I'll be monitoring the # of requests to see if our flushing problem (and thus this problem) is solved for us.
† This is a very rough guess, I won't know for sure until I see how much costs come down by.
> You could create a variable to store the last log name and lock it behind a mutex so threads can set & access it safely.
I think this has a slow-upload issue with multiple threads. If a previous upload takes 20 seconds, other threads wait 20 seconds or more. That easily causes buffer overflow.
Older versions of the s3 plugin don't have the `check_object` parameter, so some old comments are outdated.
I think a documentation update is needed for this case: add an "S3 cost optimization" section and mention that `check_object` / a random placeholder is better for high-volume environments.
@caleb15 Apart from the warning added - is there a recommended solution to avoid the index checking, which in turn will avoid the HEAD requests? We are using this `s3_object_key_format`: `%{path}%{hostname}-%{time_slice}_%{index}.%{file_extension}`. We are seeing a high number of HEAD requests. Does https://github.com/fluent/fluent-plugin-s3/issues/326 locally cache the last index for the file between threads to avoid the issue? Thanks!
To avoid the HEAD request call, set `check_object false`.
In addition, you need to change `s3_object_key_format` to use `%{uuid_flush}` or `${chunk_id}` for a unique path.
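Putting that advice together, a minimal sketch of the match section might look like this. The tag pattern, bucket name, path, and the exact format string are placeholders, not a verified recommendation:

```
<match mylogs.**>
  @type s3
  # placeholders: your bucket and path
  s3_bucket my-bucket
  path logs/
  # skip the HEAD-request existence check entirely
  check_object false
  # %{uuid_flush} keeps keys unique without probing an index
  s3_object_key_format %{path}%{time_slice}_%{uuid_flush}.%{file_extension}
  time_slice_format %Y%m%d%H
</match>
```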
Thanks @repeatedly. What is the implication of avoiding the HEAD request call? Would it overwrite an existing file, or know from a cache what index it needs to write to? Guessing %{uuid_flush} is the id that is then picked up from the cache?
@formanojhr uuid = universally unique id. Because the id is totally unique overwriting becomes virtually impossible.
https://stackoverflow.com/questions/1155008/how-unique-is-uuid
@repeatedly I think since you are the dev on this, you might be able to answer. Once I add `check_object false` and set `s3_object_key_format %{path}%{hostname}-%{uuid_flush}-%{time_slice}_%{index}.%{file_extension}`, what we notice is that the HEAD requests really do drop to zero, but the PUT requests also go down to zero, even though we do see the logs being written. Any idea if a different protocol is being used to write?
This issue has been automatically marked as stale because it has been open 90 days with no activity. Remove stale label or comment or this issue will be closed in 30 days
@cosmo0920 can you remove the stale lifecycle and add a bug label please. This is a rare but serious issue.
I agree this seems to be a bug. I have also experienced this default behavior; we only found out about it after a $20K bill at the end of the month due to the billions of HEAD requests. I switched to `check_object false` and used a combo of UUID and other variables to ensure uniqueness and avoid conflicts - of course, the downside is the challenge of looking up objects sequentially. I do believe this needs to be treated with higher urgency, or at least the default behavior should be switched so it is not making billions of HEAD calls. Some may argue this could be a feature, with "suffix" as a variable with options and the pros and cons of the options documented.