serilog-sinks-azureblobstorage
File is created multiple times per hour
Hello,
When configuring the sink to write to blob storage like below:

"Name": "AzureBlobStorage",
"Args": {
    "connectionString": "[connection here]",
    "storageContainerName": "blobContainer",
    "storageFileName": "logs/{yyyy}{MM}/{dd}{HH}bloblogs.txt",
    "outputTemplate": "{Timestamp:o} [{Level:u3}] ({Application}/{MachineName}/{ThreadId}) {Message}{NewLine}{Exception}"
}
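For context, here is a minimal sketch of how a "Name"/"Args" entry like this is typically consumed, assuming it sits under the standard "Serilog:WriteTo" section of an appsettings.json and that the Serilog.Settings.Configuration package is used; the file and section names here are assumptions, not taken from the report above.

// Minimal sketch (assumed setup): load the sink configuration from appsettings.json via
// Serilog.Settings.Configuration. Requires the Serilog, Serilog.Settings.Configuration,
// Microsoft.Extensions.Configuration.Json and Serilog.Sinks.AzureBlobStorage packages.
using Microsoft.Extensions.Configuration;
using Serilog;

var configuration = new ConfigurationBuilder()
    .AddJsonFile("appsettings.json")       // assumed file name; the "WriteTo" entry above would live under its "Serilog" section
    .Build();

Log.Logger = new LoggerConfiguration()
    .ReadFrom.Configuration(configuration) // resolves "AzureBlobStorage" to the sink by its Name
    .CreateLogger();

Log.Information("Blob storage sink configured");
Log.CloseAndFlush();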
I'm seeing that it is creating multiple log files per hour, named like the examples below.
logs/202006/1000bloblogs.txt
logs/202006/1000bloblogs-001.txt
logs/202006/1000bloblogs-002.txt
logs/202006/1000bloblogs-500.txt
It is creating new files within the same hour, but not because of size, since the sizes vary and it is creating more than 500 files per hour.
Any idea why it is behaving like this?
Can you give this a try in 2.0.2? There was a pull request included in that release that I think may improve the naming of files based on date/hour values. See the note in the readme on the project homepage.
@chriswill
We are using version 3.0.3 and this still appears to be an issue.
Do you have any insight into why this might be happening? If not, do you have any suggestion on where in the code I might look (generally)?
@cw-andrews By any chance, are you doing a new deployment or making any changes to your app at this time?
I've seen in the Serilog File provider that it creates a new file like this whenever I deploy a new release, even if the original file was still small and should not have rolled.
@chriswill I am trying to cross-check that with my team, but the files from the example I shared on Wednesday all seem to show the same last modified date.
Everything we have that uses your package is an Azure Function running on an App Service plan. Maybe that has something to do with it?
@cw-andrews I created a dotnet-isolated function app and tested it both locally and deployed to Azure. I'm not seeing any issues like the ones you are describing.
Is your function app an isolated app or the original in-process flavor, and which version of .NET are you using?
@esgallardo I also tested with your file name and am not seeing files created multiple times.
Hi there,
I'm facing the same issue: sometimes, several log files get created in Azure Storage with sizes way below the configured size limit and almost at the same time.
After analyzing some of my log files, and based on what other people are experiencing, I took some time to check out the code.
My findings are the following:
Usually, when several log files are created at the same time, the first one HAS reached the size limit (the blobSizeLimitBytes setting), and the files that follow it (could be 1, 2, or more) are created with just a few bytes, except (maybe) for the last one.
For example, I recovered these files (please be aware that the Last Modified dates are in d/m/yyyy format). Note the size of the -001 file:
20230302.txt: the first log file of the day; it reached the configured size limit (2 MB).
20230302-001.txt: this file was created at almost the exact time the first one was flushed.
20230302-002.txt: this one was also created at around the same time, but contains further events.
Interesting facts:
- 20230302.txt: the last event in this file is this (the content is not relevant, only the time and thread ID): 2023-03-02 14:42:51.407 TID 0243 [INF] Route matched with...
- 20230302-001.txt: contains just one event: 2023-03-02 14:42:51.428 TID 0243 [INF] Entity Framework Core 6.0.7 initialized ...
- 20230302-002.txt: contains several events, but only the first one is relevant: 2023-03-02 14:42:51.467 TID 0173 [INF] Executing endpoint
From what I could deduce, the -001 and -002 files were created almost at the same time the original log was being flushed, because of a condition that is not checked in a multithreaded context:
- the first file had almost reached its limit
- the logger received events from different threads at almost the same time and considered it time to roll a new file
- the logger (on one thread) created a new file when saving the events from thread ID 243
- the logger (on another, parallel thread) created another new file when saving the events from thread ID 173
- the condition for creating a new file was finally reset and logging continued normally
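To make the suspected sequence concrete, here is a minimal, hypothetical sketch of an unsynchronized check-then-roll. It is not the sink's actual code; the class, fields and method names are invented for illustration only.

// Hypothetical illustration of the suspected race; nothing here is taken from the sink's source.
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

class RollingSketch
{
    private string currentBlobName = "20230302.txt";
    private int rollSequence = 0;
    private readonly ConcurrentBag<string> createdBlobs = new ConcurrentBag<string>();

    // Check-then-act with no shared gate: every caller that sees a "full" blob rolls,
    // so two concurrent callers create two new files instead of one.
    private async Task WriteAsync(string logEvent, bool blobLooksFull)
    {
        if (blobLooksFull)                                  // check (e.g. after fetching blob properties)
        {
            await Task.Delay(10);                           // simulate the properties round trip
            int next = Interlocked.Increment(ref rollSequence);
            currentBlobName = $"20230302-{next:000}.txt";   // act: roll to a new file
            createdBlobs.Add(currentBlobName);
        }
        // the event would be appended to currentBlobName here
    }

    static async Task Main()
    {
        var sketch = new RollingSketch();
        // Two threads (compare TID 0243 and TID 0173 in the events above) see the full blob at once.
        await Task.WhenAll(
            sketch.WriteAsync("Entity Framework Core initialized", blobLooksFull: true),
            sketch.WriteAsync("Executing endpoint", blobLooksFull: true));
        Console.WriteLine(string.Join(", ", sketch.createdBlobs)); // prints both -001 and -002
    }
}

With no coordination between the two callers, both of them roll, which matches the nearly identical creation times of the -001 and -002 files above.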
I think the problem might be in DefaultCloudBlobProvider.cs, in the GetCloudBlobAsync method. I added a hypothetical "lock" to the example below to avoid concurrent threads getting false from ValidateBlobProperties:
public async Task<AppendBlobClient> GetCloudBlobAsync(BlobServiceClient blobServiceClient, string blobContainerName, string blobName, bool bypassBlobCreationValidation, long? blobSizeLimitBytes = null)
{
    // Check if the current known blob is the targeted blob
    if (currentAppendBlobClient != null && currentBlobName.Equals(blobName, StringComparison.OrdinalIgnoreCase))
    {
        // Before validating, first fetch the attributes for the current file size
        Response<BlobProperties> propertiesResponse = await currentAppendBlobClient.GetPropertiesAsync().ConfigureAwait(false);
        BlobProperties properties = propertiesResponse.Value;

        // Check if the current blob is within the block count and file size limits
        lock (some_shared_lock_object) // <-- this lock could avoid multiple file creations in parallel
        {
            if (ValidateBlobProperties(properties, blobSizeLimitBytes))
            {
                return currentAppendBlobClient;
            }
        }

        // The blob is correct but needs to be rolled over
        currentBlobRollSequence++;
        await GetAppendBlobClientAsync(blobServiceClient, blobContainerName, blobName, bypassBlobCreationValidation);
    }
    else
    {
        // First time getting a cloud blob, or the blob name has changed
        currentBlobRollSequence = 0;
        await GetAppendBlobClientAsync(blobServiceClient, blobContainerName, blobName, bypassBlobCreationValidation, blobSizeLimitBytes);
    }

    return currentAppendBlobClient;
}
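One caveat with a plain lock: C# does not allow await inside a lock block, and the roll itself (GetAppendBlobClientAsync) is awaited, so if the whole check-and-roll section needs to be serialized, an async-friendly primitive such as SemaphoreSlim could be used instead. Below is a minimal sketch of that idea, under the same assumption that a missing gate is the cause; the names (rollGate, RollIfNeededAsync) are invented for illustration and are not part of the sink.

// Hypothetical sketch of serializing the size check and the roll behind one async gate;
// all names are invented for illustration and are not the sink's API.
using System;
using System.Threading;
using System.Threading.Tasks;

class RollGateSketch
{
    private readonly SemaphoreSlim rollGate = new SemaphoreSlim(1, 1);
    private string currentBlobName = "20230302.txt";
    private int rollSequence = 0;

    public async Task<string> RollIfNeededAsync(Func<string, Task<long>> getBlobSizeAsync, long sizeLimitBytes)
    {
        await rollGate.WaitAsync().ConfigureAwait(false);
        try
        {
            // Re-check the current blob's size inside the gate: only the first caller that
            // finds it full rolls; later callers see the fresh, still-small blob and reuse it.
            long size = await getBlobSizeAsync(currentBlobName).ConfigureAwait(false);
            if (size >= sizeLimitBytes)
            {
                rollSequence++;
                currentBlobName = $"20230302-{rollSequence:000}.txt";
            }
            return currentBlobName;
        }
        finally
        {
            rollGate.Release();
        }
    }
}

The effect is the same one the lock is after: the size check and the decision to roll happen atomically with respect to other writers, so a single full blob should only ever produce one successor file.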
Could you please consider whether this is what's happening, and whether the solution is something like adding the missing shared lock?
Thank you for your great contribution!
I know this issue is from a long time ago, but since it's still open I'd like to ask if you could review my post above, @chriswill.
The reply is a bit long only because I described my findings; the solution (or at least what I think it would be) is not that complicated.
Thanks!
I tested this scenario with the 4.0.0 release and didn't see the issues reported in this thread, so I am closing.