Does anyone want/need an S3/Azure blob cache?
This would be great assuming latency was in acceptable realms. My Imageflow cache is currently sitting on disk in an Azure App Service, but as far as I know, this sort of setup is not recommended by Microsoft.
Possibly. Azure/AWS CDN would be my go-to for caching in the first instance, falling back to internal caching, i.e. disk caching. But use of local drives is not recommended by Azure, as @ajbeaven says, so for Azure it might make sense to be able to store long-lived cached copies of resized images.
I wouldn't bother though if it was going to be time consuming to develop or maintain.
Since Azure and S3 offer object expiry, I could rely on developers to configure cache expiry and just implement cache misses and hits. That would be rather straightforward/easy to develop.
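To illustrate the object-expiry idea: developers could attach a lifecycle rule to the cache bucket so old entries expire automatically, leaving the plugin to handle only gets and puts. This is a minimal sketch; the rule ID, prefix, and 30-day window are placeholder values, not part of any actual configuration.

```json
{
  "Rules": [
    {
      "ID": "expire-imageflow-cache",
      "Filter": { "Prefix": "cache/" },
      "Status": "Enabled",
      "Expiration": { "Days": 30 }
    }
  ]
}
```

On S3 this could be applied with `aws s3api put-bucket-lifecycle-configuration`; Azure Blob Storage has an equivalent lifecycle management feature.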
Yes please! I'm strongly in favor of this, as it would greatly reduce my bandwidth bill without me having to give much consideration to local storage needs.
The storage fees per GB are incredibly low, so I'd probably just never have it remove old entries.
I was a little bored so I decided to have a go at this myself. I've taken some code from the DiskCache provider to generate the unique cache keys, and I then use the AWS SDK to get/put the objects in S3. This is all very rough around the edges (e.g. I haven't implemented any locking) hence why I'm sharing this as a gist rather than a pull request.
https://gist.github.com/AlexMedia/ccabfa4d766bc9991fad1f04af561584
Example usage (using AWSSDK.Extensions.NETCore.Setup):

```csharp
var awsOptions = Configuration.GetAWSOptions();
services.AddImageflowS3Cache(() => awsOptions.CreateServiceClient<IAmazonS3>(),
    new S3CacheOptions
    {
        BucketName = "imageflow-s3-cache",
        Prefix = "cache"
    });
```
@AlexMedia That looks great!
A couple of tips:
- You can reuse the S3 client; it's thread-safe, and Amazon actually recommends reusing it. So if you want, you can rely on the S3 client registered in the DI container.
- I don't think you really need locking; at most it would be a throughput optimization.
- I noticed you're storing but not checking the status code from GetObject. I haven't checked whether the AWS SDK already validates it.
- The big performance benefit on misses would be to return the result while uploading in the background.
In HybridCache, I use a size-bounded collection for async uploads/writes. I switch back to sync writes when I hit the configured limit so at least there is thread-based backpressure.
https://github.com/imazen/imageflow-dotnet-server/blob/main/src/Imazen.HybridCache/AsyncWriteCollection.cs https://github.com/imazen/imageflow-dotnet-server/blob/main/src/Imazen.HybridCache/AsyncWrite.cs
https://github.com/imazen/imageflow-dotnet-server/blob/main/src/Imazen.HybridCache/AsyncCache.cs#L276-L285 https://github.com/imazen/imageflow-dotnet-server/blob/main/src/Imazen.HybridCache/AsyncCache.cs#L418-L430
If you use the collection but no extra locking, you may have overlapping threads doing the same work, but at least you won't repeat work while completed results are still being uploaded.
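The "respond now, upload in the background, fall back to sync writes for backpressure" idea can be sketched roughly as below. This is purely illustrative, not the HybridCache implementation: the class, helper methods, and the in-flight limit of 8 are all hypothetical stand-ins.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical sketch: serve the freshly generated image immediately and
// upload it in the background, switching to a synchronous (awaited) upload
// once too many uploads are already in flight.
public class BackgroundUploadCache
{
    private readonly SemaphoreSlim _uploadSlots = new SemaphoreSlim(8); // assumed limit

    // Stand-ins for real S3/Azure get and put calls.
    private Task<byte[]?> TryGetFromStoreAsync(string key) => Task.FromResult<byte[]?>(null);
    private Task PutToStoreAsync(string key, byte[] data) => Task.CompletedTask;

    public async Task<byte[]> GetOrAddAsync(string key, Func<Task<byte[]>> generate)
    {
        var cached = await TryGetFromStoreAsync(key);
        if (cached != null) return cached; // cache hit

        var bytes = await generate(); // cache miss: produce the image

        if (_uploadSlots.Wait(0))
        {
            // A slot is free: fire-and-forget the upload, releasing the slot when done.
            _ = PutToStoreAsync(key, bytes)
                .ContinueWith(_ => _uploadSlots.Release());
        }
        else
        {
            // Backpressure: too many uploads in flight, so write synchronously.
            await PutToStoreAsync(key, bytes);
        }
        return bytes;
    }
}
```

The point of the semaphore is that background writes are capped: past the cap, request threads block on their own uploads, which naturally throttles how fast new work is generated.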
BTW, are you using this in production or have you done any latency testing?
I wrote it as a quick fix to keep bandwidth costs under control. A lot of the bandwidth from my S3-compatible storage account came from the bucket holding my source images, which drove up costs. By storing every resized image ever generated, I can keep those costs under control much more easily.
I have it running in production (behind Cloudflare) but it's by no means a finished product. I appreciate your input, when I have time I'll take a look at whether I can optimise the code and maybe change this into a pull request :)
having the ability to use Azure blob as a cache space will be our main incentive to move from the current image resizer to imageflow
@AlexMedia Have you done additional work on this, or do you have an updated gist?
@lilith I'm afraid I haven't, I've moved on to other projects and this has kind of fallen to the side.
Does anyone have an update on this?
I'd really like this to exist. We should think carefully about pluggability and configuration for S3 and Azure: do we want them configured as named services, or passed in via the Azure and S3 plugins and a common interface to a generic blob cache implementation? I could see functionality here going far beyond simple key:value cache storage. For example, caching the dimensions of images in an accessible container could help with HTML generation, since the width="" and height="" attributes are supposed to be specified for performance.
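One possible shape for that common interface, sketched under the assumption of the second option (provider plugins implementing a shared abstraction). The interface name and method signatures are hypothetical, not from the actual codebase.

```csharp
using System.IO;
using System.Threading;
using System.Threading.Tasks;

// Illustrative only: a provider-agnostic blob cache interface that the S3
// and Azure Blob plugins could both implement, so the generic cache logic
// never needs to know which store is behind it.
public interface IBlobCache
{
    // Returns the cached blob's stream, or null on a cache miss.
    Task<Stream?> TryGetAsync(string key, CancellationToken ct = default);

    // Stores (or overwrites) the blob under the given key.
    Task PutAsync(string key, Stream data, CancellationToken ct = default);
}
```

Extra capabilities like storing image dimensions could then be layered on top as separate metadata keys rather than baked into each provider.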
This one seems essential. Especially for apps using large source files.
I used Resizer 3 and 4 with love a long time ago. I need to brush up my knowledge of the APIs, but I would like to help if you can guide me.
@lilith What do you think about azure file shares?
@keremdemirer They might work with HybridCache as-is, if latency is low enough. Have you tried them?