imageflow-dotnet-server

Does anyone want/need an S3/Azure blob cache?

Open lilith opened this issue 5 years ago • 16 comments

lilith avatar Dec 06 '20 06:12 lilith

This would be great, assuming latency is acceptable. My Imageflow cache currently sits on disk in an Azure App Service, but as far as I know, Microsoft does not recommend this sort of setup.

ajbeaven avatar Dec 08 '20 02:12 ajbeaven

Possibly. An Azure/AWS CDN would be my go-to for caching in the first instance, falling back to internal caching, i.e. disk caching. But as @ajbeaven says, Azure does not recommend using local drives, so for Azure it might make sense to be able to store long-lived cached copies of resized images in blob storage.

I wouldn't bother though if it was going to be time consuming to develop or maintain.

JayVDZ avatar Dec 08 '20 08:12 JayVDZ

Since Azure and S3 offer object expiry, I could rely on developers to configure cache expiry and just implement cache misses and hits. That would be rather straightforward/easy to develop.
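To make the hit/miss idea concrete, here is a minimal sketch, not the Imageflow implementation. `IBlobStore`, `InMemoryBlobStore`, and `BlobCache` are hypothetical names invented for illustration; a real version would back `IBlobStore` with the S3 or Azure SDK and leave expiry to the bucket or container lifecycle rules.

```csharp
#nullable enable
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical abstraction over S3 or Azure Blob Storage; not an Imageflow or SDK type.
public interface IBlobStore
{
    Task<byte[]?> TryGetAsync(string key);   // null means "not found" (a cache miss)
    Task PutAsync(string key, byte[] data);
}

// In-memory stand-in so the sketch runs without cloud credentials.
public sealed class InMemoryBlobStore : IBlobStore
{
    private readonly ConcurrentDictionary<string, byte[]> _blobs = new();

    public Task<byte[]?> TryGetAsync(string key) =>
        Task.FromResult(_blobs.TryGetValue(key, out var data) ? data : null);

    public Task PutAsync(string key, byte[] data)
    {
        _blobs[key] = data;
        return Task.CompletedTask;
    }
}

public sealed class BlobCache
{
    private readonly IBlobStore _store;
    public BlobCache(IBlobStore store) => _store = store;

    // Hit: return the stored bytes. Miss: run the factory, store the result, return it.
    // There is no eviction logic here; expiry is assumed to be configured on the storage side.
    public async Task<(byte[] Data, bool WasHit)> GetOrCreateAsync(
        string key, Func<Task<byte[]>> factory)
    {
        var cached = await _store.TryGetAsync(key);
        if (cached != null) return (cached, true);

        var created = await factory();
        await _store.PutAsync(key, created);
        return (created, false);
    }
}
```

This version waits for the upload before returning on a miss, which is the simplest behavior but not the fastest.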

lilith avatar Dec 10 '20 20:12 lilith

Yes please! I'm a great proponent of this as it would greatly reduce my bandwidth bill, without me having to give much consideration to local storage needs.

The storage fees per GB are incredibly low, so I'd probably just never have it remove old entries.

AlexMedia avatar Mar 12 '21 15:03 AlexMedia

I was a little bored, so I decided to have a go at this myself. I've taken some code from the DiskCache provider to generate the unique cache keys, and I then use the AWS SDK to get/put the objects in S3. This is all very rough around the edges (e.g. I haven't implemented any locking), which is why I'm sharing it as a gist rather than a pull request.

https://gist.github.com/AlexMedia/ccabfa4d766bc9991fad1f04af561584

Example usage (using AWSSDK.Extensions.NETCore.Setup):

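// Read the AWS settings from the application's configuration.
var awsOptions = Configuration.GetAWSOptions();

// Register the S3 cache from the gist with a client factory and the target bucket.
services.AddImageflowS3Cache(() => awsOptions.CreateServiceClient<IAmazonS3>(),
    new S3CacheOptions
    {
        BucketName = "imageflow-s3-cache",
        Prefix = "cache"
    });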

AlexMedia avatar Mar 14 '21 14:03 AlexMedia

@AlexMedia That looks great!

A couple of tips:

  1. You can reuse the S3 client; it's thread-safe, and Amazon actually suggests reuse. So if you want, you can even rely on the S3 client registered in the DI container.
  2. I don't think you really need locking, since it would be at most a throughput optimization benefit.
  3. I saw you're storing but not checking the status code from GetObject. I haven't checked to see if AWS already validates this.
  4. The big performance benefit on misses would be to return the result while uploading in the background.

In HybridCache, I use a size-bounded collection for async uploads/writes. I switch back to sync writes when I hit the configured limit so at least there is thread-based backpressure.

https://github.com/imazen/imageflow-dotnet-server/blob/main/src/Imazen.HybridCache/AsyncWriteCollection.cs
https://github.com/imazen/imageflow-dotnet-server/blob/main/src/Imazen.HybridCache/AsyncWrite.cs

https://github.com/imazen/imageflow-dotnet-server/blob/main/src/Imazen.HybridCache/AsyncCache.cs#L276-L285
https://github.com/imazen/imageflow-dotnet-server/blob/main/src/Imazen.HybridCache/AsyncCache.cs#L418-L430

If you use the collection but no extra locking, you may have overlapping threads doing the same work, but at least you won't have repeated work when completed work is being uploaded.
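The size-bounded collection idea could be sketched roughly as follows. This is an illustration under assumptions, not the HybridCache code linked above: `BoundedAsyncWriter`, `WriteMode`, and the `upload` delegate are hypothetical names, and error handling is reduced to a placeholder.

```csharp
#nullable enable
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public enum WriteMode { Background, Synchronous, AlreadyQueued }

// Hypothetical sketch of a size-bounded queue for background uploads.
public sealed class BoundedAsyncWriter
{
    private readonly long _maxQueuedBytes;
    private long _queuedBytes;
    // Results whose uploads are still in flight, keyed by cache key.
    private readonly ConcurrentDictionary<string, byte[]> _pending = new();

    public BoundedAsyncWriter(long maxQueuedBytes) => _maxQueuedBytes = maxQueuedBytes;

    // Lets a reader serve a result that is still uploading instead of regenerating it.
    public bool TryGetPending(string key, out byte[]? data) =>
        _pending.TryGetValue(key, out data);

    public async Task<WriteMode> WriteAsync(
        string key, byte[] data, Func<string, byte[], Task> upload)
    {
        // Reserve space; if that exceeds the limit, release it and upload synchronously.
        // The caller then waits for the upload, which provides backpressure.
        if (Interlocked.Add(ref _queuedBytes, data.Length) > _maxQueuedBytes)
        {
            Interlocked.Add(ref _queuedBytes, -data.Length);
            await upload(key, data);
            return WriteMode.Synchronous;
        }

        // Another caller already queued this key: release the reservation and skip.
        if (!_pending.TryAdd(key, data))
        {
            Interlocked.Add(ref _queuedBytes, -data.Length);
            return WriteMode.AlreadyQueued;
        }

        // Upload in the background so the caller can return the result immediately.
        _ = Task.Run(async () =>
        {
            try { await upload(key, data); }
            catch { /* placeholder: a real implementation would log or retry */ }
            finally
            {
                _pending.TryRemove(key, out _);
                Interlocked.Add(ref _queuedBytes, -data.Length);
            }
        });
        return WriteMode.Background;
    }
}
```

As in the description above, two threads can still generate the same image concurrently, but once one result is queued, readers can check `TryGetPending` before redoing the work.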

lilith avatar Mar 30 '21 02:03 lilith

BTW, are you using this in production or have you done any latency testing?

lilith avatar Mar 30 '21 02:03 lilith

I wrote it as a quick-fix solution to keep bandwidth costs under control. I saw that a lot of the bandwidth from my S3-compatible storage account came from the bucket that holds my images, which had an effect on costs. By storing every resized image ever generated, I can keep those costs under control more easily.

I have it running in production (behind Cloudflare), but it's by no means a finished product. I appreciate your input. When I have time, I'll take a look at whether I can optimise the code and maybe change this into a pull request :)

AlexMedia avatar Mar 30 '21 08:03 AlexMedia

Having the ability to use Azure Blob as a cache space would be our main incentive to move from the current image resizer to Imageflow.

rudym avatar Feb 09 '22 04:02 rudym

@AlexMedia Have you done additional work on this, or do you have an updated gist?

lilith avatar May 01 '22 18:05 lilith

@lilith I'm afraid I haven't, I've moved on to other projects and this has kind of fallen to the side.

AlexMedia avatar May 04 '22 09:05 AlexMedia

Does anyone have an update on this?

keremdemirer avatar Mar 30 '23 18:03 keremdemirer

I'd really like this to exist. We should think carefully about pluggability and configuration for S3 and Azure: do we want them configured as named services, or passed in based on the Azure and S3 plugins and a common interface to a generic Blob Cache implementation? I could see functionality here that goes far beyond simple key:value cache storage. For example, caching the dimensions of images to an accessible container could help with HTML generation, since the width="" and height="" attributes are supposed to be specified for performance.

lilith avatar Mar 30 '23 20:03 lilith

This one seems essential, especially for apps using large source files.

I used resizer 3 and 4 with love a long time ago. I need to brush up on my knowledge of the APIs, but I would like to help if you can guide me.

keremdemirer avatar Mar 30 '23 21:03 keremdemirer

@lilith What do you think about Azure file shares?

keremdemirer avatar Apr 15 '23 18:04 keremdemirer

@keremdemirer They might work with HybridCache as-is, if latency is low enough. Have you tried them?

lilith avatar Aug 09 '23 15:08 lilith