Add image proxy

Open hschne opened this issue 2 months ago • 2 comments

This is a follow-up to #1049. Loading GitHub avatars en masse results in rubyevents.org getting rate-limited by GitHub. To avoid this, we would want to serve the images ourselves. This could be accomplished by using ActiveStorage (as suggested in #881).

This PR instead adds a proxy for image requests to GitHub. Images are downloaded once and served from cache afterwards. This is much easier to set up, but has similar downsides to the ActiveStorage solution.

Considerations

  1. Initially, loading images is slow because we request them from GitHub. Once the cache is populated, requests are speedy enough.
  2. I opted to query for the user, which allows us to centralize avatar serving (by caching fallback images), but this adds a small overhead to serving images.
  3. We might serve stale avatars, which is why TTL is set to 1 day. Downloading/serving images via ActiveStorage has this issue too. We could manually invalidate the cache under certain conditions (e.g. when the user updates their profile/relinks their GitHub...) but I don't think that's worth the effort.
  4. Follow-ups include refactoring the user model & avatar helper, or maybe proxying images from Bsky as well :thinking:
  5. Heads up, locally Rack Mini Profiler screws with the browser cache, so images are never served from the browser cache locally.
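
To make the considerations above concrete, here is a minimal sketch of the cache-then-fetch idea in plain Ruby. It is not the code from this PR: `AvatarCache`, its `fetcher`, `fallback`, and `clock` parameters, and the in-memory hash are all hypothetical names chosen for illustration, and a real Rails app would more likely lean on `Rails.cache`. It shows the one-day TTL and the caching of a fallback image when the upstream request fails.

```ruby
# Hypothetical in-memory avatar cache; all names are illustrative assumptions.
class AvatarCache
  Entry = Struct.new(:body, :expires_at)

  DEFAULT_TTL = 24 * 60 * 60 # one day, in seconds

  # fetcher:  callable that takes a URL and returns image bytes (may raise or return nil)
  # fallback: bytes served, and cached, when the upstream fetch fails
  # clock:    callable returning the current time, injectable for testing
  def initialize(fetcher:, fallback:, ttl: DEFAULT_TTL, clock: -> { Time.now })
    @fetcher = fetcher
    @fallback = fallback
    @ttl = ttl
    @clock = clock
    @entries = {}
  end

  def fetch(url)
    now = @clock.call
    entry = @entries[url]
    # Serve from cache while the entry is still fresh.
    return entry.body if entry && entry.expires_at > now

    body = begin
      @fetcher.call(url)
    rescue StandardError
      nil # treat upstream errors (e.g. rate limiting) as a miss
    end
    body ||= @fallback

    # Cache the result, including the fallback, so failures are not retried on every request.
    @entries[url] = Entry.new(body, now + @ttl)
    body
  end
end

# Usage with a stand-in fetcher instead of a real HTTP call:
upstream_calls = 0
stub_fetcher = ->(url) { upstream_calls += 1; "bytes-for-#{url}" }
avatar_cache = AvatarCache.new(fetcher: stub_fetcher, fallback: "default-avatar-bytes")
avatar_cache.fetch("https://example.com/avatar.png") # goes upstream
avatar_cache.fetch("https://example.com/avatar.png") # served from the cache
```

Caching the fallback has a trade-off: a user whose avatar failed to load once keeps the default image until the TTL expires, which is the same staleness window described in point 3.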

Alternatives

Apart from using ActiveStorage, we could use something like Cloudflare workers to do much the same thing, as outlined here. I don't know enough about rubyevents.org's infrastructure to know if that's viable.

hschne avatar Oct 24 '25 08:10 hschne

@adrienpoly Here's a prototype implementation for cache-based GitHub avatars, following up on your previous comment.

hschne avatar Oct 24 '25 08:10 hschne

That is an interesting solution. I will probably push it to staging to test it in real life. Still, I am concerned that visiting the speakers page will generate a flood of requests for the server to handle, when we would probably want those assets to be static (or nearly so). Even if they are cached, they still require one web request per image, and that might be slow.

If we went the ActiveStorage route, we would need to be sure the storage provider supports public routes, so that we can serve the assets directly from storage without going through the web server.

I'll also look at the Cloudflare solution in detail. We have Cloudflare, but I am not a big expert on Workers.

Thanks for opening this PR to explore solutions.

adrienpoly avatar Oct 24 '25 21:10 adrienpoly

Alternatively, using imgproxy could be an option. That would allow all third party images to be proxied and cached.

They have an official RubyGem with optional ActiveStorage integration: https://github.com/imgproxy/imgproxy.rb

Imgproxy.url_for("http://images.example.com/images/image.jpg")

That would let the caching overhead be offloaded to a separate process or even a separate VM. It supports signed URLs, which prevents the image proxy from becoming an open proxy…
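
For anyone curious how signed URLs keep a proxy closed, here is a rough sketch of HMAC-based path signing. It follows my understanding of imgproxy's documented scheme (HMAC-SHA256 over salt plus path, with hex-encoded key and salt, encoded as unpadded URL-safe Base64), but please verify against the official docs before relying on it. `SignedProxyUrl`, the example key and salt, and the example path are all hypothetical; in practice the gem should take care of signing once it is configured with a key and salt.

```ruby
require "openssl"

# Hypothetical helper for illustration; not part of the imgproxy gem.
module SignedProxyUrl
  module_function

  # URL-safe Base64 without padding, using only core String/Array methods.
  def encode(bytes)
    [bytes].pack("m0").tr("+/", "-_").delete("=")
  end

  # Prefix the path with an HMAC-SHA256 signature over salt + path.
  # key_hex and salt_hex are hex-encoded secrets shared with the proxy.
  def sign(path, key_hex:, salt_hex:)
    key = [key_hex].pack("H*")
    salt = [salt_hex].pack("H*")
    digest = OpenSSL::HMAC.digest(OpenSSL::Digest.new("SHA256"), key, salt + path.b)
    "/#{encode(digest)}#{path}"
  end

  # Recompute the signature for the unsigned remainder and compare in constant time.
  def valid?(signed_path, key_hex:, salt_hex:)
    _, signature, rest = signed_path.split("/", 3)
    return false if signature.nil? || rest.nil?

    expected = sign("/#{rest}", key_hex: key_hex, salt_hex: salt_hex)
    OpenSSL.secure_compare(expected, signed_path)
  end
end

# Example secrets and path are made up for this sketch.
example_key = "736563726574"
example_salt = "68656c6c6f"
example_path = "/rs:fit:64:64/plain/https://example.com/avatar.png"
signed_path = SignedProxyUrl.sign(example_path, key_hex: example_key, salt_hex: example_salt)
SignedProxyUrl.valid?(signed_path, key_hex: example_key, salt_hex: example_salt)
```

Because the signature covers the whole path, changing the source URL or the resize options without the key produces a path the proxy can reject.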

It would be similar to how GitHub handles external images in READMEs and other markdown content.

matiaskorhonen avatar Nov 26 '25 11:11 matiaskorhonen

I used Imgproxy in the past, can recommend. Running it should be easy peasy with Kamal, not sure about monitoring it though. In any case, more services always means more work ☠️

It's definitely a great idea if we want to go with a 'grown-up' solution.

hschne avatar Nov 26 '25 16:11 hschne