openverse-api

Random image endpoint

Open zackkrida opened this issue 3 years ago • 6 comments

Problem

Many of the images in Openverse, particularly from Flickr, have never actually been viewed by someone before.

Description

It could be compelling to help people find images that have never been viewed before. A /random endpoint could be a cool way to do this. We could use the random_score function in Elasticsearch to facilitate this.

zackkrida, Mar 04 '22 16:03

Are we psychically connected? I was thinking about this last night!

It'd be nice to add some optional parameters to narrow it down. For example, it'd be cool to be able to do /images/random?providers=... or, even better, /images/random?provider_group=glam. Maybe both!

Whatever we use could also be used to create a random/daily endpoint as well that caches the response for 24 hours. This could power a "Picture of the day" kind of feature (could be used to make Openverse a provider for KDE's POTD desktop plugin or those various "new tab" plugins that show a nice new picture every day).
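A minimal sketch of the 24-hour caching idea, assuming a hypothetical `fetch` callable that returns one random image. `DailyCache`, `fetch`, and the injectable `clock` are illustrative names and not part of the Openverse API; a real deployment would more likely use a shared cache than process memory.

```python
import time

SECONDS_PER_DAY = 24 * 60 * 60


class DailyCache:
    """Holds one value and refetches it once the time-to-live has elapsed."""

    def __init__(self, fetch, ttl=SECONDS_PER_DAY, clock=time.time):
        self._fetch = fetch      # hypothetical callable returning a random image
        self._ttl = ttl          # lifetime of the cached value, in seconds
        self._clock = clock      # injectable so the expiry logic can be tested
        self._value = None
        self._expires_at = None  # None means nothing has been cached yet

    def get(self):
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            # First call, or the cached value is older than the TTL: refresh it.
            self._value = self._fetch()
            self._expires_at = now + self._ttl
        return self._value
```

Every caller within the same 24-hour window receives the same image, which is what a "Picture of the day" consumer would expect.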

sarayourfriend, Mar 04 '22 16:03

It might be possible to use the same set of filters as search for this random image endpoint. Filters like resolution and maybe even license, considering how attribution is not viable in wallpapers, would be very useful.

dhruvkb, Mar 05 '22 12:03

In https://github.com/WordPress/openverse-api/pull/554#issuecomment-1065142995 @zackkrida noted that the license and license_type parameters can conflict in some sense. If you use the commercial license type filter but then add a non-commercial license filter, what should it do?

My gut tells me that the license_type is essentially a mask of licenses and it should just do something like this:

licenses_to_search = get_licenses_for_group(license_type) + licenses_from_param

And call it a day. This eliminates the potential for conflict and the need to resolve anything, and it allows user-refined searches that still benefit from the concept of filter groups.
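A rough sketch of that union, assuming a hypothetical `LICENSE_GROUPS` mapping and a comma-separated `license` parameter. The group contents below are illustrative placeholders, not the API's actual definitions, and a set is used in place of list concatenation so that overlapping licenses are not duplicated.

```python
# Hypothetical mapping of license_type groups to licenses (illustrative values only).
LICENSE_GROUPS = {
    "commercial": {"by", "by-sa", "by-nd", "cc0", "pdm"},
    "modification": {"by", "by-sa", "by-nc", "by-nc-sa", "cc0", "pdm"},
}


def get_licenses_for_group(license_type):
    """Return the licenses in a group, or an empty set for an unknown group."""
    return set(LICENSE_GROUPS.get(license_type, set()))


def licenses_to_search(license_type=None, licenses_param=None):
    """Union the group's licenses with any explicitly requested ones."""
    licenses = set()
    if license_type:
        licenses |= get_licenses_for_group(license_type)
    if licenses_param:
        # Split the comma-separated parameter, ignoring blanks and letter case.
        licenses |= {
            name.strip().lower()
            for name in licenses_param.split(",")
            if name.strip()
        }
    return sorted(licenses)
```

With these placeholder groups, `licenses_to_search("commercial", "by-nc")` returns the commercial set plus `by-nc`, so the explicit parameter widens the group instead of conflicting with it.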

For example, I believe the following should be a valid search (and I can easily imagine a use case for it):

/v1/images/?provider_group=glam&provider=non-glam-provider

You could want the GLAM designated providers and also to include a specific provider that is not in that group for whatever reason.

sarayourfriend, Mar 11 '22 15:03

Did some research on this and built a local demo to test the functionality. The key is to use a function_score query which wraps the original search query and then multiplies the random scores with the document's own scores, effectively shuffling them.

s.query = Q("function_score", query=s.query, random_score={})
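The one-liner above appears to use elasticsearch-dsl's `Q` helper. For reference, here is a sketch of the equivalent raw query body built as a plain dictionary. The `randomize` helper and its `seed` and `field` arguments are illustrative assumptions, not code from the demo; `boost_mode` is spelled out even though `multiply` is the Elasticsearch default.

```python
def randomize(query, seed=None, field="_seq_no"):
    """Wrap a query body in a function_score query with random scoring."""
    random_score = {}
    if seed is not None:
        # A fixed seed makes the ordering repeatable; recent Elasticsearch
        # versions expect a field to be named alongside the seed.
        random_score = {"seed": seed, "field": field}
    return {
        "function_score": {
            "query": query,                # the original search query
            "random_score": random_score,  # per-document random score
            "boost_mode": "multiply",      # random score times the query score
        }
    }


# Example: shuffle the results of a simple match query.
body = {"query": randomize({"match": {"title": "sunset"}})}
```

Leaving `seed` unset gives a different order on each request, while passing a stable seed (for example, one derived from the current date) would keep the order fixed for as long as the seed stays the same.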

dhruvkb, May 15 '22 09:05

@dhruvkb PR time! 😆 Hah, kidding mostly, but I really would love to look into how much work this would take to expand on and launch. The documentation for Unsplash's similar endpoint might be helpful there.

zackkrida, May 16 '22 21:05

I have it working pretty well locally using the code snippet above, but I'm waiting for #696 and #699 to be merged first.

dhruvkb, May 17 '22 04:05