openverse icon indicating copy to clipboard operation
openverse copied to clipboard

xeno-canto (bird sounds from around the world)

Open sarayourfriend opened this issue 2 years ago • 6 comments

Provider API Endpoint / Documentation

https://xeno-canto.org/explore/api (although may need to be avoided, see technical info below)

Provider description

Example search for sardinian warbler, you can see most if not all recordings are CC licensed: https://xeno-canto.org/explore?query=Sylvia%20melanocephala

Licenses Provided

CC generally, most appear to BY-SA-NC, ~though v3 for all of them as far as I can tell~ (this turned out to be wrong, they link to v3 licenses on their terms of service page but in actuality recordings use a diversity of versions). More info on terms of use page: https://xeno-canto.org/about/terms

Provider API Technical info

From the terms of usage page:

Server Resources

xeno-canto runs a server with specifications appropriate for rather intensive use by many users at the same time. Unfortunately the server cannot usually accomodate indiscriminate automated requests such as mass downloads of pages or files. Such use of the site is (actively) discouraged especially if it deteriorates the user experience or if it interferes with site maintenance. Requests for the transfer of large amounts of data, for any use allowed by the license, are of course welcome at the contact address below.

Therefore, we should not use the API. But along with developing some kind of "data dump" process for WordPress/openverse#1608, whatever process we use for that could be used to include some kind of data dump provided by xeno-canto. I will contact them and see if they're open to the possibility of providing us with a data dump (or if a method for ingesting their catalog already exists).

The checklist below is left incomplete until I've verified with xeno-canto that it's possible to ingest their data somehow.

Checklist to complete before beginning development

  • [ ] Verify there is a way to retrieve the entire relevant portion of the provider's collection in a systematic way via their API.
  • [ ] Verify the API provides license info (license type and version; license URL provides both, and is preferred)
  • [ ] Verify the API provides stable direct links to individual works.
  • [ ] Verify the API provides a stable landing page URL to individual works.
  • [ ] Note other info the API provides, such as thumbnails, dimensions, attribution info (required if non-CC0 licenses will be kept), title, description, other meta data, tags, etc.
  • [ ] Attach example responses to API queries that have the relevant info.

Implementation

  • [ ] 🙋 I would be interested in implementing this feature.

sarayourfriend avatar May 20 '22 17:05 sarayourfriend