taxonworks
Idea: show GBIF "enrichments"
Feature or enhancement
Data published through GBIF go through a series of checks and "enrichments", such as aligning scientificNames to the complete name as published in the nomenclature databases, flagging issues like a country coordinate mismatch, and detecting related records such as herbarium duplicates which may carry more data (e.g. images) or have a different identification/determination.
I gather TaxonWorks now makes it possible to publish to GBIF. It would be a fairly simple integration on specimen pages to present the user with the observations GBIF makes on the published records. These could be pulled using the /occurrence/N, /occurrence/N/verbatim and /occurrence/N/experimental/related APIs. If the GBIF ID (N) is not known, the datasetKey and local occurrenceID can be used, e.g. like this. GBIF could provide a different endpoint that provides it all (JSON or HTML) if useful.
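A minimal sketch of the three per-record URLs named above, for anyone wiring this into a specimen page. The base URL `https://api.gbif.org/v1` and the dictionary labels (`interpreted`, `verbatim`, `related`) are assumptions for illustration. The function only builds URLs and does not make any requests.

```python
# Assumption: the public GBIF API base URL; the issue text only names the paths.
GBIF_API = "https://api.gbif.org/v1"


def enrichment_urls(gbif_id, base=GBIF_API):
    """Return the per-record URLs for a known GBIF ID (the N in /occurrence/N).

    The keys are hypothetical labels chosen for this sketch.
    """
    record = f"{base}/occurrence/{gbif_id}"
    return {
        "interpreted": record,                         # /occurrence/N
        "verbatim": f"{record}/verbatim",              # /occurrence/N/verbatim
        "related": f"{record}/experimental/related",   # /occurrence/N/experimental/related
    }
```

A page could fetch each URL independently, so a failure of the experimental related-records call need not block display of the other two views.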
If it is of interest we would be happy to discuss how to do this, and how it should look, and offer a PR with implementation.
CC @mjy
Location
Specimen pages
Screenshot, napkin sketch of interface, or conceptual description
No response
Your role
Aggregator of content, looking for ways to give back
@timrobertson100 Thanks. Jim Beach's example of this yesterday got me thinking about it as well.
I think we're actually going to spin this off as a software-agnostic module, as a (more than) proof of concept. The idea will be Google Maps widget style: anyone can add it to their page after including the JS, whether through their pipeline, via CDN, etc.
This will further explore our thought of a DwC "vector" (data + headers) being a key UI driving data structure. I imagine that a minimal initialization will be possible with a single configuration option, among many other ways:
<div id='my_comparsion' class='vue_dwc_enrichments'
data-source-dwc="https://api.taxonworks.org/api/v1/collection_objects/123?extend[]=dwc_fields"
data-option-1
data-option-n
>
@jlpereira is on holiday, we'll start poking at this for our updated CollectionObject page when he gets back.
Thanks, @mjy
One idea could be GBIF offers this kind of JS widget that provides a view of the record that was published, and the enrichments we apply. You may prefer to make it a TaxonWorks-specific thing or even software agnostic, bringing in more than just GBIF, like the Specify example Jim showed. It might make sense for TW to do one, and still for GBIF to offer this as standard for other publishers.
Either way, let us know if we can help
Of course we'd love it if GBIF built widgets like this that would work in JS pipelines. We've spun off a handful of key internal libraries ourselves (see below for a couple examples), and it's in our mission to try and do this where possible, so I don't mind working on a very basic version of this one.
- https://github.com/SpeciesFileGroup/sled
- https://github.com/SpeciesFileGroup/svg_radial_menu
- https://github.com/SpeciesFileGroup/taxonworks_autocomplete
- https://github.com/SpeciesFileGroup/waxy
So maybe we'll try it and GBIF could refine, add code, fork it, use it for inspiration etc. as they see fit. Or if you beat us to it and provide an npm module we'll just use that.
@timrobertson100 Is there an API endpoint like /v1/occurence/:occurrenceId? We want to hit a RESTful resource using a unique id to get a single record.
@timrobertson100 Is it possible to get the original data from GBIF, not just the interpreted?
Is there an API endpoint like /v1/occurence/:occurrenceId?
Because occurrenceID from the publisher isn’t unique, you need the datasetKey as well, so .../occurrence/<datasetKey>/<occurrenceID> such as this example.
You can find datasetKeys using the registry API. If you don't have those, then you'd need to search using .../occurrence/search?occurrenceID=123 but you'd need to disambiguate results when there is more than one.
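The two lookup routes described above could be sketched roughly as follows. The base URL, the function names, and the assumption that a search response is a JSON object with a `results` list are not from this thread. Disambiguation here is done by matching extra fields the caller already knows (for example `institutionCode`), which is only one possible approach.

```python
from typing import Optional
from urllib.parse import quote, urlencode

# Assumption: the public GBIF API base URL; the thread elides it as "...".
GBIF_API = "https://api.gbif.org/v1"


def direct_lookup_url(dataset_key, occurrence_id, base=GBIF_API):
    """URL of the form .../occurrence/<datasetKey>/<occurrenceID>."""
    # occurrenceID is publisher-defined and may contain ':' or '/', so escape it.
    return f"{base}/occurrence/{quote(dataset_key, safe='')}/{quote(occurrence_id, safe='')}"


def search_url(occurrence_id, base=GBIF_API):
    """Fallback when the datasetKey is unknown: search by occurrenceID alone."""
    return f"{base}/occurrence/search?{urlencode({'occurrenceID': occurrence_id})}"


def pick_record(search_response: dict, expected: dict) -> Optional[dict]:
    """Return the single result whose fields all match `expected`, else None.

    Assumes the response carries its hits under a "results" key.
    """
    matches = [
        record
        for record in search_response.get("results", [])
        if all(record.get(field) == value for field, value in expected.items())
    ]
    # Refuse to guess when zero or several records still match.
    return matches[0] if len(matches) == 1 else None
```

Returning `None` on an ambiguous match leaves the decision to the caller rather than silently picking the first hit.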
@timrobertson100 Is it possible to get the original data from GBIF, not just the interpreted?
Yes. We have 3 versions of the record:
- The raw view that the publisher provided, as picked up on our crawling stream: .../occurrence/gbifID/fragment. This may return JSON text for DwC-A, or XML, or possibly other text data. It's intended mainly for diagnostics.
- The verbatim view, which captures the raw data (1) reformatted into Darwin Core without interpretation beyond what is needed to represent it in DwC: .../occurrence/gbifID/verbatim
- The interpreted view, which you will be most familiar with: .../occurrence/gbifID or the method above
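To show what GBIF changed, a widget could compare the verbatim and interpreted views field by field. The sketch below works on two already-fetched JSON objects. It assumes the verbatim view keys its fields by full term URI while the interpreted view uses the short term name. That is an assumption to check against the API docs, and `local_name` and `diff_views` are hypothetical helper names.

```python
def local_name(term: str) -> str:
    """Reduce a term URI to its last path segment; short names pass through unchanged."""
    return term.rstrip("/").rsplit("/", 1)[-1]


def diff_views(verbatim: dict, interpreted: dict) -> dict:
    """Map each shared field to (verbatim_value, interpreted_value) where the two differ.

    Values are compared as strings, so a verbatim "12.5" and an interpreted 12.5
    count as unchanged.
    """
    flattened = {local_name(key): value for key, value in verbatim.items()}
    changes = {}
    for field, raw_value in flattened.items():
        if field in interpreted and str(interpreted[field]) != str(raw_value):
            changes[field] = (raw_value, interpreted[field])
    return changes
```

Fields present in only one view are ignored here. A fuller comparison would probably also want to surface those, since interpretation can add fields the publisher never supplied.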
The API docs are also here.
Please say if you need more. Thanks.
@timrobertson100
Because occurrenceID is unique,
Do you mean not (globally) unique?
Yes. Sorry about that. Corrected above.
The gbifID is unique, but what publishers provide in occurrenceID isn’t necessarily unique.
@timrobertson100 https://github.com/SpeciesFileGroup/gbifference.
We have that widget going live in 0.30.0 this week. More docs coming there. Closing this for issue tracking there.
Thanks for letting me know - and congrats. Please ping us if you need anything changed.