TG2-AMENDMENT_TYPESTATUS_STANDARDIZED
TestField | Value |
---|---|
GUID | b3471c65-b53e-453b-8282-abfa27bf1805 |
Label | AMENDMENT_TYPESTATUS_STANDARDIZED |
Description | Proposes an amendment to the value of dwc:typeStatus using the bdq:sourceAuthority. |
TestType | Amendment |
Darwin Core Class | dwc:Occurrence |
Information Elements ActedUpon | dwc:typeStatus |
Information Elements Consulted | |
Expected Response | EXTERNAL_PREREQUISITES_NOT_MET if the bdq:sourceAuthority is not available; INTERNAL_PREREQUISITES_NOT_MET if dwc:typeStatus is bdq:Empty; AMENDED the value of the first word in each \| delimited portion of dwc:typeStatus if it can be unambiguously matched to a term in the bdq:sourceAuthority; otherwise NOT_AMENDED. |
Data Quality Dimension | Conformance |
Term-Actions | TYPESTATUS_STANDARDIZED |
Parameter(s) | bdq:sourceAuthority |
Source Authority | bdq:sourceAuthority default = "GBIF TypeStatus Vocabulary" {[https://api.gbif.org/v1/vocabularies/TypeStatus]} {dwc:typeStatus vocabulary API [https://api.gbif.org/v1/vocabularies/TypeStatus/concepts]} |
Specification Last Updated | 2024-11-11 |
Examples | [dwc:typeStatus="Holo.": Response.status=AMENDED, Response.result=dwc:typeStatus="Holotype", Response.comment="dwc:typeStatus found in the bdq:sourceAuthority"] [dwc:typeStatus="x": Response.status=NOT_AMENDED, Response.result="", Response.comment="dwc:typeStatus not found in the bdq:sourceAuthority"] |
Source | TG2 |
References | |
Example Implementations (Mechanisms) | |
Link to Specification Source Code | |
Notes | Valuable for data quality needs related to voucher specimens in natural science collections. Almost all occurrence data will have no value in dwc:typeStatus. For reference, a vocabulary of synonyms can be found for dwc:typeStatus at [https://registry.gbif.org/vocabulary/TypeStatus/concepts]. |
@chicoreus @tucotuco I'm not sure whether the link I have for the API is an actual API, or whether one exists at all (https://gbif.github.io/parsers/apidocs/org/gbif/api/vocabulary/TypeStatus.htm), hence the NEEDS WORK label.
@CecSve mentioned in #284 that GBIF is working on the typeStatus vocabulary (https://github.com/gbif/vocabulary/issues/87). Flagging this here.
Changed to Immature/Incomplete pending development of Vocabulary by GBIF
GBIF has a vocabulary, it just isn't accessible via API from the vocabulary server. Implementations don't necessarily need an API to function. In fact, they would be more efficient or much more efficient without API calls, depending on how they were implemented. In other words, I do not think that having API access to a controlled vocabulary is a requirement for implementation, but having a controlled vocabulary is.
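To illustrate the point about not needing API calls, here is a minimal sketch of an in-memory lookup built from a local copy of a controlled vocabulary. The `VOCABULARY` excerpt, its alternative spellings, and the helper names `normalize`, `build_lookup` and `match_term` are all assumptions made for illustration; they are not taken from the GBIF vocabulary or from any existing implementation.

```python
# Hypothetical, hand-written excerpt: preferred term -> alternative spellings.
# A real implementation would load the full list from a local copy of the
# bdq:sourceAuthority instead of hard-coding it.
VOCABULARY = {
    "Holotype": ["holo", "holotypus"],
    "Lectotype": ["lecto", "lectotypus"],
    "Paratype": ["para", "paratypus"],
}


def normalize(word):
    """Lowercase and drop surrounding whitespace and trailing punctuation."""
    return word.strip().rstrip(".,;:").lower()


def build_lookup(vocabulary):
    """Map each normalized spelling to the set of preferred terms it may denote."""
    lookup = {}
    for preferred, alternatives in vocabulary.items():
        for spelling in [preferred, *alternatives]:
            lookup.setdefault(normalize(spelling), set()).add(preferred)
    return lookup


LOOKUP = build_lookup(VOCABULARY)


def match_term(word):
    """Return the preferred term for word, or None if unknown or ambiguous."""
    candidates = LOOKUP.get(normalize(word), set())
    return next(iter(candidates)) if len(candidates) == 1 else None
```

Keeping a set of candidates per spelling is one way to honour "unambiguously matched": a spelling that maps to more than one preferred term simply returns no match.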
I am happy with that @tucotuco. Any comments @chicoreus?
Changed to CORE and deleted some wording from Notes. Left as "NEEDS WORK" following discussion with @chicoreus on need for MEASURE test. More discussion needed.
Not sure that this is tractable. The expectation for values in dwc:typeStatus is a pipe delimited list of {type status term of taxon name {publication}}. The definition explicitly includes the taxon name as part of the expected value: "A list (concatenated and separated) of nomenclatural types (type status, typified scientific name, publication) applied to the subject."
One example includes citation information, the other just type status term and taxon name.
For just type status terms and taxon names, we could probably manage with two source authorities, one for the type status term and one for the taxon name, but with publication citations included, that will not be tractable.
We might get away with conforming the first word of each pipe delimited block to a type status term vocabulary.
Examples in Darwin Core are:
holotype of Ctenomys sociabilis. Pearson O. P., and M. I. Christie. 1985. Historia Natural, 5(37):388
holotype of Pinus abies | holotype of Picea abies
We might also make a change term request for dwc:typeStatus and see if that flies.
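As a rough sketch of what "the first word of each pipe delimited block" would mean for the two Darwin Core examples quoted above, the following splits a value on the pipe and separates the first word from the rest. The function name `split_type_status` is hypothetical, and treating any run of whitespace as the word boundary is an assumption.

```python
def split_type_status(type_status):
    """Split dwc:typeStatus on '|' and separate each portion's first word from the rest."""
    parts = []
    for portion in type_status.split("|"):
        words = portion.strip().split(maxsplit=1)
        if not words:
            continue  # skip empty portions, e.g. from a trailing pipe
        first = words[0]
        rest = words[1] if len(words) > 1 else ""
        parts.append((first, rest))
    return parts


print(split_type_status("holotype of Pinus abies | holotype of Picea abies"))
# [('holotype', 'of Pinus abies'), ('holotype', 'of Picea abies')]
```

Only the first element of each pair would be compared against a type status vocabulary; the taxon name and any citation stay untouched in the second element.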
Interesting - perhaps we need to do what @tucotuco suggests. Originally, I thought we were just checking against a list of types of Types regardless of other data such as the taxon and the publication. We generally look at terms in isolation, but I hadn't realised that Darwin Core included the taxon name and publication. That certainly makes it a lot more difficult, and I wonder if it is still worth keeping (as CORE at least - possibly as SUPPLEMENTARY). I believe our original thought was just to test whether the type of type was included in a vocabulary - holotype, neotype, lectotype, etc. (i.e. as in https://rs.gbif.org/vocabulary/gbif/type_status_2021-01-18.xml). My suggestion would be to drop this test, as I can't think of another way to word it so that it is consistent with Darwin Core - i.e. taking just the first part of the Darwin Core definition ("A list (concatenated and separated) of nomenclatural types (type status") without the second part. Perhaps the suggestion by @tucotuco or a new Darwin Core term - but it is too late for that for us.
Perhaps do what @tucotuco suggests and in the meantime drop to Incomplete/Immature.
Alternative is to split into parts by the pipe character and evaluate the first word of each part.
Perhaps something like:
EXTERNAL_PREREQUISITES_NOT_MET if the bdq:sourceAuthority is not available; INTERNAL_PREREQUISITES_NOT_MET if dwc:typeStatus is EMPTY; AMENDED the value of the first word in each | delimited portion of dwc:typeStatus if it can be unambiguously matched to a term in bdq:sourceAuthority; otherwise NOT_AMENDED
Also, there is this open issue which we can support. https://github.com/tdwg/dwc/issues/28
@chicoreus - your suggestion seems reasonable and workable. As discussed under https://github.com/tdwg/dwc/issues/28, a lot of databases hold just the type of Type under typeStatus. I see a good case for us to support the DwC proposal, but in the meantime use the pipe suggestion of @chicoreus.
With general agreement, I am changing the Expected Response from
EXTERNAL_PREREQUISITES_NOT_MET if the bdq:sourceAuthority is not available; INTERNAL_PREREQUISITES_NOT_MET if dwc:typeStatus is EMPTY; AMENDED the value of dwc:typeStatus if it can be unambiguously matched to a term in bdq:sourceAuthority; otherwise NOT_AMENDED
to
EXTERNAL_PREREQUISITES_NOT_MET if the bdq:sourceAuthority is not available; INTERNAL_PREREQUISITES_NOT_MET if dwc:typeStatus is EMPTY; AMENDED the value of the first word in each | delimited portion of dwc:typeStatus if it can be unambiguously matched to a term in bdq:sourceAuthority; otherwise NOT_AMENDED
and updating Specification Last Updated
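One possible reading of the revised Expected Response, sketched as a function. This is not a reference implementation: `SAMPLE_TERMS` is a hypothetical stand-in for the bdq:sourceAuthority, passing `terms=None` is an assumed way of modelling "not available", and stripping trailing punctuation, matching case-insensitively, and rejoining portions with " | " are all assumptions.

```python
# Hypothetical stand-in for the bdq:sourceAuthority:
# normalized spelling -> preferred term.
SAMPLE_TERMS = {
    "holo": "Holotype", "holotype": "Holotype",
    "lecto": "Lectotype", "lectotype": "Lectotype",
}


def amend_type_status(type_status, terms=SAMPLE_TERMS):
    """Return (status, amended value or None) for a dwc:typeStatus value."""
    if terms is None:  # assumption: None models an unavailable source authority
        return ("EXTERNAL_PREREQUISITES_NOT_MET", None)
    if type_status is None or not type_status.strip():
        return ("INTERNAL_PREREQUISITES_NOT_MET", None)
    changed = False
    portions = []
    for portion in type_status.split("|"):
        # Separate the first word; the taxon name and citation stay as they are.
        words = portion.strip().split(maxsplit=1)
        if words:
            key = words[0].rstrip(".,;:").lower()
            preferred = terms.get(key)
            if preferred is not None and preferred != words[0]:
                words[0] = preferred
                changed = True
        portions.append(" ".join(words))
    if not changed:
        return ("NOT_AMENDED", None)
    return ("AMENDED", " | ".join(portions))
```

With these sample terms, `amend_type_status("Holo.")` gives `("AMENDED", "Holotype")` and `amend_type_status("x")` gives `("NOT_AMENDED", None)`, in line with the two examples in the specification table. Portions whose first word is not recognised are passed through unchanged rather than blocking the amendment of the others; whether that is the intended behaviour is still open.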
I wonder if it should be "value of the first word in the first | delimited portion" rather than "value of the first word in each | delimited portion"
In each portion, as each portion is expected to be a string in the form {typestatus} of {scientific name} {publication}.
Some specimens are types for more than one name.
@chicoreus - how do you see the parsing of this with the pipes (|)?
In incomplete pseudocode:
elements = split(typeStatus, '|')
for each element in elements {
    if (first word in element not found in vocabulary) {
        compliantFlag = false
    }
}
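A runnable Python rendering of that pseudocode might look like the following. The function name `first_words_compliant`, the case-insensitive comparison, and treating an empty portion as non-compliant are assumptions that go beyond what the pseudocode states.

```python
def first_words_compliant(type_status, vocabulary):
    """Return True if the first word of every '|' portion is in vocabulary."""
    compliant = True
    for element in type_status.split("|"):
        words = element.split()
        # Assumption: an empty portion has no first word, so it is not compliant.
        if not words or words[0].lower() not in vocabulary:
            compliant = False
    return compliant


vocabulary = {"holotype", "lectotype"}  # hypothetical lowercase term list
print(first_words_compliant("holotype of Pinus abies | holotype of Picea abies", vocabulary))
# True
```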
I needed to add "" to pipe in the Expected Response for general interpretation