TG2-AMENDMENT_BASISOFRECORD_STANDARDIZED

Open iDigBioBot opened this issue 7 years ago • 21 comments

| TestField | Value |
| --- | --- |
| GUID | 07c28ace-561a-476e-a9b9-3d5ad6e35933 |
| Label | AMENDMENT_BASISOFRECORD_STANDARDIZED |
| Description | Proposes an amendment to the value of dwc:basisOfRecord using the bdq:sourceAuthority. |
| TestType | Amendment |
| Darwin Core Class | Record-level |
| Information Elements ActedUpon | dwc:basisOfRecord |
| Information Elements Consulted | |
| Expected Response | EXTERNAL_PREREQUISITES_NOT_MET if the bdq:sourceAuthority is not available; INTERNAL_PREREQUISITES_NOT_MET if dwc:basisOfRecord is bdq:Empty; AMENDED the value of dwc:basisOfRecord if it could be unambiguously interpreted as a value in the bdq:sourceAuthority; otherwise NOT_AMENDED |
| Data Quality Dimension | Conformance |
| Term-Actions | BASISOFRECORD_STANDARDIZED |
| Parameter(s) | bdq:sourceAuthority |
| Source Authority | bdq:sourceAuthority default = "Darwin Core basisOfRecord" {[https://dwc.tdwg.org/terms/#dwc:basisOfRecord]} {dwc:basisOfRecord vocabulary [https://rs.gbif.org/vocabulary/dwc/basis_of_record.xml]} |
| Specification Last Updated | 2024-07-24 |
| Examples | [dwc:basisOfRecord="Human obs": Response.status=AMENDED, Response.result=dwc:basisOfRecord="HumanObservation", Response.comment="dwc:basisOfRecord contains interpretable value"] [dwc:basisOfRecord="FossilSpecimen": Response.status=NOT_AMENDED, Response.result="", Response.comment="dwc:basisOfRecord contains match in the bdq:sourceAuthority so NOT_AMENDED"] |
| Source | VertNet |
| References | |
| Example Implementations (Mechanisms) | |
| Link to Specification Source Code | |
| Notes | The term dwc:basisOfRecord has the comment "Recommended best practice is to use a controlled vocabulary such as the set of local names of the identifiers for classes in Darwin Core." The list of these values can be determined by searching https://github.com/tdwg/dwc/blob/master/vocabulary/term_versions.csv for rows with status="recommended" and rdf_type="http://www.w3.org/2000/01/rdf-schema#Class". For example, the term http://rs.tdwg.org/dwc/terms/PreservedSpecimen has a local name PreservedSpecimen. For tests against a dwc:Occurrence record, the set of valid terms is more limited and embodied in the resource found at https://rs.gbif.org/vocabulary/dwc/basis_of_record.xml, which contains the local name for the identifier, as well as preferred and alternate labels from which to standardize values. |
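
The Expected Response could be sketched roughly as follows. This is only an illustration, not a reference implementation: the function name, the `Response` class, and the hand-written `SAMPLE_AUTHORITY` mapping (including its alternate labels) are hypothetical, and a real implementation would load local names and labels from the bdq:sourceAuthority. Treating a whitespace-only value as bdq:Empty, and matching after dropping case and non-alphanumeric characters, are assumptions made for the sketch.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical subset of a source authority: local name -> alternate labels.
# These labels are illustrative and not copied from the GBIF vocabulary file.
SAMPLE_AUTHORITY: Dict[str, List[str]] = {
    "HumanObservation": ["Human Observation", "Human obs"],
    "FossilSpecimen": ["Fossil Specimen"],
    "PreservedSpecimen": ["Preserved Specimen"],
}


@dataclass
class Response:
    status: str
    result: str = ""   # proposed new value for dwc:basisOfRecord, if any
    comment: str = ""


def _normalize(value: str) -> str:
    """Lower-case and drop everything except letters and digits (an assumed matching rule)."""
    return "".join(ch for ch in value.lower() if ch.isalnum())


def amend_basis_of_record(
    value: Optional[str], authority: Optional[Dict[str, List[str]]]
) -> Response:
    if authority is None:
        return Response("EXTERNAL_PREREQUISITES_NOT_MET",
                        comment="bdq:sourceAuthority is not available")
    if value is None or value.strip() == "":
        return Response("INTERNAL_PREREQUISITES_NOT_MET",
                        comment="dwc:basisOfRecord is bdq:Empty")
    if value in authority:
        return Response("NOT_AMENDED",
                        comment="dwc:basisOfRecord contains match in the bdq:sourceAuthority")
    key = _normalize(value)
    candidates = {
        name for name, labels in authority.items()
        if key in {_normalize(name), *(_normalize(label) for label in labels)}
    }
    if len(candidates) == 1:  # amend only when the interpretation is unambiguous
        return Response("AMENDED", result=candidates.pop(),
                        comment="dwc:basisOfRecord contains interpretable value")
    return Response("NOT_AMENDED",
                    comment="dwc:basisOfRecord could not be unambiguously interpreted")
```

Under these assumptions, `amend_basis_of_record("Human obs", SAMPLE_AUTHORITY)` proposes `HumanObservation`, while an exact match such as `FossilSpecimen` is left as NOT_AMENDED.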

iDigBioBot avatar Jan 05 '18 15:01 iDigBioBot

Comment by Arthur Chapman (@ArthurChapman) migrated from spreadsheet: Should follow on after Line 57

iDigBioBot avatar Jan 05 '18 15:01 iDigBioBot

What is the case if an Institution has all its collection as one type of "dwc:basisOfRecord" (like everything is a "FossilSpecimen")? Is there a case then that, if the field is EMPTY, it can be populated from the source authority, which might have just one value for that institution ("FossilSpecimen")? Thus we would leave EMPTY out of INTERNAL_PREREQUISITES_NOT_MET.

ArthurChapman avatar May 06 '19 02:05 ArthurChapman

I would be a hard-ass. If every row is of the same type, it is trivial to provide the value. This is a record-level test, and we can not rely on metadata to get the information.

tucotuco avatar May 11 '19 01:05 tucotuco

I wasn't thinking of using metadata, but looking at an example where an institution was running the tests and could set their Parameter as just being one value. Otherwise why is it Parameterized? But I am happy either way.

ArthurChapman avatar May 11 '19 02:05 ArthurChapman

It is currently parametrized to provide a source authority against which to check.

tucotuco avatar May 11 '19 12:05 tucotuco

We have two levels related to 'source authority' - the authority itself (Parameter required) and the terms it contains (VOCABULARY)?

Except for #75, all tests that have 'VOCABULARY' also have 'Parameterized'. VOCABULARY is either Darwin Core, which I'd call internal as the tests have this as a foundation, or an external authority. Maybe we need to be explicit, as we are with the full specifications of the Expected Responses for annotations even when they have a corresponding validation. That is, do we need to specify Darwin Core as the source authority where relevant?

Am I rambling? It wouldn't be the first time.

Tasilee avatar May 12 '19 00:05 Tasilee

I do not see that issue #75 is or ever was parametrized.

Yes, the tests are designed to be used against concepts that match the definitions of the Darwin Core terms they reference, and so we should not have Darwin Core as an authority in any of our extant tests. However, "vocabularies of values" designed for use with Darwin Core (or indeed recommended to be used from the Darwin Core side) are not Darwin Core. I would say that these authorities always should be parametrized to decouple the tests from content that is much more mutable over time than the definitions of the Darwin Core terms.

tucotuco avatar May 13 '19 15:05 tucotuco

Changed Source Authority from

bdq:sourceAuthority default = "Darwin Core Terms" [https://dwc.tdwg.org/terms/#dwc:basisOfRecord]

to

bdq:sourceAuthority default = {Darwin Core} {Basis of record [https://dwc.tdwg.org/terms/#dwc:basisOfRecord] }

and removed bdq:sourceAuthority from Parameters (I presume, as there is no alternative vocab)?

Tasilee avatar Jun 30 '23 04:06 Tasilee

Amended Source Authority values to align with @chicoreus syntax

bdq:sourceAuthority default = {Darwin Core} {Basis of record [https://dwc.tdwg.org/terms/#dwc:basisOfRecord]}

to

bdq:sourceAuthority default = "Darwin Core dwc:basisOfRecord" {[https://dwc.tdwg.org/terms/#dwc:basisOfRecord]}

Tasilee avatar Jul 04 '23 23:07 Tasilee

Post Zoom 11/7/2023, I have aligned the Source Authority with the suggested syntax:

bdq:sourceAuthority default = "Darwin Core dwc:basisOfRecord" {[https://dwc.tdwg.org/terms/#dwc:basisOfRecord]}

to

bdq:sourceAuthority default = "Darwin Core" {https://dwc.tdwg.org/} {dwc:basisOfRecord [https://dwc.tdwg.org/terms/#dwc:basisOfRecord]}

Tasilee avatar Jul 11 '23 01:07 Tasilee

Due to recent discussions, changed Source Authority from

bdq:sourceAuthority default = "Darwin Core" {[https://dwc.tdwg.org/]} {dwc:basisOfRecord [https://dwc.tdwg.org/terms/#dwc:basisOfRecord]}

to

bdq:sourceAuthority default = "Darwin Core basisOfRecord" {[https://dwc.tdwg.org/terms/#dwc:basisOfRecord]} {Basis of record vocabulary [https://rs.gbif.org/vocabulary/dwc/basis_of_record.xml]}

Notes by @tucotuco required

Tasilee avatar Jul 16 '23 23:07 Tasilee

I missed the Parameter(s) (added) and the syntax on the vocabulary in Source Authority (done)

Tasilee avatar Jul 17 '23 00:07 Tasilee

Updated comment from blank to

"The term dwc:basisOfRecord has the comment "Recommended best practice is to use the standard label of one of the Darwin Core classes." The list of these values can be determined by searching https://github.com/tdwg/dwc/blob/master/vocabulary/term_versions.csv for rows with status="recommended" and rdf_type="http://www.w3.org/2000/01/rdf-schema#Class". For tests against a dwc:Occurrence record, the set of valid terms is more limited and embodied in the resource found at https://rs.gbif.org/vocabulary/dwc/basis_of_record.xml, which contains both preferred labels and alternate labels from which to standardize values. This test will fail if there is leading or trailing whitespace or there are leading or trailing non-printing characters."
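
The row filter described in this note could be sketched as below. This is a minimal illustration under stated assumptions: the column name `term_iri` is assumed (the note only names `status` and `rdf_type`), the function name is hypothetical, the local name is taken to be the final path segment of the IRI, and the sample rows are made up for the example rather than copied from term_versions.csv.

```python
import csv
import io
from typing import List

RDFS_CLASS = "http://www.w3.org/2000/01/rdf-schema#Class"

# Made-up sample rows; the real file has more columns and many more rows.
SAMPLE_CSV = """term_iri,status,rdf_type
http://rs.tdwg.org/dwc/terms/PreservedSpecimen,recommended,http://www.w3.org/2000/01/rdf-schema#Class
http://rs.tdwg.org/dwc/terms/FossilSpecimen,recommended,http://www.w3.org/2000/01/rdf-schema#Class
http://rs.tdwg.org/dwc/terms/basisOfRecord,recommended,http://www.w3.org/1999/02/22-rdf-syntax-ns#Property
http://rs.tdwg.org/dwc/terms/ExampleOldClass,deprecated,http://www.w3.org/2000/01/rdf-schema#Class
"""


def recommended_class_local_names(csv_text: str) -> List[str]:
    """Return sorted local names of rows that are recommended rdfs:Class terms."""
    names = set()
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["status"] == "recommended" and row["rdf_type"] == RDFS_CLASS:
            # Assumption: the local name is the last path segment of the term IRI.
            names.add(row["term_iri"].rsplit("/", 1)[-1])
    return sorted(names)


print(recommended_class_local_names(SAMPLE_CSV))
```

With the sample rows above, only the two recommended class rows survive the filter; the property row and the deprecated row are dropped.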

tucotuco avatar Jul 17 '23 01:07 tucotuco

Splitting bdqffdq:Information Elements into "Information Elements ActedUpon" and "Information Elements Consulted".

Also changed "Field" to "TestField", "Output Type" to "TestType" and updated "Specification Last Updated"

Tasilee avatar Sep 18 '23 00:09 Tasilee

Updated note to remove evident copy/paste error of fail on whitespace text. Leading or trailing whitespace is one condition this amendment should be able to propose a correction for.

chicoreus avatar Feb 23 '24 20:02 chicoreus

Note that the labels contain spaces, e.g. "Preserved Specimen", not "PreservedSpecimen".

Updating the examples from:

[dwc:basisOfRecord="Human obs": Response.status=AMENDED, Response.result=dwc:basisOfRecord="HumanObservation", Response.comment="dwc:basisOfRecord contains interpretable value"]

[dwc:basisOfRecord="FossilSpecimen": Response.status=NOT_AMENDED, Response.result="", Response.comment="dwc:basisOfRecord contains match in bdq:sourceAuthority so NOT_AMENDED"]

to

[dwc:basisOfRecord="Human obs": Response.status=AMENDED, Response.result=dwc:basisOfRecord="Human Observation", Response.comment="dwc:basisOfRecord contains interpretable value"]

[dwc:basisOfRecord="Fossil Specimen": Response.status=NOT_AMENDED, Response.result="", Response.comment="dwc:basisOfRecord contains match in bdq:sourceAuthority so NOT_AMENDED"]

Validation data for dataID rows 438, 439, 440, 441, 442, 443, 444, 445, and 446 need to be examined, and at least 443-446 need to be corrected to reflect spaces in the labels.

chicoreus avatar Jul 24 '24 12:07 chicoreus

Added an example to the note.

Needs Work label currently applies to the validation data rather than the specification.

chicoreus avatar Jul 24 '24 13:07 chicoreus

I don't agree with this one. The term names are the standard (HumanObservation), not their labels. From https://dwc.tdwg.org/terms/#dwc:basisOfRecord: "Recommended best practice is to use a controlled vocabulary such as the set of local names of the identifiers for classes in Darwin Core." Examples: HumanObservation

tucotuco avatar Jul 24 '24 13:07 tucotuco

@tucotuco Good. I like local names better. Feels like it fits better with more people's practices. Looks like the Darwin Core term recommendation for best practice has changed. On July 16, 2023, you had added the note with the text: "The term dwc:basisOfRecord has the comment 'Recommended best practice is to use the standard label of one of the Darwin Core classes.'"

I'd be very in favor of changing the test note and examples and keeping the validation data with the local names (without spaces).

chicoreus avatar Jul 24 '24 15:07 chicoreus

Updated comment and examples accordingly.

chicoreus avatar Jul 24 '24 15:07 chicoreus

I have changed the relevant Test Data records and added a new one. Is NEEDS WORK still needed on this?

Tasilee avatar Aug 02 '24 00:08 Tasilee