TG2-ISSUE_ANNOTATION_NOTEMPTY
| TestField | Value |
|---|---|
| GUID | fecaa8a3-bbd8-4c5a-a424-13c37c4bb7b1 |
| Label | ISSUE_ANNOTATION_NOTEMPTY |
| Description | Are there any annotations associated with the record? |
| TestType | Issue |
| Darwin Core Class | oa:target |
| Information Elements ActedUpon | |
| Information Elements Consulted | AllDarwinCoreTerms |
| Expected Response | EXTERNAL_PREREQUISITES_NOT_MET if the bdq:annotationSystem is not available; POTENTIAL_ISSUE if an annotation in the bdq:annotationSystem exists with a matching bdq:annotationAlertIf; otherwise NOT_ISSUE. |
| Data Quality Dimension | Reliability |
| Term-Actions | ANNOTATION_NOTEMPTY |
| Parameter(s) | bdq:annotationSystem, bdq:annotationAlertIf |
| Source Authority | bdq:annotationSystem default = "W3C Web Annotation" {[https://www.w3.org/annotation/]} {"oa:Annotation vocabulary" {[https://www.w3.org/TR/annotation-vocab/]}}; bdq:annotationAlertIf default = "oa:Annotation with oa:hasTarget having as object any dwciri:term instance that is part of the SingleRecord under test." {[https://www.w3.org/TR/annotation-vocab/]} |
| Specification Last Updated | 2023-09-18 |
| Examples | [bdq:annotationAlertIf="": Response.status=RUN_HAS_RESULT, Response.result=NOT_ISSUE, Response.comment="bdq:annotationAlertIf is bdq:Empty"]; [bdq:annotationAlertIf="?": Response.status=RUN_HAS_RESULT, Response.result=POTENTIAL_ISSUE, Response.comment="bdq:annotationAlertIf is bdq:NotEmpty"] |
| Source | ALA, Lee Belbin |
| References | |
| Example Implementations (Mechanisms) | |
| Link to Specification Source Code | |
| Notes | While there is a W3C standard on 'web annotation', there is no TDWG recommendation on how this standard could be applied to annotating Darwin Core records. While implementation of this test is currently problematic, TG2 considers annotations attached to any aspect of a Darwin Core record justifies this test as a placeholder in the hope of future developments. |
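
A minimal Python sketch of the Expected Response in the table above. The `Response` type, the function names, and the modelling of bdq:annotationSystem as an optional list of annotation dicts with a `target` key are all assumptions made for illustration; none of them come from the BDQ specification or an existing implementation.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class Response:
    status: str
    result: Optional[str]
    comment: str


def default_alert_if(annotation: dict, record_iris: set) -> bool:
    # Hypothetical stand-in for the default bdq:annotationAlertIf: alert when
    # the annotation targets any IRI belonging to the SingleRecord under test.
    return annotation.get("target") in record_iris


def issue_annotation_notempty(
    record_iris: Iterable[str],
    annotation_system: Optional[Iterable[dict]],
    alert_if: Callable[[dict, set], bool] = default_alert_if,
) -> Response:
    # None models (by assumption) an annotation system that is not available.
    if annotation_system is None:
        return Response("EXTERNAL_PREREQUISITES_NOT_MET", None,
                        "bdq:annotationSystem is not available")
    iris = set(record_iris)
    if any(alert_if(a, iris) for a in annotation_system):
        return Response("RUN_HAS_RESULT", "POTENTIAL_ISSUE",
                        "a matching annotation exists")
    return Response("RUN_HAS_RESULT", "NOT_ISSUE",
                    "no matching annotation found")
```

Passing a different `alert_if` callable is one way an implementer not using W3C annotations could supply their own matching rule.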
Comment by Lee Belbin (@Tasilee) migrated from spreadsheet: Paul reminded me that assertions are annotations, which reminded me that we needed a flag for any user annotations.
Comment by Paula Zermoglio (@pzermoglio) migrated from spreadsheet: I see two issues here: 1. annotations are not part of the record and are aggregator-dependent, and 2. an annotation does not necessarily mean something wrong was found in the record; it might be a corroboration of what's in the record, and hence it'd be tricky to assess data quality from a "has annot" approach.
Comment by Arthur Chapman (@ArthurChapman) migrated from spreadsheet: Worth noting that we are not just flagging something that is "wrong" but warning users that there is something there they may wish to take into account in deciding what records to use. This is also important for transferring from one aggregator to another and for feedback to custodians of the data.
Comment by Paul Morris (@chicoreus) migrated from spreadsheet: Feels like a plausible measure: the annotation store(s) of which the mechanism is aware contain one or more annotations where the occurrenceID is a subject of the annotation. Different mechanisms or the same mechanism with different configurations will produce different results.
Comment by Arthur Chapman (@ArthurChapman) migrated from spreadsheet: This is a worthwhile Annotation - but probably not a test that leads to an assertion. I am not sure how best to handle this in the terms of the tests and assertions
Comment by Lee Belbin (@Tasilee) migrated from spreadsheet: Sorry guys but this one IS MANDATORY - we are flagging if anyone has made an annotation about the record and we therefore have an obligation to expose that to all
To align with the response from other notifications, suggest changing to TG2-NOTIFICATION_ANNOTATION_NOTEMPTY
Help with thoughts around implementation please. Would I include an annotations property in a dataset to record any annotations, such as confirmations of out of range species records? What would that property be called? dataAnnotations, dataQualityAnnotations? Many thanks.
First up, all the tests/assertions are at the record-level. You simply accumulate record-level responses for any dataset. Regarding annotations, see https://github.com/tdwg/bdq/issues/142 and https://github.com/tdwg/bdq/issues/149 but we do have a problem right now, as @chicoreus has taken over chairing the TAG, leaving the leadership of the Annotations Interest Group vacant (https://www.tdwg.org/community/annotations/). I believe there was general agreement that TDWG would follow https://www.w3.org/annotation/.
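As a rough illustration of accumulating record-level responses for a dataset, here is a small Python sketch. The function name `summarize_dataset` and the use of plain result strings are assumptions for illustration only.

```python
from collections import Counter
from typing import Iterable


def summarize_dataset(record_results: Iterable[str]) -> Counter:
    """Tally record-level results (e.g. POTENTIAL_ISSUE, NOT_ISSUE) for a dataset."""
    return Counter(record_results)


# One result string per record; the dataset summary is just the tally.
summary = summarize_dataset(["NOT_ISSUE", "POTENTIAL_ISSUE", "NOT_ISSUE"])
```

Under this approach the dataset needs no annotations property of its own; the summary is derived from the per-record responses.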
The push for this 'assertion' comes from me given my experience with https://www.ala.org.au/uncategorised/annotations-alerts-about-new-annotations-and-annotations-of-interest/. My point was (as noted above): if a user has 'flagged' something about a record, I figure you would want to know about it (and any subsequent 'actions'). The hassle, of course, is that there is no standard on how annotations are handled. In the ALA's case, they are stripped off records when data is sent to GBIF.
@chicoreus : Any comment?
@Tasilee I have copied this to the Annotations GitHub (https://github.com/tdwg/annotations/issues/4#). This issue on the Annotations GitHub (Compile information on Feedback Mechanisms for Data Quality Interest Group), raised by @chicoreus, is meant to deal with the TG2 issues.
Thanks @ArthurChapman
I've started to look at testdata for this NOTIFICATION and have made some changes/assumptions. I changed LABEL "PRESENT" to match title of test ("NOTEMPTY") and wonder about
- The validity of Information elements "AllDarwinCoreTerms". I would probably now opt for something like bdq:annotation and add it to the vocabulary. We are a step ahead of the relevant standard but I can't see why we shouldn't be more explicit here.
- I'd opt for "NOTIFY if ..." rather than "REPORT if ..." in Expected response to match the title.
After zoom meeting, changed
POTENTIAL_ISSUE if annotations are not EMPTY; otherwise NO_ISSUE
to
POTENTIAL_ISSUE if annotations are not EMPTY; otherwise NOT_ISSUE
Are we really meaning "bdq:annotation" here? It obviously isn't "dwc:annotation" :).
@ArthurChapman and I are wondering why we would not have some Notes about this. I certainly understand the intent: If there is any annotation associated with the record, raise a flag....
Thinking about this, and @chicoreus can comment on implementation, but I am not sure how this test would be implemented. In the Test data we have things like "bdq:annotation="anyOldTerm"", but none of the databases we are checking will have a field called "bdq:annotation", will they? So what are we actually checking? I know what we are trying to do, but how does it work in practice? If the ALA adds an annotation, say on dwc:scientificName, it will be called something in the ALA database, but it WON'T be "bdq:annotation". Am I missing something here?
On Thu, 08 Jun 2023 15:06:46 -0700 Arthur Chapman @.***> wrote:
> Thinking about this and @chicoreus can comment on implementation - but I am not sure how this test would be implemented. In the Test data we have things like "bdq:annotation="anyOldTerm"" - but none of the databases we are checking will have a field called "bdq:annotation" will it? So what are we actually checking.
Given Darwin Core data in a triple store, does a w3c annotation exist where the occurrenceID of the record under evaluation is found as the target of the annotation....
Thus not likely to be able to be implemented for most settings Darwin Core data is found in.
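The check described above can be sketched in Python over an in-memory list of (subject, predicate, object) triples. This is an assumption-laden stand-in for a real triple store or SPARQL query: the `Triple` alias and the function `annotation_targets` are hypothetical, while the IRIs are those of the W3C Web Annotation vocabulary and RDF.

```python
from typing import Iterable, Tuple

Triple = Tuple[str, str, str]

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
OA_ANNOTATION = "http://www.w3.org/ns/oa#Annotation"
OA_HAS_TARGET = "http://www.w3.org/ns/oa#hasTarget"


def annotation_targets(triples: Iterable[Triple], occurrence_id: str) -> bool:
    """True if some oa:Annotation has oa:hasTarget equal to occurrence_id."""
    triples = list(triples)
    # Subjects that are explicitly typed as oa:Annotation.
    annotations = {s for s, p, o in triples
                   if p == RDF_TYPE and o == OA_ANNOTATION}
    return any(
        s in annotations and p == OA_HAS_TARGET and o == occurrence_id
        for s, p, o in triples
    )
```

Flat Darwin Core data outside a triple store has nothing equivalent to query, which is the implementation difficulty noted above.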
Let's discuss this on the next zoom (it has NEEDS WORK tag). My 5c worth is that we need to leave this test as CORE to encourage (prod?) Darwin Core and the Annotations Interest Group to figure how this test can be implemented. My experience with the ALA data (that has associated record annotations) strongly supports its value.
As above - this test has nothing to search for, and the examples are searching for "bdq:annotation" which won't be in any databases. The only way I can see this test working at the moment (after some rewording) is for the test to be Parameterized with the default parameter being the w3c "oa:annotation" - if the field includes something, then it returns POTENTIAL_ISSUE, but if it is EMPTY or not in the database then it would return NOT_ISSUE. Individual implementers not using w3c would add in whatever their data model uses for annotations. Gradually over time (especially if the TDWG Annotations IG gets going again) more and more databases may begin to follow the W3C Annotation Data Model (https://www.w3.org/TR/2014/WD-annotation-model-20141211/)
I have added a definition of "bdq:annotation" in the Vocabulary (#152) which would then fit with my suggestion above of making the test Parameterized
"Optionally establishes if an Annotation exists in a Parameterized Test (q.v.) with the default being the w3c Annotations Data Model's "oa:annotation" | Parameter | Used in test "ANNOTATION_ISSUE_NOTEMPTY" (fecaa8a3-bbd8-4c5a-a424-13c37c4bb7b1)."
Three cases to consider for what is needed in a parameter to identify relevant annotations:
- oa:annotation where the target is the dwc:occurrenceID
- dwc:resourceRelationship where the Darwin Core object has a resource relationship of a type that expresses a relationship to an annotation.
- ALA annotation related to the occurrence.
If we can work out how to parameterize these three cases we can probably generalize.
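One possible way to parameterize the three cases is a lookup from an annotation-system name to a matcher function, sketched below in Python. Everything here is hypothetical: the matcher names, the registry keys, the dict field names, the `"annotated by"` relationship value, and the `record_id` key for an aggregator store (the ALA's actual data model is not assumed here).

```python
from typing import Callable, Dict, List, Optional

Matcher = Callable[[Optional[str], List[dict]], bool]


def oa_match(occurrence_id: Optional[str], store: List[dict]) -> bool:
    # Case 1: an oa annotation whose target is the dwc:occurrenceID.
    return occurrence_id is not None and any(
        a.get("oa:hasTarget") == occurrence_id for a in store)


def relationship_match(occurrence_id: Optional[str], store: List[dict]) -> bool:
    # Case 2: a resource relationship whose type expresses an annotation.
    return occurrence_id is not None and any(
        r.get("dwc:resourceID") == occurrence_id
        and r.get("dwc:relationshipOfResource") == "annotated by"
        for r in store)


def aggregator_match(occurrence_id: Optional[str], store: List[dict]) -> bool:
    # Case 3: an aggregator-specific annotation keyed by a record identifier.
    return occurrence_id is not None and any(
        a.get("record_id") == occurrence_id for a in store)


MATCHERS: Dict[str, Matcher] = {
    "oa": oa_match,
    "resourceRelationship": relationship_match,
    "aggregator": aggregator_match,
}


def has_relevant_annotation(system: str, occurrence_id: Optional[str],
                            store: List[dict]) -> bool:
    return MATCHERS[system](occurrence_id, store)
```

The registry keeps the test logic the same across systems; only the matcher chosen by the parameter differs.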
I'm a bit worried about Occurrence as a target of an Annotation. Is our target really limited to that? Some of our tests can be applied perfectly well to Events and Taxa.
I agree @tucotuco - the original aim was to look for an annotation anywhere - not easy thing to do if institutions aren't using the w3c annotations.
@tucotuco I concur. Typical CORE case is flat darwin core, but can be structured, and oa:target that has as its object any dwciri term that is part of the current SingleRecord (here the generality of the framework helps us, as we don't have to define what a SingleRecord looks like) is applicable. That could be phrased: there exists an oa:annotation where the target is any dwciri term found within the SingleRecord. That might get us on a path to a generalization: there exists an annotation that has a relationship to a term in the SingleRecord.
So what do we do? Expected response? Notes? Vocab?
POTENTIAL_ISSUE if an oa:annotation is not EMPTY; otherwise NOT_ISSUE ?
Let's try two parameters: bdq:annotationSystem and bdq:annotationCriteria
Then an expected response:
POTENTIAL_ISSUE if there exists an annotation in bdq:annotationSystem where bdq:annotationCriteria are met for the SingleRecord under test; otherwise NOT_ISSUE
A value for bdq:annotationSystem = oa:Annotation
A value for bdq:annotationCriteria = an annotation has oa:target with, as its object, any dwciri term found within the current SingleRecord.
We could probably phrase a reference to bdq:annotationSystem for ALA, and a similar bdq:annotationCriteria that operates to select relevant ALA annotations.
Don't think the phrasing is right, but this might be a starting point for working out a generalization.
I'd prefer not mentioning SingleRecord in the Expected Response - we don't do it elsewhere, and for just this one test it would be good not to have >1 namespace term for the test if possible. Try this for a possible solution
As I suggested earlier if the test is Parameterized with bdq:annotation default = oa:annotation
POTENTIAL_ISSUE if bdq:annotation is not EMPTY indicating that an annotation exists in association with any dwciri term in the record; otherwise NOT_ISSUE
I like @ArthurChapman 's reasoning and suggestions. We would need to add dwciri reference/link and a concise Note on why we are doing this, and how we would at least like to see it work. As I am the 'pusher' on this 'test', I'm happy to write a sentence or two of plain English, but it will need @chicoreus and others to edit it into something else :)
BTW, I am still chasing ALA's current handling of annotations.
Perhaps we could get away from dwciri reference by saying
POTENTIAL_ISSUE if bdq:annotation is not EMPTY indicating that an annotation exists in association with any Darwin Core term in the record; otherwise NOT_ISSUE
If we cite dwciri, @tucotuco should be able to supply the best reference. Slide 8 in https://downloads.ctfassets.net/uo17ejk9rkwj/3Fg2Keqao8sMgKmwa0QWkC/1c8adb3f0658909cbaab2bb386361fca/Event_core_and_new_data_types_in_GBIF.pdf is worth looking at.
"The dwciri: namespace is intended for use exclusively with non-literal objects".
Is this something that should be added to the Vocabulary, or is it just part of the Darwin Core system?
@ArthurChapman "record" is something that makes sense in terms of flat darwin core, but has no meaning within the terms of the framework, and here, where we are talking about annotations and OA, we are talking semantic web, not flat objects, and the boundary of a "record" is undefined. We need here to talk about the framework concept SingleRecord, all the other tests are doing so, as their scope within the framework is explicitly SingleRecord, not MultiRecord, but here, unlike other SingleRecord tests, we do need to mention the SingleRecord to be able to ask the question whether some annotation is about the SingleRecord at hand (which could be a flat darwin core record, or a structured star schema set of records in several tables, or an RDF graph), not about some other SingleRecord in the MultiRecord under evaluation. The framework divides the world into these two things as a specific generalization to avoid being limited to particular data structures.
@ArthurChapman we do not want to get away from referring to the dwciri namespace, the dwciri namespace is the thing that makes it easy for us to define if an oa annotation has a relationship to the current SingleRecord. If we don't reference dwciri, we will have a very difficult time listing which darwin core term in which data structures may represent boundaries.
@ArthurChapman the reference to use for dwciri is https://dwc.tdwg.org/rdf/