TG2-ISSUE_ANNOTATION_NOTEMPTY

Open iDigBioBot opened this issue 7 years ago • 60 comments

| TestField | Value |
|:--|:--|
| GUID | fecaa8a3-bbd8-4c5a-a424-13c37c4bb7b1 |
| Label | ISSUE_ANNOTATION_NOTEMPTY |
| Description | Are there any annotations associated with the record? |
| TestType | Issue |
| Darwin Core Class | oa:target |
| Information Elements ActedUpon | |
| Information Elements Consulted | AllDarwinCoreTerms |
| Expected Response | EXTERNAL_PREREQUISITES_NOT_MET if the bdq:annotationSystem is not available; POTENTIAL_ISSUE if an annotation in the bdq:annotationSystem exists with a matching bdq:annotationAlertIf; otherwise NOT_ISSUE. |
| Data Quality Dimension | Reliability |
| Term-Actions | ANNOTATION_NOTEMPTY |
| Parameter(s) | bdq:annotationSystem |
| | bdq:annotationAlertIf |
| Source Authority | bdq:annotationSystem default = "W3C Web Annotation" {[https://www.w3.org/annotation/]} {"oa:Annotation vocabulary" {[https://www.w3.org/TR/annotation-vocab/]} |
| | bdq:annotationAlertIf default = "oa:Annotation with oa:hasTarget having as object any dwciri:term instance that is part of the SingleRecord under test." {[https://www.w3.org/TR/annotation-vocab/]} |
| Specification Last Updated | 2023-09-18 |
| Examples | [bdq:annotationAlertIf="": Response.status=RUN_HAS_RESULT, Response.result=NOT_ISSUE, Response.comment="bdq:annotationAlertIf is bdq:Empty"] |
| | [bdq:annotationAlertIf="?": Response.status=RUN_HAS_RESULT, Response.result=POTENTIAL_ISSUE, Response.comment="bdq:annotationAlertIf is bdq:NotEmpty"] |
| Source | ALA, Lee Belbin |
| References | • W3C (2017) Web Annotation Data Model. https://www.w3.org/TR/annotation-model/ |
| | • W3C (2017) Web Annotation Data Model: Annotation. https://www.w3.org/TR/annotation-vocab/#annotation |
| | • Biodiversity Information Standards (TDWG) (n.dat) Annotations Interest Group. https://www.tdwg.org/community/annotations/ |
| Example Implementations (Mechanisms) | |
| Link to Specification Source Code | |
| Notes | While there is a W3C standard on 'web annotation', there is no TDWG recommendation on how this standard could be applied to annotating Darwin Core records. While implementation of this test is currently problematic, TG2 considers annotations attached to any aspect of a Darwin Core record justifies this test as a placeholder in the hope of future developments. |
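
The Expected Response above can be read as a three-way decision. The following is a minimal Python sketch of that reading, not an implementation from the specification. The function name, the representation of bdq:annotationSystem as an iterable (or `None` when unavailable), and the `alert_if` predicate standing in for bdq:annotationAlertIf are all assumptions made for illustration.

```python
from typing import Callable, Iterable, Optional

# Hypothetical stand-in: the specification does not define a Python API.
Annotation = dict  # e.g. {"target": "urn:occ:1"}


def issue_annotation_notempty(
    annotation_system: Optional[Iterable[Annotation]],
    alert_if: Callable[[Annotation], bool],
) -> dict:
    """Return a response dict following the Expected Response wording."""
    if annotation_system is None:
        # bdq:annotationSystem is not available.
        return {"status": "EXTERNAL_PREREQUISITES_NOT_MET", "result": None}
    if any(alert_if(a) for a in annotation_system):
        # At least one annotation matches the (assumed) alert criteria.
        return {"status": "RUN_HAS_RESULT", "result": "POTENTIAL_ISSUE"}
    return {"status": "RUN_HAS_RESULT", "result": "NOT_ISSUE"}
```

An empty annotation system is treated here as available but containing no matching annotation, so it yields NOT_ISSUE rather than EXTERNAL_PREREQUISITES_NOT_MET; that choice is also an assumption.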

iDigBioBot avatar Jan 05 '18 15:01 iDigBioBot

Comment by Lee Belbin (@Tasilee) migrated from spreadsheet: Paul reminded me that assertions are annotations, which reminded me that we needed a flag for any user annotations

iDigBioBot avatar Jan 12 '18 16:01 iDigBioBot

Comment by Paula Zermoglio (@pzermoglio) migrated from spreadsheet: I see two issues here: 1. annotations are not part of the record and are aggregator-dependent, and 2. the annotation does not necessarily mean something wrong was found in the record, it might be a corroboration of what's in the record, and hence it'd be tricky to assess data quality from a "has annot" approach.

iDigBioBot avatar Jan 12 '18 16:01 iDigBioBot

Comment by Arthur Chapman (@ArthurChapman) migrated from spreadsheet: Worth noting that we are not just flagging something that is "wrong" but warning users that there is something there they may wish to take into account in deciding what records to use. Also important for transferring from one aggregator to another and for feedback to custodians of the data.

iDigBioBot avatar Jan 12 '18 16:01 iDigBioBot

Comment by Paul Morris (@chicoreus) migrated from spreadsheet: Feels like a plausible measure: the annotation store(s) of which the mechanism is aware contain one or more annotations where the occurrenceID is a subject of the annotation. Different mechanisms or the same mechanism with different configurations will produce different results.

iDigBioBot avatar Jan 12 '18 16:01 iDigBioBot

Comment by Arthur Chapman (@ArthurChapman) migrated from spreadsheet: This is a worthwhile Annotation - but probably not a test that leads to an assertion. I am not sure how best to handle this in the terms of the tests and assertions

iDigBioBot avatar Jan 12 '18 16:01 iDigBioBot

Comment by Lee Belbin (@Tasilee) migrated from spreadsheet: Sorry guys but this one IS MANDATORY - we are flagging if anyone has made an annotation about the record and we therefore have an obligation to expose that to all

iDigBioBot avatar Jan 12 '18 16:01 iDigBioBot

To align with the response from other notifications, suggest changing to TG2-NOTIFICATION_ANNOTATION_NOTEMPTY

Tasilee avatar Feb 11 '18 23:02 Tasilee

Help with thoughts around implementation please. Would I include an annotations property in a dataset to record any annotations, such as confirmations of out of range species records? What would that property be called? dataAnnotations, dataQualityAnnotations? Many thanks.

ianengelbrecht avatar Mar 12 '19 11:03 ianengelbrecht

First up, all the tests/assertions are at the record-level. You simply accumulate record-level responses for any dataset. Regarding annotations, see https://github.com/tdwg/bdq/issues/142 and https://github.com/tdwg/bdq/issues/149 but we do have a problem right now as @chicoreus has taken over chairing the TAG, leaving the leadership of the Annotations Interest Group vacant (https://www.tdwg.org/community/annotations/). I believe there was general agreement that TDWG would follow https://www.w3.org/annotation/.

The push for this 'assertion' comes from me given my experience with https://www.ala.org.au/uncategorised/annotations-alerts-about-new-annotations-and-annotations-of-interest/. My point was (as noted above) - if a user has 'flagged' something about a record, I figure you would want to know about it (and any subsequent 'actions'). The hassle is of course, that there is no standard on how annotations are handled. In the ALA's case, they are stripped off records when data is sent to GBIF.

@chicoreus : Any comment?

Tasilee avatar Mar 14 '19 04:03 Tasilee

@Tasilee I have copied this to the Annotations GitHub (https://github.com/tdwg/annotations/issues/4#). That issue on the Annotations GitHub (Compile information on Feedback Mechanisms for Data Quality Interest Group), raised by @chicoreus, is to deal with the TG2 issues

ArthurChapman avatar Mar 27 '19 13:03 ArthurChapman

Thanks @ArthurChapman

Tasilee avatar Mar 27 '19 20:03 Tasilee

I've started to look at testdata for this NOTIFICATION and have made some changes/assumptions. I changed LABEL "PRESENT" to match title of test ("NOTEMPTY") and wonder about

  1. The validity of Information elements "AllDarwinCoreTerms". I would probably now opt for something like bdq:annotation and add it to the vocabulary. We are a step ahead of the relevant standard but I can't see why we shouldn't be more explicit here.
  2. I'd opt for "NOTIFY if ..." rather than "REPORT if ..." in Expected response to match the title.

Tasilee avatar Oct 06 '20 22:10 Tasilee

After zoom meeting, changed

POTENTIAL_ISSUE if annotations are not EMPTY; otherwise NO_ISSUE

to

POTENTIAL_ISSUE if annotations are not EMPTY; otherwise NOT_ISSUE

Tasilee avatar May 15 '22 22:05 Tasilee

Are we really meaning "bdq:annotation" here? It obviously isn't "dwc:annotation" :).

@ArthurChapman and I are wondering why we would not have some Notes about this. I certainly understand the intent: If there is any annotation associated with the record, raise a flag....

Tasilee avatar Jun 08 '23 05:06 Tasilee

Thinking about this and @chicoreus can comment on implementation - but I am not sure how this test would be implemented. In the Test data we have things like "bdq:annotation="anyOldTerm"" - but none of the databases we are checking will have a field called "bdq:annotation", will they? So what are we actually checking? I know what we are trying to do - but how does it work in practice? If the ALA adds an annotation, say on dwc:scientificName - it will be called something in the ALA database, but it WON'T be "bdq:annotation". Am I missing something here?

ArthurChapman avatar Jun 08 '23 22:06 ArthurChapman

On Thu, 08 Jun 2023 15:06:46 -0700 Arthur Chapman wrote:

Thinking about this and @chicoreus can comment on implementation - but I am not sure how this test would be implemented. In the Test data we have things like "bdq:annotation="anyOldTerm"" - but none of the databases we are checking will have a field called "bdq:annotation" will it? So what are we actually checking.

Given Darwin Core data in a triple store, does a w3c annotation exist where the occurrenceID of the record under evaluation is found as the target of the annotation....

Thus not likely to be able to be implemented for most settings Darwin Core data is found in.
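
The question posed above can be sketched over an in-memory set of (subject, predicate, object) triples. This is only an illustration: a real triple store would normally be queried with SPARQL, and the helper name `has_annotation_targeting` and the example identifiers are assumptions, not part of any existing implementation.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
OA_ANNOTATION = "http://www.w3.org/ns/oa#Annotation"
OA_HAS_TARGET = "http://www.w3.org/ns/oa#hasTarget"


def has_annotation_targeting(triples, occurrence_id):
    """True if some oa:Annotation has occurrence_id as its oa:hasTarget."""
    # Collect every subject that is typed as an oa:Annotation.
    annotations = {s for (s, p, o) in triples if p == RDF_TYPE and o == OA_ANNOTATION}
    # Look for a hasTarget statement from one of those annotations.
    return any(
        s in annotations and p == OA_HAS_TARGET and o == occurrence_id
        for (s, p, o) in triples
    )


# Hypothetical example data.
triples = {
    ("urn:anno:1", RDF_TYPE, OA_ANNOTATION),
    ("urn:anno:1", OA_HAS_TARGET, "urn:catalog:example:occ-1"),
}
```

With this data, `has_annotation_targeting(triples, "urn:catalog:example:occ-1")` holds, while a record whose occurrenceID is not the target of any annotation does not match.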

chicoreus avatar Jun 08 '23 22:06 chicoreus

Let's discuss this on the next zoom (it has NEEDS WORK tag). My 5c worth is that we need to leave this test as CORE to encourage (prod?) Darwin Core and the Annotations Interest Group to figure how this test can be implemented. My experience with the ALA data (that has associated record annotations) strongly supports its value.

Tasilee avatar Jun 08 '23 22:06 Tasilee

As above - this test has nothing to search for, and the examples are searching for "bdq:annotation" which won't be in any databases. The only way I can see this test working at the moment (after some rewording) is for the test to be Parameterized with the default parameter being the w3c "oa:annotation" - if the field includes something, then it returns POTENTIAL_ISSUE, but if it is EMPTY or not in the database then it would return NOT_ISSUE. Individual implementers not using w3c would add in whatever their data model uses for annotations. Gradually over time (especially if the TDWG Annotations IG gets going again) more and more databases may begin to follow the W3C Annotation Data Model (https://www.w3.org/TR/2014/WD-annotation-model-20141211/)
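
A minimal sketch of this parameterized suggestion, assuming a flat record represented as a Python dict. The function name and the `annotation_field` parameter are hypothetical; the default value mirrors the "oa:annotation" default proposed above.

```python
def annotation_field_notempty(record, annotation_field="oa:annotation"):
    """POTENTIAL_ISSUE if the configured field holds a non-blank value."""
    value = record.get(annotation_field)
    if value is None or str(value).strip() == "":
        # Field absent from the data, or present but EMPTY.
        return "NOT_ISSUE"
    return "POTENTIAL_ISSUE"
```

An implementer whose data model uses a different field would pass its name as the parameter, for example `annotation_field_notempty(record, annotation_field="my:userAssertion")`, where `my:userAssertion` is an invented field name.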

ArthurChapman avatar Jun 08 '23 23:06 ArthurChapman

I have added a definition of "bdq:annotation" in the Vocabulary (#152) which would then fit with my suggestion above of making the test Parameterized

"Optionally establishes if an Annotation exists in a Parameterized Test (q.v.) with the default being the w3c Annotations Data Model's "oa:annotation" | Parameter | Used in test "ANNOTATION_ISSUE_NOTEMPTY" (fecaa8a3-bbd8-4c5a-a424-13c37c4bb7b1)."

ArthurChapman avatar Jun 08 '23 23:06 ArthurChapman

Three cases to consider for what is needed in a parameter to identify relevant annotations:

  1. oa:annotation where the target is the dwc:occurrenceID.
  2. dwc:resourceRelationship where the Darwin Core object has a resource relationship of a type that expresses a relationship to an annotation.
  3. ALA annotation related to the occurrence.

If we can work out how to parameterize these three cases we can probably generalize.

chicoreus avatar Jun 12 '23 21:06 chicoreus

I'm a bit worried about Occurrence as a target of an Annotation. Is our target really limited to that? Some of our tests can be applied perfectly well to Events and Taxa.

tucotuco avatar Jun 14 '23 21:06 tucotuco

I agree @tucotuco - the original aim was to look for an annotation anywhere - not an easy thing to do if institutions aren't using the w3c annotations.

ArthurChapman avatar Jun 14 '23 21:06 ArthurChapman

@tucotuco I concur. Typical CORE case is flat darwin core, but can be structured, and oa:target that has as its object any dwciri term that is part of the current SingleRecord (here the generality of the framework helps us, as we don't have to define what a SingleRecord looks like) is applicable. That could be phrased: there exists an oa:annotation where the target is any dwciri term found within the SingleRecord. That might get us on a path to a generalization: there exists an annotation that has a relationship to a term in the SingleRecord.
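
One possible reading of "the target is any dwciri term found within the SingleRecord" can be sketched as follows, with the SingleRecord and the annotations both represented as sets of (subject, predicate, object) triples. The function names and example IRIs are assumptions, and treating "dwciri term" as "an IRI that is the object of a dwciri: property" is an interpretation rather than an agreed definition.

```python
DWCIRI = "http://rs.tdwg.org/dwc/iri/"
OA_HAS_TARGET = "http://www.w3.org/ns/oa#hasTarget"


def dwciri_objects(single_record):
    """IRIs that appear as objects of dwciri: properties in the SingleRecord."""
    return {o for (_s, p, o) in single_record if p.startswith(DWCIRI)}


def annotated(single_record, annotation_triples):
    """True if any annotation targets an IRI found via dwciri: in the record."""
    candidates = dwciri_objects(single_record)
    return any(
        p == OA_HAS_TARGET and o in candidates
        for (_s, p, o) in annotation_triples
    )


# Hypothetical example data.
single_record = {
    ("urn:occ:1", DWCIRI + "recordedBy", "https://example.org/agent/1"),
}
annotation_triples = {
    ("urn:anno:1", OA_HAS_TARGET, "https://example.org/agent/1"),
}
```

Because the check only looks at the triples that make up the SingleRecord, it does not need to know whether those triples came from a flat record, a star schema, or an RDF graph.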

chicoreus avatar Jun 17 '23 02:06 chicoreus

So what do we do? Expected response? Notes? Vocab?

POTENTIAL_ISSUE if an oa:annotation is not EMPTY; otherwise NOT_ISSUE ?

Tasilee avatar Jun 18 '23 03:06 Tasilee

Let's try two parameters: bdq:annotationSystem and bdq:annotationCriteria

Then an expected response:

POTENTIAL_ISSUE if there exists an annotation in bdq:annotationSystem where bdq:annotationCriteria are met for the SingleRecord under test; otherwise NOT_ISSUE.

A value for bdq:annotationSystem = oa:Annotation

A value for bdq:annotationCriteria = an annotation has oa:target with, as its object, any dwciri term found within the current SingleRecord.

We could probably phrase a reference to bdq:annotationSystem for ALA, and a similar bdq:annotationCriteria that operates to select relevant ALA annotations.

Don't think the phrasing is right, but this might be a starting point for working out a generalization.

chicoreus avatar Jun 18 '23 13:06 chicoreus

I'd prefer not mentioning singleRecords in the Expected Response - we don't do it elsewhere, and for just this one test it would be good not to have >1 namespace term for the test if possible. Try this for a possible solution

As I suggested earlier if the test is Parameterized with bdq:annotation default = oa:annotation

POTENTIAL_ISSUE if bdq:annotation is not EMPTY indicating that an annotation exists in association with any dwciri term in the record; otherwise NOT_ISSUE

ArthurChapman avatar Jun 18 '23 21:06 ArthurChapman

I like @ArthurChapman 's reasoning and suggestions. We would need to add dwciri reference/link and a concise Note on why we are doing this, and how we would at least like to see it work. As I am the 'pusher' on this 'test', I'm happy to write a sentence or two of plain English, but it will need @chicoreus and others to edit it into something else :)

BTW, I am still chasing ALA's current handling of annotations.

Tasilee avatar Jun 18 '23 22:06 Tasilee

Perhaps we could get away from dwciri reference by saying

POTENTIAL_ISSUE if bdq:annotation is not EMPTY indicating that an annotation exists in association with any Darwin Core term in the record; otherwise NOT_ISSUE

If we cite dwciri, @tucotuco should be able to supply the best reference. Slide 8 in https://downloads.ctfassets.net/uo17ejk9rkwj/3Fg2Keqao8sMgKmwa0QWkC/1c8adb3f0658909cbaab2bb386361fca/Event_core_and_new_data_types_in_GBIF.pdf is worth looking at.

"The dwciri: namespace is intended for use exclusively with non-literal objects".

Is this something that should be added to the Vocabulary - or is it just part of the Darwin Core system?

ArthurChapman avatar Jun 18 '23 22:06 ArthurChapman

@ArthurChapman "record" is something that makes sense in terms of flat Darwin Core, but has no meaning within the terms of the framework. Here, where we are talking about annotations and OA, we are talking semantic web, not flat objects, and the boundary of a "record" is undefined. We need here to talk about the framework concept SingleRecord. All the other tests are doing so, as their scope within the framework is explicitly SingleRecord, not MultiRecord. But here, unlike other SingleRecord tests, we do need to mention the SingleRecord to be able to ask whether some annotation is about the SingleRecord at hand (which could be a flat Darwin Core record, a structured star schema set of records in several tables, or an RDF graph), and not about some other SingleRecord in the MultiRecord under evaluation. The framework divides the world into these two things as a specific generalization to avoid being limited to particular data structures.

@ArthurChapman we do not want to get away from referring to the dwciri namespace, the dwciri namespace is the thing that makes it easy for us to define if an oa annotation has a relationship to the current SingleRecord. If we don't reference dwciri, we will have a very difficult time listing which darwin core term in which data structures may represent boundaries.

chicoreus avatar Jun 18 '23 23:06 chicoreus

@ArthurChapman the reference to use for dwciri is https://dwc.tdwg.org/rdf/

chicoreus avatar Jun 18 '23 23:06 chicoreus