TG2-ISSUE_DATAGENERALIZATIONS_NOTEMPTY

Open iDigBioBot opened this issue 7 years ago • 20 comments

TestField Value
GUID 13d5a10e-188e-40fd-a22c-dbaa87b91df2
Label ISSUE_DATAGENERALIZATIONS_NOTEMPTY
Description Is there a value in dwc:dataGeneralizations?
TestType Issue
Darwin Core Class Record-level
Information Elements ActedUpon dwc:dataGeneralizations
Information Elements Consulted
Expected Response POTENTIAL_ISSUE if dwc:dataGeneralizations is bdq:NotEmpty; otherwise NOT_ISSUE
Data Quality Dimension Resolution
Term-Actions DATAGENERALIZATIONS_NOTEMPTY
Parameter(s)
Source Authority
Specification Last Updated 2023-09-18
Examples [dwc:dataGeneralizations="placed on quarter degree grid": Response.status=RUN_HAS_RESULT, Response.result=POTENTIAL_ISSUE, Response.comment="dwc:dataGeneralizations is bdq:NotEmpty"]
[dwc:dataGeneralizations="": Response.status=RUN_HAS_RESULT, Response.result=NOT_ISSUE, Response.comment="dwc:dataGeneralizations is bdq:Empty"]
Source ALA
References
  • Chapman AD (2020) Current Best Practices for Generalizing Sensitive Species Occurrence Data. Copenhagen: GBIF Secretariat. https://doi.org/10.15468/doc-5jp4-5g10.
  • Chapman AD and Wieczorek JR (2020) Georeferencing Best Practices. Copenhagen: GBIF Secretariat. https://doi.org/10.15468/doc-gg7h-s853
Example Implementations (Mechanisms)
Link to Specification Source Code
Notes This is not specific to spatial data: any value in the dwc:dataGeneralizations field will cause this flag to be raised. The primary use case, however, is expected to be a dwc:dataGeneralizations value that indicates obfuscated locations.
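
The Expected Response and the two Examples above can be sketched in Python as follows. This is only an illustrative sketch, not a reference implementation: the `Response` class, the `is_empty` helper and the function name are hypothetical, and the sketch assumes that bdq:Empty covers a missing value, an empty string, or a string containing only whitespace.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Response:
    """Hypothetical container mirroring Response.status/result/comment."""
    status: str
    result: str
    comment: str


def is_empty(value: Optional[str]) -> bool:
    # Assumption: bdq:Empty means absent, zero-length, or whitespace-only.
    return value is None or value.strip() == ""


def issue_datageneralizations_notempty(data_generalizations: Optional[str]) -> Response:
    """Flag a POTENTIAL_ISSUE when dwc:dataGeneralizations holds any value."""
    if is_empty(data_generalizations):
        return Response(
            status="RUN_HAS_RESULT",
            result="NOT_ISSUE",
            comment="dwc:dataGeneralizations is bdq:Empty",
        )
    return Response(
        status="RUN_HAS_RESULT",
        result="POTENTIAL_ISSUE",
        comment="dwc:dataGeneralizations is bdq:NotEmpty",
    )
```

Called with "placed on quarter degree grid", the sketch returns a POTENTIAL_ISSUE result; called with an empty string, it returns NOT_ISSUE. The value is never interpreted, which matches the note that the test is not specific to spatial data.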

iDigBioBot avatar Jan 05 '18 15:01 iDigBioBot

Comment by Paula Zermoglio (@pzermoglio) migrated from spreadsheet: Data generalizations can apply to non-geographic data; maybe the description of this test could be changed to match a more general approach, more in line with the name of the test.

iDigBioBot avatar Jan 05 '18 15:01 iDigBioBot

Comment by Arthur Chapman (@ArthurChapman) migrated from spreadsheet: Agreed @PJM - but it is important to include a test for this - Definitely a Core test

iDigBioBot avatar Jan 12 '18 16:01 iDigBioBot

Comment by Arthur Chapman (@ArthurChapman) migrated from spreadsheet: Perhaps this needs to be split into several - e.g. LOCALITY_GENERALIZED, DATE_GENERALIZED, ?NAME_GENERALIZED - does that ever happen?

iDigBioBot avatar Jan 12 '18 16:01 iDigBioBot

Comment by Arthur Chapman (@ArthurChapman) migrated from spreadsheet: I believe that this should be altered to a NOTIFICATION rather than a VALIDATION

iDigBioBot avatar Jan 12 '18 16:01 iDigBioBot

I can see a case for NAME (higher taxonomy) and SPACE (lower spatial resolution), but do we have any precedent or examples for TIME? I can't think of any.

Tasilee avatar Aug 14 '18 01:08 Tasilee

Maybe someone doesn't want people to know where they were on a particular day? In our data sensitivity document I don't think we considered TIME. But there may be some who don't want it known what days certain animals hatch - turtles on a particular beach, for example. Not sure, but if it is the one test, does it hurt to include it?

ArthurChapman avatar Aug 14 '18 02:08 ArthurChapman

I'm ok with leaving TIME in and I'll be interested to hear other comments.

Tasilee avatar Aug 14 '18 02:08 Tasilee

I would keep TIME but I'm not sure about mixing the 3 under the same test. But since there is only 1 Dwc term, maybe that's enough for now.

cgendreau avatar Aug 14 '18 15:08 cgendreau

In this record, the Dimension should be "Other" and not "Space, Time, Name". Although it does refer to data generalizations in those three "Dimensions", the actual dwc Element it refers to (dwc:dataGeneralizations) falls into the "Other" category. See the separate Issue I am posting shortly.

ArthurChapman avatar Sep 06 '18 04:09 ArthurChapman

My feeling is that we are using Dimension as a way of summarising the coverage of the tests, so while the term dwc:dataGeneralizations could be considered agnostic based on http://rs.tdwg.org/dwc/terms/#dataGeneralizations, we agreed to specifically allow this test to cover name, space and time. I find that a more useful strategy - and this is an issue for the Darwin Core Maintenance Group to review.

Tasilee avatar Sep 07 '18 00:09 Tasilee

@Tasilee What needs review from the Darwin Core Maintenance Group?

tucotuco avatar Sep 07 '18 04:09 tucotuco

@tucotuco: I was wondering if the recent acceptance of dwc:dataGeneralisations applying to name and time, beyond space, would benefit from being more explicit? I also seem to remember that we did have a few (non-GitHub) issues from Gainesville for DwC? Chasing...

Tasilee avatar Sep 07 '18 04:09 Tasilee

I think you mean dwc:dataGeneralizations, Lee

ArthurChapman avatar Sep 07 '18 06:09 ArthurChapman

yes, tired. @ArthurChapman do you remember any other issues from TG2 for Dwc?

Tasilee avatar Sep 07 '18 06:09 Tasilee

Not off hand. I am sure @tucotuco made notes in Gainesville of anything relevant to DwC

ArthurChapman avatar Sep 07 '18 06:09 ArthurChapman

Darwin Core does not have these concepts of Dimension, nor is dwc:dataGeneralizations limited to those three data quality concepts. I see the data quality standard as a layer applied on top of Darwin Core (or other standards where the terms are equivalent), and that Darwin Core itself must remain defined independently of that layer.

tucotuco avatar Sep 18 '18 14:09 tucotuco

As per meeting 21st March, NOTIFICATIONs will now be ISSUEs.

Tasilee avatar Mar 21 '22 00:03 Tasilee

After zoom meeting, changed

POTENTIAL_ISSUE if dwc:dataGeneralizations is not EMPTY; otherwise NO_ISSUE

to

POTENTIAL_ISSUE if dwc:dataGeneralizations is not EMPTY; otherwise NOT_ISSUE

Tasilee avatar May 15 '22 22:05 Tasilee

From 1.5 About the tests, their use and specifications (Informative) (Lee)

  1. Value is the returned result for the test, i.e. numeric for measures, a controlled vocabulary (consisting of exactly COMPLIANT or NOT_COMPLIANT) for validations or Issues (NOT_ISSUE, POSSIBLE_ISSUE, ISSUE), either a numeric value or a controlled vocabulary (consisting of exactly COMPLETE or NOT_COMPLETE) for Measures, and a data structure (e.g., a list of key value pairs) for proposed changes for Amendments.

Expected Response in this issue:

  • uses POTENTIAL_ISSUE, but the text in standards document is POSSIBLE_ISSUE

ymgan avatar Jan 15 '24 10:01 ymgan

Thanks @ymgan. POTENTIAL_ISSUE is correct (see Vocabulary Document at #152). Document needs fixing.

ArthurChapman avatar Jan 15 '24 20:01 ArthurChapman