TG2-VALIDATION_POLYNOMIAL_CONSISTENT

Open iDigBioBot opened this issue 7 years ago • 32 comments

TestField Value
GUID 17f03f1f-f74d-40c0-8071-2927cfc9487b
Label VALIDATION_POLYNOMIAL_CONSISTENT
Description Is the polynomial represented in dwc:scientificName consistent with the equivalent values in dwc:genericName, dwc:specificEpithet, dwc:infraspecificEpithet?
TestType Validation
Darwin Core Class dwc:Taxon
Information Elements ActedUpon dwc:scientificName
dwc:genericName
dwc:specificEpithet
dwc:infraspecificEpithet
Information Elements Consulted
Expected Response INTERNAL_PREREQUISITES_NOT_MET if dwc:scientificName is bdq:Empty, or all of dwc:genericName, dwc:specificEpithet and dwc:infraspecificEpithet are bdq:Empty; COMPLIANT if the polynomial, as represented in dwc:scientificName, is consistent with bdq:NotEmpty values of dwc:genericName, dwc:specificEpithet, dwc:infraspecificEpithet; otherwise NOT_COMPLIANT.
Data Quality Dimension Consistency
Term-Actions POLYNOMIAL_CONSISTENT
Parameter(s)
Source Authority
Specification Last Updated 2023-09-18
Examples [dwc:scientificName="Hakea decurrens ssp. physocarpa", dwc:genericName="", dwc:specificEpithet="decurrens", dwc:infraspecificEpithet="physocarpa": Response.status=RUN_HAS_RESULT, Response.result=COMPLIANT, Response.comment="Values of all non-empty atomic terms are found in the polynomial"]
[dwc:scientificName="Hakea decurrens", dwc:genericName="Hakea", dwc:specificEpithet="decurrens", dwc:infraspecificEpithet="physocarpa": Response.status=RUN_HAS_RESULT, Response.result=NOT_COMPLIANT, Response.comment="dwc:scientificName is inconsistent with atomic parts (dwc:genericName, dwc:specificEpithet and dwc:infraspecificEpithet)"]
Source Paula Zermoglio
References
  • GBIF Secretariat (2023) GBIF Backbone Taxonomy. Checklist dataset. https://doi.org/10.15468/39omei
Example Implementations (Mechanisms) Kurator/FilteredPush sci_name_qc Library, FP-Akka
Link to Specification Source Code https://github.com/FilteredPush/sci_name_qc/blob/v1.1.2/src/main/java/org/filteredpush/qc/sciname/DwCSciNameDQ.java#L1554
Notes If dwc:genericName is populated, this test expects that the value of dwc:genericName is the first word of the value of dwc:scientificName. If dwc:specificEpithet is populated, this test expects that the value of dwc:specificEpithet is the first epithet (the species epithet) of dwc:scientificName. If dwc:infraspecificEpithet is populated, this test expects that the value of dwc:infraspecificEpithet is the lowest or terminal infraspecific epithet of dwc:scientificName, excluding any rank designation.

iDigBioBot avatar Jan 05 '18 16:01 iDigBioBot
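
The Expected Response and Notes above can be read as a small decision procedure. The following is a minimal sketch of that reading, not the sci_name_qc implementation: the function name, the tuple return shape, and the `RANK_MARKERS` list are all assumptions made for illustration, and the sketch assumes dwc:scientificName carries no authorship.

```python
# Hypothetical list of rank designations to skip; not taken from the specification.
RANK_MARKERS = {"subsp.", "ssp.", "var.", "subvar.", "f.", "forma", "subf."}


def is_empty(value):
    """Treat None and whitespace-only strings as bdq:Empty (an assumption)."""
    return value is None or value.strip() == ""


def validation_polynomial_consistent(scientific_name, generic_name="",
                                     specific_epithet="", infraspecific_epithet=""):
    """Return (Response.status, Response.result) for one record."""
    parts = (generic_name, specific_epithet, infraspecific_epithet)
    if is_empty(scientific_name) or all(is_empty(p) for p in parts):
        return ("INTERNAL_PREREQUISITES_NOT_MET", None)

    # Split on whitespace and drop rank designations such as "ssp.".
    words = [w for w in scientific_name.split() if w.lower() not in RANK_MARKERS]

    checks = []
    if not is_empty(generic_name):
        # First word of the name.
        checks.append(len(words) >= 1 and words[0] == generic_name.strip())
    if not is_empty(specific_epithet):
        # First epithet, i.e. the second word.
        checks.append(len(words) >= 2 and words[1] == specific_epithet.strip())
    if not is_empty(infraspecific_epithet):
        # Terminal epithet; requires at least three words.
        checks.append(len(words) >= 3 and words[-1] == infraspecific_epithet.strip())

    # Only non-empty atomic terms contribute a check.
    result = "COMPLIANT" if all(checks) else "NOT_COMPLIANT"
    return ("RUN_HAS_RESULT", result)
```

Under these assumptions, the two records in the Examples field give COMPLIANT and NOT_COMPLIANT respectively, and an empty dwc:scientificName gives INTERNAL_PREREQUISITES_NOT_MET regardless of the atomic terms.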

See Positive description - do we need to add "scientificNameAuthorship" to fields?

ArthurChapman avatar Jan 05 '18 22:01 ArthurChapman

Comment by Arthur Chapman (@ArthurChapman) migrated from spreadsheet: Variable name would need changing as this relates to the Positive side of the test rather than the negative. Also the Description appears for the (test - PASS) column (currently hidden)

iDigBioBot avatar Jan 12 '18 16:01 iDigBioBot

Comment by Paul Morris (@chicoreus) migrated from spreadsheet: @AC: Variable name is fine. The other validation variable names need to change. We must specify all of them as positive, not negative.

iDigBioBot avatar Jan 12 '18 16:01 iDigBioBot

Comment by Arthur Chapman (@ArthurChapman) migrated from spreadsheet: Whatever we do we need to be consistent

iDigBioBot avatar Jan 12 '18 16:01 iDigBioBot

[image: img_20180118_150932]

godfoder avatar Jan 18 '18 20:01 godfoder

We haven't addressed the point from @ArthurChapman that the authorship needs to be included, as scientificNameAuthorship may (incorrectly) differ from the authorship parsed out of scientificName.

chicoreus avatar Jun 24 '20 20:06 chicoreus

This test shares a name with #45 and #46, but this test looks for consistency in the parts of the name in their various darwin core fields, while the other two tests currently only compare scientificName with a source authority.

chicoreus avatar Jun 24 '20 20:06 chicoreus

This is another one that was originally called (pre-Gainesville) "TG2-VALIDATION_SCIENTIFICNAME_INCONSISTENT". I can't see my discussion on including Authorship @chicoreus - I believe Authorship may complicate things (given the many different spellings and inconsistencies) - I am thinking that is maybe why we changed the naming of these three from SCIENTIFICNAME to POLYNOMIAL - i.e. to basically exclude authorship from the Scientific Name.

ArthurChapman avatar Jun 25 '20 01:06 ArthurChapman

I believe that is correct, we wanted to distinguish explicitly in the name of the test that the authorship was not included.

tucotuco avatar Jun 25 '20 15:06 tucotuco

Trying to look at a test dataset for this test

At present we say "INTERNAL_PREREQUISITES_NOT_MET if all of the component terms are EMPTY"

but surely INTERNAL_PREREQUISITES_NOT_MET if dwc:scientificName is EMPTY and/or if dwc:genus is EMPTY.

dwc:infraspecificEpithet or specificEpithet on their own are not sufficient to be able to compare Scientific Name against genus, species, infraspecies.

If you have a scientificName and a genus (but no specificEpithet or infraspecificEpithet) then you can still compare

ArthurChapman avatar Oct 06 '20 01:10 ArthurChapman

As a followup from my last comment - you may like to look at the DRAFT test data file I have created on my interpretation

https://github.com/tdwg/bdq/blob/master/tg2/core/testdata/testdata_POLYNOMIAL_INCOSISTENT_%23101.csv

ArthurChapman avatar Oct 06 '20 01:10 ArthurChapman

Looking at #82, SCIENTIFICNAME_EMPTY overlaps with this one. If one was using a Workflow and #82 was run first and failed, then #101 would not need to be run. We seem to have a little redundancy here, but I am not sure how to fix it. I see no problem in having both.

ArthurChapman avatar Oct 06 '20 01:10 ArthurChapman

@ArthurChapman see the description of the logic in the notes. dwc:genus can be empty and dwc:specificEpithet can still be checked against dwc:scientificName for consistency.

See the note in #82: these tests are along different axes in the framework, and test order is not specified, so some overlap is expected, especially in complex sets of interrelated terms like these.

chicoreus avatar Oct 06 '20 02:10 chicoreus

@chicoreus OK - but that still means INTERNAL_PREREQUISITES_NOT_MET if dwc:scientificName is EMPTY or if all of dwc:genus, dwc:specificEpithet and dwc:infraspecificEpithet are EMPTY

OK - I didn't read the notes - I will change my test data file to concur with the notes (once this discussion is finished). Interesting though, if we say that the test is COMPLIANT if you have a scientific name with Aus Bus Cus and the genus is empty and the species is empty but you have Cus in the infraspecific epithet. Would not logic say that it is NOT_COMPLIANT, because the genus and species aren't compliant with the scientificName because they don't have values?

ArthurChapman avatar Oct 06 '20 02:10 ArthurChapman

In the light of recent discussions, I have added the specific dwc terms to the Expected Response.

Tasilee avatar Feb 10 '22 21:02 Tasilee

Examining test data, the following would return NOT_COMPLIANT when I think it should be INTERNAL_PREREQUISITES_NOT_MET

dwc:scientificName="", dwc:genus="Hakea", dwc:specificEpithet="decurrens", dwc:infraspecificEpithet="physocarpa"

??

Tasilee avatar Mar 03 '22 22:03 Tasilee

Agreed.

tucotuco avatar Mar 03 '22 23:03 tucotuco

OK, so could we have a taxon guru adapt the Expected response? These epithet things scare me. Names scare me.

Tasilee avatar Mar 03 '22 23:03 Tasilee

Agreed - probably needs rewording

INTERNAL_PREREQUISITES_NOT_MET if dwc:scientificName, and all of dwc:genus, dwc:specificEpithet and dwc:infraspecificEpithet are EMPTY; COMPLIANT if the polynomial, as represented in dwc:scientificName, is consistent with the atomic parts dwc:genus, dwc:specificEpithet, dwc:infraspecificEpithet; otherwise NOT_COMPLIANT

ArthurChapman avatar Mar 03 '22 23:03 ArthurChapman

Hmm, maybe

INTERNAL_PREREQUISITES_NOT_MET if dwc:scientificName is EMPTY, or all of dwc:genus, dwc:specificEpithet and dwc:infraspecificEpithet are EMPTY; COMPLIANT if the polynomial, as represented in dwc:scientificName, is consistent with the atomic parts dwc:genus, dwc:specificEpithet, dwc:infraspecificEpithet; otherwise NOT_COMPLIANT

Tasilee avatar Mar 03 '22 23:03 Tasilee
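
The two candidate wordings of the prerequisite clause above differ only in "and" versus "or", but they treat the test record with an empty dwc:scientificName differently. A minimal sketch of both readings, with hypothetical function names chosen for illustration:

```python
def is_empty(value):
    """Treat None and whitespace-only strings as EMPTY (an assumption)."""
    return value is None or value.strip() == ""


def prereq_not_met_and(scientific_name, parts):
    """Reading of 'dwc:scientificName, and all of ... are EMPTY'."""
    return is_empty(scientific_name) and all(is_empty(p) for p in parts)


def prereq_not_met_or(scientific_name, parts):
    """Reading of 'dwc:scientificName is EMPTY, or all of ... are EMPTY'."""
    return is_empty(scientific_name) or all(is_empty(p) for p in parts)


# The record queried earlier in the thread: empty name, populated parts.
record_name = ""
record_parts = ["Hakea", "decurrens", "physocarpa"]

and_reading = prereq_not_met_and(record_name, record_parts)
or_reading = prereq_not_met_or(record_name, record_parts)
```

Under the "and" reading the record is not caught by the prerequisite clause and falls through to NOT_COMPLIANT; under the "or" reading it returns INTERNAL_PREREQUISITES_NOT_MET, which is the outcome the commenters agreed on.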

+1 to what @Tasilee said.

tucotuco avatar Mar 04 '22 00:03 tucotuco

1 + @tucotuco is a majority :) CHANGED

Tasilee avatar Mar 04 '22 00:03 Tasilee

Changed dwc:genus to dwc:genericName throughout this test in line with recent changes to Darwin Core.

ArthurChapman avatar Mar 06 '22 20:03 ArthurChapman

As noted by @tucotuco, the acceptance of https://dwc.tdwg.org/terms/#dwc:genericName resolves the potential ambiguity between dwc:genus, defined as the generic placement in the taxonomy, and dwc:genericName, a parse of the first word of the scientific name.

chicoreus avatar Mar 06 '22 20:03 chicoreus

In the Expected Response ..."COMPLIANT if the polynomial, as represented in dwc:scientificName, is consistent with the atomic parts dwc:genericName, dwc:specificEpithet, dwc:infraspecificEpithet", do we need the words "with the atomic parts"?

Would not:

"COMPLIANT if the polynomial, as represented in dwc:scientificName, is consistent with dwc:genericName, dwc:specificEpithet, dwc:infraspecificEpithet"

be sufficient?

ArthurChapman avatar Mar 24 '22 21:03 ArthurChapman

Are all happy with the specifications on this one now?

Tasilee avatar Apr 03 '22 00:04 Tasilee

Getting this 'on the record' for all to consider: Email with @chicoreus yesterday. I suggested for the Expected Response-

INTERNAL_PREREQUISITES_NOT_MET if dwc:scientificName is EMPTY, or all of dwc:genericName, dwc:specificEpithet and dwc:infraspecificEpithet are EMPTY; COMPLIANT if the polynomial, as represented in dwc:scientificName, is consistent with NOT_EMPTY values of dwc:genericName, dwc:specificEpithet, dwc:infraspecificEpithet; otherwise NOT_COMPLIANT.

@chicoreus response: "That is more explicit than the current separate (and not formalized yet) general guidance on handling "consistent". But if we are explicit in this way here, we may need to be in other tests invoking "consistent"."

Thoughts?

Tasilee avatar Sep 12 '22 22:09 Tasilee

I like it, but not sure of other implications

ArthurChapman avatar Sep 12 '22 22:09 ArthurChapman

After discussion on the Zoom today, we agreed that using the current Test Data format for examples would seem expedient. We also previously agreed that a "COMPLIANT" and a "NOT_COMPLIANT" example, or equivalents, were appropriate.

I think the examples of INTERNAL/EXTERNAL_PREREQUISITES_NOT_MET would be overkill here?

What I have added in Examples is for a check on formatting.

Tasilee avatar Nov 07 '22 02:11 Tasilee

I don't see how the test can accommodate interpolated names that are part of a polynomial dwc:scientificName. Polynomials with interpolated names:

Aus (Bus) cus, where Bus is a subgenus
Aus (cus) dus, where cus is a superspecies

Archilegt avatar Nov 07 '22 09:11 Archilegt
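
The concern about interpolated names can be made concrete with a purely positional check. The sketch below is an illustration only and makes no claim about how the FilteredPush implementation tokenizes names; both function names are hypothetical, and it assumes dwc:scientificName has no authorship (a parenthesised author such as a basionym author would otherwise be dropped as well).

```python
def positional_specific_epithet_matches(scientific_name, specific_epithet):
    """Naive check: the specific epithet must be the second whitespace-separated word."""
    words = scientific_name.split()
    return len(words) >= 2 and words[1] == specific_epithet


def drop_parenthesised(scientific_name):
    """One possible (assumed) pre-processing step: skip tokens wrapped in parentheses."""
    return [w for w in scientific_name.split()
            if not (w.startswith("(") and w.endswith(")"))]


# With a subgenus interpolated, the second word is "(Bus)", not "cus".
plain = positional_specific_epithet_matches("Aus cus", "cus")
with_subgenus = positional_specific_epithet_matches("Aus (Bus) cus", "cus")
remaining = drop_parenthesised("Aus (Bus) cus")
```

Here `plain` is true and `with_subgenus` is false, so a positional reading of the Notes would report a record with an interpolated subgenus as NOT_COMPLIANT even when dwc:specificEpithet is correct. Dropping parenthesised tokens leaves `["Aus", "cus"]`, but whether that is the right treatment for subgenera and superspecies is exactly the open question raised in the comment.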