TG2-VALIDATION_EVENTDATE_STANDARD
TestField | Value |
---|---|
GUID | 4f2bf8fd-fc5c-493f-a44c-e7b16153c803 |
Label | VALIDATION_EVENTDATE_STANDARD |
Description | Is the value of dwc:eventDate a valid ISO date? |
TestType | Validation |
Darwin Core Class | dwc:Event |
Information Elements ActedUpon | dwc:eventDate |
Information Elements Consulted | |
Expected Response | INTERNAL_PREREQUISITES_NOT_MET if dwc:eventDate is bdq:Empty; COMPLIANT if the value of dwc:eventDate is a valid ISO 8601 date; otherwise NOT_COMPLIANT |
Data Quality Dimension | Conformance |
Term-Actions | EVENTDATE_STANDARD |
Parameter(s) | |
Source Authority | |
Specification Last Updated | 2024-09-16 |
Examples | [dwc:eventDate="1963-03-08T14": Response.status=RUN_HAS_RESULT, Response.result=COMPLIANT, Response.comment="dwc:eventDate contains a valid ISO 8601-1:2019 date"] [dwc:eventDate="1963-03-08T14:67-0600": Response.status=RUN_HAS_RESULT, Response.result=NOT_COMPLIANT, Response.comment="dwc:eventDate does not contain a valid ISO 8601-1:2019 date"] |
Source | Paul Morris |
References | |
Example Implementations (Mechanisms) | FilteredPush/Kurator:event_date_qc 10.5281/zenodo.596795 |
Link to Specification Source Code | event_date_qc DwCEventDQ.validationEventdateStandard() |
Notes | This test should also pick up issues such as 29 Feb in a non leap year. |
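The Expected Response above can be sketched roughly as follows. This is a minimal Python illustration, not the event_date_qc implementation (which is Java). The function and helper names are hypothetical, bdq:Empty is assumed to mean `None` or a whitespace-only string, and only a subset of ISO 8601-1 extended forms is handled (YYYY, YYYY-MM, YYYY-MM-DD, YYYY-DDD, an optional time, and start/end ranges). Abbreviated range ends, durations, and recurring intervals are not covered.

```python
import calendar
import re

# Hypothetical, simplified patterns covering a subset of ISO 8601-1 extended forms.
_DATE = re.compile(r"^(\d{4})(?:-(\d{3})|-(\d{2})(?:-(\d{2}))?)?$")
_TIME = re.compile(
    r"^(\d{2})(?::(\d{2})(?::(\d{2})(?:\.\d+)?)?)?(?:Z|[+-]\d{2}(?::?\d{2})?)?$"
)
_DAYS_IN_MONTH = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]


def _date_exists(text):
    """True if text is YYYY, YYYY-MM, YYYY-MM-DD or YYYY-DDD and the date exists."""
    m = _DATE.match(text)
    if m is None:
        return False
    year, day_of_year, month, day = m.groups()
    y = int(year)
    if day_of_year is not None:
        return 1 <= int(day_of_year) <= (366 if calendar.isleap(y) else 365)
    if month is None:
        return True
    mo = int(month)
    if not 1 <= mo <= 12:
        return False
    if day is None:
        return True
    # 29 Feb is only accepted in a leap year.
    limit = _DAYS_IN_MONTH[mo - 1] + (1 if mo == 2 and calendar.isleap(y) else 0)
    return 1 <= int(day) <= limit


def _time_ok(text):
    """Range-check hour, minute and second; the zone offset is only shape-checked."""
    m = _TIME.match(text)
    if m is None:
        return False
    hour, minute, second = m.groups()
    return int(hour) <= 23 and int(minute or 0) <= 59 and int(second or 0) <= 59


def _part_ok(text):
    date_part, sep, time_part = text.partition("T")
    if not _date_exists(date_part):
        return False
    if not sep:
        return True
    # A time is only accepted after a complete calendar (10 chars) or ordinal (8 chars) date.
    return len(date_part) in (8, 10) and _time_ok(time_part)


def validation_eventdate_standard(event_date):
    """Return (Response.status, Response.result) for a dwc:eventDate value."""
    if event_date is None or not event_date.strip():
        return ("INTERNAL_PREREQUISITES_NOT_MET", None)
    parts = event_date.strip().split("/")
    ok = len(parts) <= 2 and all(_part_ok(p) for p in parts)
    return ("RUN_HAS_RESULT", "COMPLIANT" if ok else "NOT_COMPLIANT")
```

With this sketch, the two values from the Examples row give COMPLIANT and NOT_COMPLIANT respectively, and "2023-02-29" is NOT_COMPLIANT because the check covers existence as well as format.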
Comment by Paul Morris (@chicoreus) migrated from spreadsheet: There are similar tests in this sheet for date modified and date identified, but this test on eventDate appears to be missing
We need clarification (perhaps from the standard, which we can't get at) as to whether this should be a date that exists as well as a correctly formatted date (e.g. should February 29 in a non-leap year fail?).
The event_date_qc library contains a test which references the guid f413594a-df57-41ea-a187-b8c6c6379b45 and VALIDATION_EVENT_DATE_EXISTS, with a specification: "Compliant if dwc:eventDate can to parsed as an actual ISO date, otherwise not compliant. Internal prerequisites not met if dwc:eventDate is empty". This may be an implementation of an earlier form of VALIDATION_EVENTDATE_NOTSTANDARD from the spreadsheet. If we decide that VALIDATION_EVENTDATE_NOTSTANDARD includes testing whether the date existed or not, then the event_date_qc implementation would take the standard name and guid.
We should note that correctly formatted dates in the ISO format include dates, date ranges, date ranges spanning a month, date ranges spanning a year, dates and date ranges specified with year-dayofyear, and dates and date ranges including time. The ISO date format also allows dates to be described as a recurring interval. Single dates and date ranges in various forms are common in biodiversity data; dates with times and dates specified by year and day of year are less common.
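To make the forms listed above concrete, here is a small Python sketch that labels a value by its shape. The function name, the labels, and the patterns are illustrative assumptions. The patterns look at form only, so a value such as "1963-02-30" is still labelled a calendar date even though it does not exist.

```python
import re

# Hypothetical, simplified shape patterns; order matters (most specific shapes first).
_FORMS = [
    ("recurring interval", re.compile(r"^R\d*/.+")),
    ("date range", re.compile(r"^[^/]+/[^/]+$")),
    ("date with time", re.compile(r"^\d{4}-(?:\d{2}-\d{2}|\d{3})T\d{2}")),
    ("year and day of year", re.compile(r"^\d{4}-\d{3}$")),
    ("calendar date", re.compile(r"^\d{4}-\d{2}-\d{2}$")),
    ("year and month", re.compile(r"^\d{4}-\d{2}$")),
    ("year", re.compile(r"^\d{4}$")),
]


def classify_iso_form(value):
    """Return a label for the ISO 8601 shape of value, or 'unrecognised'."""
    for name, pattern in _FORMS:
        if pattern.match(value):
            return name
    return "unrecognised"


for example in ["1963-03-08", "1963-03", "1963-067", "1963-03-08T14:30",
                "1963-03-01/1963-04-15", "R5/1963-03-08/P1D", "8 March 1963"]:
    print(example, "->", classify_iso_form(example))
```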
As we have #36, we have valid dates covered? This one is "not standard", which I would assume to mean not parseable as ISO. #36 covers "not possible". I guess that between all the test responses, the user should know the state of play. Any other comments? Does it need more work?
@Tasilee #36 doesn't cover validity (possible or not possible) of dates - depending on the implementation, a date within the allowed range could be a date which does not exist. The parallel issue #69 specifies "The value of dwc:dateIdentified is not a valid ISO 8601:2004(E) date", different from the specification here. We should have similar specifications for NOT_STANDARD.
@chicoreus I agree with you regarding a concise and consistent wording for NOT STANDARD. Are you suggesting "The value of dwc:eventDate is not correctly formatted to ISO 8601:2004(E) date." should be "The value of dwc:eventDate is not a valid ISO 8601:2004(E) date"? If so, go for it.
I would say #36 SHOULD cover a 'not possible' date within a valid time range.
Not sure @Tasilee if we want to make #36 more complicated. We seem to have enough problems now with trying to fit it into a range.
Has anyone written code that can parse an ISO date range into its constituent parts? Any language, but VBA might be good for individual taxonomists using Access for their projects. There's a link here to parsing ISO dates (not ranges) in VBA. Also JavaScript here
@ianengelbrecht see the event_date_qc https://github.com/FilteredPush/event_date_qc library, written in Java. It works with almost all forms of ISO date ranges and can parse a wide variety of verbatim date values into ISO date ranges. For some examples, see the unit tests. Here's an example in a low-level verbatim event date parser test: https://github.com/FilteredPush/event_date_qc/blob/7dd53bc59823f3d16ea18954d0be2e36118bded1/src/test/java/org/filteredpush/qc/date/DateUtilsTest.java#L804 and here's a case in a higher-level test of a test with results framed in the concepts of the fitness for use framework: https://github.com/FilteredPush/event_date_qc/blob/7dd53bc59823f3d16ea18954d0be2e36118bded1/src/test/java/org/filteredpush/qc/date/DwcEventDQTest.java#L1298
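For readers who want the idea without the Java library, splitting an ISO date range into its start and end days might look like the Python sketch below. It is independent of event_date_qc and does not reflect that library's API. The function names are hypothetical, and only YYYY, YYYY-MM, and YYYY-MM-DD parts (alone or as start/end) are assumed. Times, year-dayofyear dates, and abbreviated range ends are rejected with a `ValueError`.

```python
import calendar
import re
from datetime import date

# Hypothetical pattern: year, optional month, optional day (extended format only).
_PART = re.compile(r"^(\d{4})(?:-(\d{2})(?:-(\d{2}))?)?$")


def _bounds(text):
    """Expand YYYY, YYYY-MM or YYYY-MM-DD into (first_day, last_day)."""
    m = _PART.match(text)
    if m is None:
        raise ValueError(f"unsupported date form: {text!r}")
    year, month, day = (int(g) if g else None for g in m.groups())
    if month is None:
        return date(year, 1, 1), date(year, 12, 31)
    if day is None:
        last_day = calendar.monthrange(year, month)[1]
        return date(year, month, 1), date(year, month, last_day)
    exact = date(year, month, day)  # raises ValueError for dates that do not exist
    return exact, exact


def parse_event_date_range(value):
    """Return (start, end) as datetime.date objects for a date or start/end range."""
    start_text, sep, end_text = value.strip().partition("/")
    start = _bounds(start_text)[0]
    end = _bounds(end_text if sep else start_text)[1]
    if end < start:
        raise ValueError(f"range ends before it starts: {value!r}")
    return start, end
```

A reduced-precision value is treated as the whole period it names, so "1963-03" becomes 1 March to 31 March 1963, and "1963-03/1964" runs from 1 March 1963 to 31 December 1964.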
As @ArthurChapman has raised under #76, do we have agreement that we would generally
- Refer directly to format issue standards such as ISO 8601 rather than bdq:sourceAuthority (which we would use more for vocab lookup situations) and
- Remove "bdq:sourceAuthority is ... : link" from the Source Authority specification? We would of course retain that link info in the References.
The alternative is to always use "bdq:sourceAuthority" in the Expected Responses.
Once we have agreement either way, I will check through all the issues to ensure consistency.
I would be happier having bdq:sourceAuthority used for vocabularies.
I know @arthur agrees with that. @chicoreus ??
I have changed
INTERNAL_PREREQUISITES_NOT_MET if dwc:eventDate is EMPTY; COMPLIANT if the value of dwc:eventDate is valid according to bdq:sourceAuthority; otherwise NOT_COMPLIANT
to
INTERNAL_PREREQUISITES_NOT_MET if dwc:eventDate is EMPTY; COMPLIANT if the value of dwc:eventDate is a valid ISO 8601-1:2019 date; otherwise NOT_COMPLIANT
and removed the reference to bdq:sourceAuthority
Minor issue which some have likely noted: In the Expected Response, do we use "ISO 8601-1:2019" or just an abbreviated form such as "ISO 8601"? In this case, the Reference isn't the former, so should it be?
I've edited the Expected Response according to @tucotuco's suggestion:
From
INTERNAL_PREREQUISITES_NOT_MET if dwc:eventDate is EMPTY; COMPLIANT if the value of dwc:eventDate is a valid ISO 8601-1:2019 date; otherwise NOT_COMPLIANT
to
INTERNAL_PREREQUISITES_NOT_MET if dwc:eventDate is EMPTY; COMPLIANT if the value of dwc:eventDate is a valid ISO 8601-1 date; otherwise NOT_COMPLIANT
and updated the References
I have updated the ISO Reference link
Splitting bdqffdq:Information Elements into "Information Elements ActedUpon" and "Information Elements Consulted".
Also changed "Field" to "TestField", "Output Type" to "TestType" and updated "Specification Last Updated"