fhir_models
fhir_models copied to clipboard
FI-202: Refactor Validation
FHIR Resource Validation
Overview
The FHIR::Validation
module provides FHIR Resource Validation capabilities which are intended to improve maintainability and extensibility.
See FI-202
Background
The current state of validation is complicated and costly to maintain. Currently there are two validation implementations in fhir_models
.
-
FHIR::Model
contains an implementation that largely relies on the classMETADATA
constant and definesvalid?
,validate
. -
FHIR::StructureDefinition
contains an implementation that relies on the underlying StructureDefintion and definesvalidate_resource
andvalidates_resource?
They each have their own implementation of how things like cardinality adherence are validated.
Cardinality Implementations
unless count >= meta['min'] && count <= meta['max']
errors[field] << "#{meta['path']}: invalid cardinality. Found #{count} expected #{meta['min']}..#{meta['max']}"
end
FHIR::StructureDefinition#validate_resource
def verify_cardinality(element, nodes)
# Check the cardinality
min = element.min
max = element.max == '*' ? Float::INFINITY : element.max.to_i
@errors << "#{describe_element(element)} failed cardinality test (#{min}..#{max}) -- found #{nodes.size}" if (nodes.size < min) || (nodes.size > max)
end
Some validation checks are only in one and not the other. For example, the maxLength is validated in the FHIR::StructureDefinition
implementation, but not in the FHIR::Model
implementation.
Each implementation also returns results in different ways, for example if we were to validate a Patient resource:
patient = FHIR::Patient.new(telecom: [{system: 'foo', value: ['bar' , 'baz'], period: [{start:[1,2]}, {start: '2019'}, {start:[3,4]}]}, {system: 'corge', code: 'qux'}])
patient.validate
Returns
{"telecom"=>
[{"system"=>["ContactPoint.system: invalid codes [\"foo\"]"],
"value"=>["ContactPoint.value: invalid cardinality. Found 2 expected 0..1"],
"period"=>
["ContactPoint.period: invalid cardinality. Found 2 expected 0..1",
{"start"=>
["Period.start: invalid cardinality. Found 2 expected 0..1",
"Period.start: 1 does not match dateTime regex",
"Period.start: 2 does not match dateTime regex"]},
{"start"=>
["Period.start: invalid cardinality. Found 2 expected 0..1",
"Period.start: 3 does not match dateTime regex",
"Period.start: 4 does not match dateTime regex"]}]},
{"system"=>["ContactPoint.system: invalid codes [\"corge\"]"]}]}
The results are organized into a hash whose structure resembles that of a Patient Resource.
It is not immediately obvious which element in the resource contains the error in a collection of elements (In this example the first and third Period.start
have cardinality issues). The arrays of results in the hash are also heterogeneous, containing both strings for errors and hashes for the sub elements. Traversing to a nested error requires requires searching through the array and navigating down multiple subtrees. For example, there are two nested hashes in the array which contain errors for the Period.start elements.
The resource can also be validated against the StructureDefinition directly.
patient_structure_def = FHIR::Definitions.resource_definition('Patient')
patient_structure_def.validate_resource(patient)
This actually returns no errors or warnings, which could be misleading. This validator only the elements in the differential which are:
"Patient",
"Patient.identifier",
"Patient.active",
"Patient.name",
"Patient.telecom",
"Patient.gender",
"Patient.birthDate",
"Patient.deceased[x]",
"Patient.address",
"Patient.maritalStatus",
"Patient.multipleBirth[x]",
"Patient.photo",
"Patient.contact",
"Patient.contact.relationship",
"Patient.contact.name",
"Patient.contact.telecom",
"Patient.contact.address",
"Patient.contact.gender",
"Patient.contact.organization",
"Patient.contact.period",
"Patient.communication",
"Patient.communication.language",
"Patient.communication.preferred",
"Patient.generalPractitioner",
"Patient.managingOrganization",
"Patient.link",
"Patient.link.other",
"Patient.link.type"
Updating the patient with errors in these elements and then validating the patient returns:
patient = FHIR::Patient.new(telecom: [{system: 'foo', value: ['bar' , 'baz'], period: [{start:[1,2]}, {start: '2019'}, {start:[3,4]}]}, {system: 'corge', code: 'qux'}])
patient_structure_def.validate_resource(patient)
["Patient.gender has invalid code 'attack helicopter' from http://hl7.org/fhir/ValueSet/administrative-gender",
"Patient.gender did not match one of the valid data types: [\"code\"]",
"Patient.contact: FluentPath expression evaluates to false for Patient invariant rule pat-1: SHALL at least contain a contact's details or a reference to an organization",
"{\"gender\"=>\"fighter jet\"}",
"Patient.contact.gender has invalid code 'fighter jet' from http://hl7.org/fhir/ValueSet/administrative-gender",
"Patient.contact.gender did not match one of the valid data types: [\"code\"]"]
These errors are presented as strings in a flat array, even for nested elements like Patient.contact.gender. . Additionally, some errors are not detected, such as the invalid cardinality on Patient.telecom[0].value, Patient.telecom.period which are detected in the StructureDefinition validation.
Warnings are not directly returned, but can be retrieved from the StructureDefinition.
patient_structure_def.warnings
Refactoring Validation
The FHIR::Validation attempts to unify the validation code from both the FHIR::Model
and FHIR::StructureDefintion
so that it is easier to maintain and update. The module has two "types" of validators: StructureValidators
and ElementValidators
.
The StructureValidator
contains a reference to a profile which contains a number of ElementDefinitions. The StructureValidator
constructs and traverses a hierarchy of thses ElementDefinitions to which it applies an ElementValidator
.
The ElementValidator implements a validate
method which validates the elements in the resource and returns an array of ValidationResults
which indicate the validation type and specific element path where the error occurred (e.g. Patient.telecom[0])
These validation results can then be more easily filtered.
results = patient.validate
# Get cardinality results
results.select {|res| res.validation_type == :cardinality}
# Get all failing tests
results.select {|res| res.result == :fail}
The validation results indicate successful and failing validation results, so that it is more precisely known what is checked so that it is not assumed that a resource "passed" when in reality something was just not checked.
The validate
method in FHIR::Model
and validate_resource
methods in FHIR::StructureDefinition
delegate validation to the validation module to return similar validation results. The FHIR::Models
validation defaults to using the StructureDefinition of the base resource.
A custom validator can also be constructed to use specific ElementValidators if only certain validations are desired, like cardinality or terminology.
Currently ElementValidators are provided for: cardinality, complex types, fixed values, max string length, and terminology.
Things Missing
This is not expected to be the end all for validation, but the first major refactor which allows updates to be added in a more consistent way and in a single place.
Additional validators for patterns, regular expression matching, contraints and others are needed. Some of these checks were partially implemented in one or more of the existing validators, but left until a more complete and consistent implementation is ready.
This PR is just huge, so I apologize if I missed it, but where are we validating the primitives? Is that not included any longer because this does not include the regex validation?
I was working on doing regex validation based off the StructureDefinition, but it was a little tricky because of some issues in the spec combined with the fact that we don't handle primitive extensions. It should be added back in, but holding onto this PR and making it even larger seemed unwise.
See: https://chat.fhir.org/#narrow/stream/179166-implementers/topic/extension-regex.20usage
Well, we still need to validate that Booleans are Booleans, and Strings are Strings, Dates and DateTimes are correct, and different types of numerics are correct. If you don't do that, then the validation could be crazy off --- regardless of extensions.
https://github.com/fhir-crucible/plan_executor/pull/174