
[GMLAS] instance-driven parsing

Open mhugo opened this issue 8 years ago • 7 comments

GDAL GMLAS treats om:result as unparsed XML data. Its content is stored as a TEXT field ...

https://github.com/BRGM/gml_application_schema_toolbox/blob/master/samples/BRGM_raw_database_observation_waterml2_output.xml
https://github.com/BRGM/gml_application_schema_toolbox/blob/master/samples/BRGM_raw_database_observation_PointTimeseriesObservation_output.xml

@rouault is this expected or not?

mhugo avatar Jul 12 '17 08:07 mhugo

@mhugo is it the same with the most recent feeds?

  • waterml2.0: http://ressource.brgm-rec.fr/obs/RawOfferingPiezo/06512X0037/STREMY.2&responseFormat=http://www.opengis.net/waterml/2.0&temporalFilter=om%3AphenomenonTime%2Clatest
  • INSPIRE:PointTimeSeries: http://ressource.brgm-rec.fr/obs/RawOfferingPiezo/06512X0037/STREMY.2&responseFormat=http://www.opengis.net/waterml/2.0&temporalFilter=om%3AphenomenonTime%2Clatest

sgrellet avatar Jul 12 '17 08:07 sgrellet

@mhugo

Yes this is expected. The om:result element of om:OM_Observation is typed as xs:any...

This can be overcome by editing the <TypingConstraints> section of gmlasconf.xml to add a new constraint for om:OM_Observation/om:result:

    <!-- constraints typically expressed as schematrons -->
    <TypingConstraints>
        <Namespaces>
            <Namespace prefix="gwml2w" uri="http://www.opengis.net/gwml-well/2.2"/>
            <Namespace prefix="om" uri="http://www.opengis.net/om/2.0"/>
            <Namespace prefix="wml2" uri="http://www.opengis.net/waterml/2.0"/>
        </Namespaces>

        <ChildConstraint>
            <ContainerXPath>gwml2w:GW_GeologyLog/om:result</ContainerXPath>
            <ChildrenElements>
                <Element>gwml2w:GW_GeologyLogCoverage</Element>
            </ChildrenElements>
        </ChildConstraint>

        <ChildConstraint>
            <ContainerXPath>om:OM_Observation/om:result</ContainerXPath>
            <ChildrenElements>
                <Element>wml2:MeasurementTimeseries</Element>
            </ChildrenElements>
        </ChildConstraint>
    </TypingConstraints>

However, the scope of this constraint is too broad to be applied in general. It only makes sense if you know you are dealing with WML2 documents.

rouault avatar Jul 12 '17 08:07 rouault

Ok ... thanks. Does it mean also that without this configuration, contents of om:result are not checked against their schema ?

mhugo avatar Jul 12 '17 08:07 mhugo

Ok ... thanks. Does it mean also that without this configuration, contents of om:result are not checked against their schema ?

Yes, the OGR schema is established primarily from the analysis of the .xsd. Handling this case would also require a pass over the document itself, to discover that om:result might hold a wml2:MeasurementTimeseries.

rouault avatar Jul 12 '17 08:07 rouault
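
The "pass on the document" mentioned above can be illustrated with a minimal sketch. It is not how the GMLAS driver works internally; it only shows, using Python's standard library, how one might list the element types actually found under om:result in an instance document. The function name `result_child_tags` and the inline sample document are hypothetical and introduced here for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace URI for om:, as declared in the <Namespaces> block of the thread.
OM_NS = "http://www.opengis.net/om/2.0"


def result_child_tags(xml_text):
    """Return sorted (namespace, local name) pairs found directly under om:result.

    Hypothetical helper: it scans an instance document, not the .xsd.
    """
    root = ET.fromstring(xml_text)
    found = set()
    # iter() visits the root and all descendants matching the qualified tag.
    for result in root.iter(f"{{{OM_NS}}}result"):
        for child in result:
            tag = child.tag
            if tag.startswith("{"):
                # ElementTree encodes qualified names as "{namespace}local".
                ns, local = tag[1:].split("}", 1)
            else:
                ns, local = "", tag
            found.add((ns, local))
    return sorted(found)


# Tiny made-up instance document, only for demonstration.
SAMPLE = """<om:OM_Observation xmlns:om="http://www.opengis.net/om/2.0"
    xmlns:wml2="http://www.opengis.net/waterml/2.0">
  <om:result>
    <wml2:MeasurementTimeseries/>
  </om:result>
</om:OM_Observation>"""

print(result_child_tags(SAMPLE))
```

The pairs it reports are candidates for <Element> entries in a <ChildConstraint>, once mapped to the prefixes declared in <Namespaces>.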

@rouault: we need to do what you propose. In the first exercise (ProofOfConcept), Hugo extracted each timevaluePair. This makes it possible to plug other tools into the resulting DB, which is really important for domain users. I guess we are in the same situation as with GeologyLogCoverage: the GMLAS driver needs extra information (the WaterML2 .sch?) to know what to do in that situation, given the xs:any.

sgrellet avatar Jul 12 '17 09:07 sgrellet

@rouault what if we have different streams with the same xpath and different types ? could we have more than one type in ChildrenElements ?

mhugo avatar Jul 27 '17 12:07 mhugo

could we have more than one type in ChildrenElements ?

@mhugo Yes, several <Element> entries can be put inside a single <ChildrenElements>.

rouault avatar Jul 27 '17 15:07 rouault
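
As a rough illustration of that last answer, the sketch below builds a <ChildConstraint> fragment with more than one <Element>, using only Python's standard library. The helper name `child_constraint` is hypothetical, and pairing gwml2w:GW_GeologyLogCoverage with om:OM_Observation/om:result is an assumption made purely for the example; both prefixes would still need to be declared in <Namespaces>, as in the configuration shown earlier in the thread.

```python
import xml.etree.ElementTree as ET


def child_constraint(container_xpath, element_names):
    """Build a <ChildConstraint> element listing several allowed children.

    Hypothetical helper; it only mirrors the fragment structure shown in the thread.
    """
    constraint = ET.Element("ChildConstraint")
    ET.SubElement(constraint, "ContainerXPath").text = container_xpath
    children = ET.SubElement(constraint, "ChildrenElements")
    for name in element_names:
        # One <Element> per candidate type for the xs:any container.
        ET.SubElement(children, "Element").text = name
    return constraint


constraint = child_constraint(
    "om:OM_Observation/om:result",
    # Illustrative combination, not taken from a real configuration.
    ["wml2:MeasurementTimeseries", "gwml2w:GW_GeologyLogCoverage"],
)
ET.indent(constraint, space="    ")  # pretty-printing; requires Python 3.9+
print(ET.tostring(constraint, encoding="unicode"))
```

The printed fragment can be pasted inside <TypingConstraints>. Note that a real gmlasconf.xml may declare a default XML namespace on its root element, which this sketch does not handle.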