cubiql
CSO data compatibility
I was trying to use CubiQL with the CSO endpoint:
java -jar graphql-qb-0.2.1-SNAPSHOT-standalone.jar --port 9000 --endpoint http://data.cso.ie/sparql
The following error is displayed:
@zeginis Could you please advise how complicated it would be to make the data work with the API?
@arekstasiewicz the requirements for data to work with CubiQL are documented at #92.
Most of these will be fixed. There are 2 open issues:
- Always use the qb:measureType even if there is only one measure. We need to discuss this.
- For each dimension of the cube, a qb:codeList should be defined that contains only the values used in the cube. Most probably we will adopt this approach.
As a first step you can add the code lists that are defined at CSO to the dimensions using qb:codeList. I have checked them, and all the values they contain are used in every cube, so they comply with the above requirement (the only exception is the age dimension).
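The "only the values used in the cube" requirement can be checked mechanically once both sets of URIs are known. Below is a minimal Python sketch, not part of CubiQL: the function name and the example.org URIs are hypothetical, and how the two lists are retrieved (for example with a SPARQL query) is left out.

```python
def codelist_mismatch(codelist_values, used_values):
    """Compare a dimension's code list with the values used in the cube.

    Both arguments are iterables of URI strings. The result lists the
    URIs that appear on only one side.
    """
    codelist = set(codelist_values)
    used = set(used_values)
    return {
        # In the code list but never used by an observation
        "unused_in_cube": sorted(codelist - used),
        # Used by an observation but absent from the code list
        "missing_from_codelist": sorted(used - codelist),
    }


# Hypothetical URIs, for illustration only
sex_codelist = [
    "http://example.org/def/sex/male",
    "http://example.org/def/sex/female",
    "http://example.org/def/sex/both",
]
sex_used = [
    "http://example.org/def/sex/male",
    "http://example.org/def/sex/female",
]

report = codelist_mismatch(sex_codelist, sex_used)
print(report)
```

A dimension meets the requirement discussed above when both lists in the report are empty.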
Hi @zeginis, I am attaching a sample cube, cso_noENtag.ttl, that I created to align with CubiQL, but it still throws an error at creation time. I have addressed the following data restriction issues:
1. qb:codeList
2. qb:measureType
3. Language tag
4. Multiple publishers
What is it still missing?
error:
exception in thread "main" clojure.lang.ExceptionInfo: Call to #'com.walmartlabs.lacinia.schema/compile did not conform to spec: In: [0 :objects :dataset_cso 1 :description] val: #grafter.rdf.protocols.LangString{:string "CSO", :lang :en} fails spec: :com.walmartlabs.lacinia.schema/description at: [:args :schema :objects 1 :description] predicate: string? {:clojure.spec.alpha/problems ({:path [:args :schema :objects 1 :description], :pred clojure.core/string?, :val #grafter.rdf.protocols.LangString{:string "CSO", :lang :en}, :via [:com.walmartlabs.lacinia.schema/schema-object :com.walmartlabs.lacinia.schema/schema-object :com.walmartlabs.lacinia.schema/objects :com.walmartlabs.lacinia.schema/object :com.walmartlabs.lacinia.schema/description], :in [0 :objects :dataset_cso 1 :description]}), :clojure.spec.alpha/spec #object[clojure.spec.alpha$regex_spec_impl$reify__2436 0x35329a05 "clojure.spec.alpha$regex_spec_impl$reify__2436@35329a05"], :clojure.spec.alpha/value ({:objects {:ref_area {:implements [:resource], :fields {:uri {:type :uri, :description "URI of the reference area"}, :label {:type String, :description "Label for the reference area"}}}, :dataset_cso_observations {:fields {:sparql {:type String, :description "SPARQL query used to retrieve matching observations.", :resolve #object[graphql_qb.resolvers$resolve_observations_sparql_query 0x6f1d799 "graphql_qb.resolvers$resolve_observations_sparql_query@6f1d799"]}, :page {:type :dataset_cso_observations_page, :args {:after {:type :SparqlCursor}, :first {:type Int}}, :description "Page of results to retrieve.", :resolve #object[graphql_qb.schema$wrap_observations_mapping$fn__5325 0xa120b9 "graphql_qb.schema$wrap_observations_mapping$fn__5325@a120b9"]}, :total_matches {:type Int}, :aggregations {:type :dataset_cso_observations_aggregations}}}, :ref_period {:fields {:uri {:type :uri, :description "URI of the reference period"}, :label {:type String, :description "Label for the reference period"}, :start {:type 
:DateTime, :description "Start time for the period"}, :end {:type :DateTime, :description "End time for the period"}}}, :dim {:fields {:uri {:type :uri, :description "URI of the dimension"}, :values {:type (list :dim_value), :description "Code list of values for the dimension"}, :enum_name {:type String, :description "Name of the corresponding enum value"}}}, :dataset_cso_observations_page {:fields {:next_page {:type :SparqlCursor, :description "Cursor to the next page of results"}, :count {:type Int}, :observations {:type (list :dataset_cso_observations_page_observations), :description "List of observations on this page"}}}, :unmapped_dim_value {:implements [:resource], :fields {:uri {:type :uri, :description "URI of the dimension value"}, :label {:type String, :description "Label for the dimension value"}}}, :measure {:fields {:uri {:type :uri, :description "URI of the measure"}, :label {:type String, :description "Label for the measure"}, :enum_name {:type String, :description "Name of the corresponding enum value"}}}, :enum_dim_value {:implements [:resource], :fields {:uri {:type :uri, :description "URI of the dimension value"}, :label {:type String, :description "Label for the dimension value"}, :enum_name {:type String, :description "Name of the corresponding enum value"}}}, :dataset_cso {:implements [:dataset_meta], :fields {:description {:type String, :description "Dataset description"}, :schema {:type String, :description "Name of the GraphQL query root field corresponding to this dataset"}, :publisher {:type :uri, :description "URI of the publisher of the dataset"}, :observations {:type :dataset_cso_observations, :args {:dimensions {:type :dataset_cso_observations_dimensions}, :order {:type (list :dataset_cso_dimension_measures)}, :order_spec {:type :dataset_cso_observations_order_spec}}, :resolve #object[graphql_qb.schema$argument_mapping_resolver$fn__5311 0x21a9f95b "graphql_qb.schema$argument_mapping_resolver$fn__5311@21a9f95b"]}, :modified {:type 
:DateTime, :description "When the dataset was last modified"}, :dimensions {:type (list :dim), :resolve #object[graphql_qb.schema$get_query_schema_model$fn__5359 0x69069866 "graphql_qb.schema$get_query_schema_model$fn__5359@69069866"], :description "Dimensions within the dataset"}, :title {:type String, :description "Dataset title"}, :licence {:type :uri, :description "URI of the licence the dataset is published under"}, :measures {:type (list :measure), :description "Measure types within the dataset"}, :issued {:type :DateTime, :description "When the dataset was issued"}, :uri {:type :uri, :description "Dataset URI"}}, :description #grafter.rdf.protocols.LangString{:string "CSO", :lang :en}}, :dataset_cso_observations_aggregations {:fields {:max {:type Float, :args {:measure {:type (non-null :dataset_cso_aggregation_measures), :description "The measure to aggregate"}}, :resolve #object[graphql_qb.schema$argument_mapping_resolver$fn__5311 0xac417a2 "graphql_qb.schema$argument_mapping_resolver$fn__5311@ac417a2"]}, :min {:type Float, :args {:measure {:type (non-null :dataset_cso_aggregation_measures), :description "The measure to aggregate"}}, :resolve #object[graphql_qb.schema$argument_mapping_resolver$fn__5311 0x64c95480 "graphql_qb.schema$argument_mapping_resolver$fn__5311@64c95480"]}, :sum {:type Float, :args {:measure {:type (non-null :dataset_cso_aggregation_measures), :description "The measure to aggregate"}}, :resolve #object[graphql_qb.schema$argument_mapping_resolver$fn__5311 0x69499c6f "graphql_qb.schema$argument_mapping_resolver$fn__5311@69499c6f"]}, :average {:type Float, :args {:measure {:type (non-null :dataset_cso_aggregation_measures), :description "The measure to aggregate"}}, :resolve #object[graphql_qb.schema$argument_mapping_resolver$fn__5311 0x3451fc88 "graphql_qb.schema$argument_mapping_resolver$fn__5311@3451fc88"]}}}, :dataset {:implements [:dataset_meta], :fields {:description {:type String, :description "Dataset description"}, :schema {:type 
String, :description "Name of the GraphQL query root field corresponding to this dataset"}, :publisher {:type :uri, :description "URI of the publisher of the dataset"}, :modified {:type :DateTime, :description "When the dataset was last modified"}, :dimensions {:type (list :dim), :resolve #object[graphql_qb.resolvers$dataset_dimensions_resolver$fn__5213 0x3041beb3 "graphql_qb.resolvers$dataset_dimensions_resolver$fn__5213@3041beb3"], :description "Dimensions within the dataset"}, :title {:type String, :description "Dataset title"}, :licence {:type :uri, :description "URI of the licence the dataset is published under"}, :measures {:type (list :measure), :resolve #object[graphql_qb.resolvers$dataset_measures_resolver$fn__5205 0x2e40fdbd "graphql_qb.resolvers$dataset_measures_resolver$fn__5205@2e40fdbd"], :description "Measure types within the dataset"}, :issued {:type :DateTime, :description "When the dataset was issued"}, :uri {:type :uri, :description "Dataset URI"}}}, :dataset_cso_observations_page_observations {:fields {:uri {:type :uri}, :value {:type String}}}}, :interfaces {:dataset_meta {:description "Fields common to generic and specific dataset schemas", :fields {:uri {:type :uri, :description "Dataset URI"}, :title {:type String, :description "Dataset title"}, :description {:type String, :description "Dataset description"}, :schema {:type String, :description "Name of the GraphQL query root field corresponding to this dataset"}, :dimensions {:type (list :dim), :description "Dimensions within the dataset"}, :measures {:type (list :measure), :description "Measure types within the dataset"}}}, :resource {:description "Resource with a URI and optional label", :fields {:uri {:type :uri, :description "URI of the resource"}, :label {:type String, :description "Optional label"}}}}, :enums {:sort_direction {:description "Which direction to sort a dimension or measure in", :values [:ASC :DESC]}}, :unions {:dim_value {:members [:enum_dim_value :unmapped_dim_value]}}, 
:input-objects {:filter {:fields {:or {:type (list :uri), :description "List of URIs for which at least one must be contained within matching datasets."}, :and {:type (list :uri), :description "List of URIs which must all be contained within matching datasets."}}}, :ref_period_filter {:fields {:uri {:type :uri, :description "URI of the reference period"}, :starts_before {:type :DateTime, :description "Latest start time for the reference period"}, :starts_after {:type :DateTime, :description "Earliest start time for the reference period"}, :ends_before {:type :DateTime, :description "Latest end time for the reference period"}, :ends_after {:type :DateTime, :description "Earliest end time for the reference period"}}}, :page_selector {:fields {:first {:type Int, :description "Number of results to retrive."}, :after {:type :SparqlCursor, :description "Cursor to the start of the results page"}}}, :dataset_cso_observations_dimensions {:fields {}}, :dataset_cso_observations_order_spec {:fields {:value {:type :sort_direction}}}}, :queries {:datasets {:type (list :dataset), :resolve #object[graphql_qb.resolvers$resolve_datasets 0x19647566 "graphql_qb.resolvers$resolve_datasets@19647566"], :args {:dimensions {:type :filter}, :uri {:type :uri}}}, :dataset_cso {:type :dataset_cso, :resolve #object[graphql_qb.resolvers$wrap_post_resolver$fn__5140 0x527d48db "graphql_qb.resolvers$wrap_post_resolver$fn__5140@527d48db"]}}, :scalars {:SparqlCursor {:parse #object[clojure.spec.alpha$spec_impl$reify__1987 0x2335aef2 "clojure.spec.alpha$spec_impl$reify__1987@2335aef2"], :serialize #object[clojure.spec.alpha$spec_impl$reify__1987 0x17003497 "clojure.spec.alpha$spec_impl$reify__1987@17003497"]}, :uri {:parse #object[clojure.spec.alpha$spec_impl$reify__1987 0x2f038d3c "clojure.spec.alpha$spec_impl$reify__1987@2f038d3c"], :serialize #object[clojure.spec.alpha$spec_impl$reify__1987 0x376498da "clojure.spec.alpha$spec_impl$reify__1987@376498da"]}, :DateTime {:parse 
#object[clojure.spec.alpha$spec_impl$reify__1987 0x39a8e2fa "clojure.spec.alpha$spec_impl$reify__1987@39a8e2fa"], :serialize #object[clojure.spec.alpha$spec_impl$reify__1987 0x2f9addd4 "clojure.spec.alpha$spec_impl$reify__1987@2f9addd4"]}}}), :clojure.spec.alpha/args ({:objects {:ref_area {:implements [:resource], :fields {:uri {:type :uri, :description "URI of the reference area"}, :label {:type String, :description "Label for the reference area"}}}, :dataset_cso_observations {:fields {:sparql {:type String, :description "SPARQL query used to retrieve matching observations.", :resolve #object[graphql_qb.resolvers$resolve_observations_sparql_query 0x6f1d799 "graphql_qb.resolvers$resolve_observations_sparql_query@6f1d799"]}, :page {:type :dataset_cso_observations_page, :args {:after {:type :SparqlCursor}, :first {:type Int}}, :description "Page of results to retrieve.", :resolve #object[graphql_qb.schema$wrap_observations_mapping$fn__5325 0xa120b9 "graphql_qb.schema$wrap_observations_mapping$fn__5325@a120b9"]}, :total_matches {:type Int}, :aggregations {:type :dataset_cso_observations_aggregations}}}, :ref_period {:fields {:uri {:type :uri, :description "URI of the reference period"}, :label {:type String, :description "Label for the reference period"}, :start {:type :DateTime, :description "Start time for the period"}, :end {:type :DateTime, :description "End time for the period"}}}, :dim {:fields {:uri {:type :uri, :description "URI of the dimension"}, :values {:type (list :dim_value), :description "Code list of values for the dimension"}, :enum_name {:type String, :description "Name of the corresponding enum value"}}}, :dataset_cso_observations_page {:fields {:next_page {:type :SparqlCursor, :description "Cursor to the next page of results"}, :count {:type Int}, :observations {:type (list :dataset_cso_observations_page_observations), :description "List of observations on this page"}}}, :unmapped_dim_value {:implements [:resource], :fields {:uri {:type :uri, 
:description "URI of the dimension value"}, :label {:type String, :description "Label for the dimension value"}}}, :measure {:fields {:uri {:type :uri, :description "URI of the measure"}, :label {:type String, :description "Label for the measure"}, :enum_name {:type String, :description "Name of the corresponding enum value"}}}, :enum_dim_value {:implements [:resource], :fields {:uri {:type :uri, :description "URI of the dimension value"}, :label {:type String, :description "Label for the dimension value"}, :enum_name {:type String, :description "Name of the corresponding enum value"}}}, :dataset_cso {:implements [:dataset_meta], :fields {:description {:type String, :description "Dataset description"}, :schema {:type String, :description "Name of the GraphQL query root field corresponding to this dataset"}, :publisher {:type :uri, :description "URI of the publisher of the dataset"}, :observations {:type :dataset_cso_observations, :args {:dimensions {:type :dataset_cso_observations_dimensions}, :order {:type (list :dataset_cso_dimension_measures)}, :order_spec {:type :dataset_cso_observations_order_spec}}, :resolve #object[graphql_qb.schema$argument_mapping_resolver$fn__5311 0x21a9f95b "graphql_qb.schema$argument_mapping_resolver$fn__5311@21a9f95b"]}, :modified {:type :DateTime, :description "When the dataset was last modified"}, :dimensions {:type (list :dim), :resolve #object[graphql_qb.schema$get_query_schema_model$fn__5359 0x69069866 "graphql_qb.schema$get_query_schema_model$fn__5359@69069866"], :description "Dimensions within the dataset"}, :title {:type String, :description "Dataset title"}, :licence {:type :uri, :description "URI of the licence the dataset is published under"}, :measures {:type (list :measure), :description "Measure types within the dataset"}, :issued {:type :DateTime, :description "When the dataset was issued"}, :uri {:type :uri, :description "Dataset URI"}}, :description #grafter.rdf.protocols.LangString{:string "CSO", :lang :en}}, 
:dataset_cso_observations_aggregations {:fields {:max {:type Float, :args {:measure {:type (non-null :dataset_cso_aggregation_measures), :description "The measure to aggregate"}}, :resolve #object[graphql_qb.schema$argument_mapping_resolver$fn__5311 0xac417a2 "graphql_qb.schema$argument_mapping_resolver$fn__5311@ac417a2"]}, :min {:type Float, :args {:measure {:type (non-null :dataset_cso_aggregation_measures), :description "The measure to aggregate"}}, :resolve #object[graphql_qb.schema$argument_mapping_resolver$fn__5311 0x64c95480 "graphql_qb.schema$argument_mapping_resolver$fn__5311@64c95480"]}, :sum {:type Float, :args {:measure {:type (non-null :dataset_cso_aggregation_measures), :description "The measure to aggregate"}}, :resolve #object[graphql_qb.schema$argument_mapping_resolver$fn__5311 0x69499c6f "graphql_qb.schema$argument_mapping_resolver$fn__5311@69499c6f"]}, :average {:type Float, :args {:measure {:type (non-null :dataset_cso_aggregation_measures), :description "The measure to aggregate"}}, :resolve #object[graphql_qb.schema$argument_mapping_resolver$fn__5311 0x3451fc88 "graphql_qb.schema$argument_mapping_resolver$fn__5311@3451fc88"]}}}, :dataset {:implements [:dataset_meta], :fields {:description {:type String, :description "Dataset description"}, :schema {:type String, :description "Name of the GraphQL query root field corresponding to this dataset"}, :publisher {:type :uri, :description "URI of the publisher of the dataset"}, :modified {:type :DateTime, :description "When the dataset was last modified"}, :dimensions {:type (list :dim), :resolve #object[graphql_qb.resolvers$dataset_dimensions_resolver$fn__5213 0x3041beb3 "graphql_qb.resolvers$dataset_dimensions_resolver$fn__5213@3041beb3"], :description "Dimensions within the dataset"}, :title {:type String, :description "Dataset title"}, :licence {:type :uri, :description "URI of the licence the dataset is published under"}, :measures {:type (list :measure), :resolve 
#object[graphql_qb.resolvers$dataset_measures_resolver$fn__5205 0x2e40fdbd "graphql_qb.resolvers$dataset_measures_resolver$fn__5205@2e40fdbd"], :description "Measure types within the dataset"}, :issued {:type :DateTime, :description "When the dataset was issued"}, :uri {:type :uri, :description "Dataset URI"}}}, :dataset_cso_observations_page_observations {:fields {:uri {:type :uri}, :value {:type String}}}}, :interfaces {:dataset_meta {:description "Fields common to generic and specific dataset schemas", :fields {:uri {:type :uri, :description "Dataset URI"}, :title {:type String, :description "Dataset title"}, :description {:type String, :description "Dataset description"}, :schema {:type String, :description "Name of the GraphQL query root field corresponding to this dataset"}, :dimensions {:type (list :dim), :description "Dimensions within the dataset"}, :measures {:type (list :measure), :description "Measure types within the dataset"}}}, :resource {:description "Resource with a URI and optional label", :fields {:uri {:type :uri, :description "URI of the resource"}, :label {:type String, :description "Optional label"}}}}, :enums {:sort_direction {:description "Which direction to sort a dimension or measure in", :values [:ASC :DESC]}}, :unions {:dim_value {:members [:enum_dim_value :unmapped_dim_value]}}, :input-objects {:filter {:fields {:or {:type (list :uri), :description "List of URIs for which at least one must be contained within matching datasets."}, :and {:type (list :uri), :description "List of URIs which must all be contained within matching datasets."}}}, :ref_period_filter {:fields {:uri {:type :uri, :description "URI of the reference period"}, :starts_before {:type :DateTime, :description "Latest start time for the reference period"}, :starts_after {:type :DateTime, :description "Earliest start time for the reference period"}, :ends_before {:type :DateTime, :description "Latest end time for the reference period"}, :ends_after {:type :DateTime, 
:description "Earliest end time for the reference period"}}}, :page_selector {:fields {:first {:type Int, :description "Number of results to retrive."}, :after {:type :SparqlCursor, :description "Cursor to the start of the results page"}}}, :dataset_cso_observations_dimensions {:fields {}}, :dataset_cso_observations_order_spec {:fields {:value {:type :sort_direction}}}}, :queries {:datasets {:type (list :dataset), :resolve #object[graphql_qb.resolvers$resolve_datasets 0x19647566 "graphql_qb.resolvers$resolve_datasets@19647566"], :args {:dimensions {:type :filter}, :uri {:type :uri}}}, :dataset_cso {:type :dataset_cso, :resolve #object[graphql_qb.resolvers$wrap_post_resolver$fn__5140 0x527d48db "graphql_qb.resolvers$wrap_post_resolver$fn__5140@527d48db"]}}, :scalars {:SparqlCursor {:parse #object[clojure.spec.alpha$spec_impl$reify__1987 0x2335aef2 "clojure.spec.alpha$spec_impl$reify__1987@2335aef2"], :serialize #object[clojure.spec.alpha$spec_impl$reify__1987 0x17003497 "clojure.spec.alpha$spec_impl$reify__1987@17003497"]}, :uri {:parse #object[clojure.spec.alpha$spec_impl$reify__1987 0x2f038d3c "clojure.spec.alpha$spec_impl$reify__1987@2f038d3c"], :serialize #object[clojure.spec.alpha$spec_impl$reify__1987 0x376498da "clojure.spec.alpha$spec_impl$reify__1987@376498da"]}, :DateTime {:parse #object[clojure.spec.alpha$spec_impl$reify__1987 0x39a8e2fa "clojure.spec.alpha$spec_impl$reify__1987@39a8e2fa"], :serialize #object[clojure.spec.alpha$spec_impl$reify__1987 0x2f9addd4 "clojure.spec.alpha$spec_impl$reify__1987@2f9addd4"]}}}), :clojure.spec.alpha/failure :instrument, :clojure.spec.test.alpha/caller {:file "core.clj", :line 157, :var-scope graphql-qb.core/build-schema-context}}
#grafter.rdf.protocols.LangString{:string "CSO", :lang :en}
@mohadelrezk this line says that there is an en language tag on the label "CSO".
I checked the data, but the "CSO" label does not have a language tag. Are you using just the file you sent as input, or something more?
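One rough way to confirm whether a given Turtle file still contains language-tagged literals is to scan its text for the "..."@lang pattern. The following Python sketch is only a heuristic, not a Turtle parser (it ignores triple-quoted literals and comments), and the sample triples and the ex: prefix are hypothetical. It also only inspects a file, so it says nothing about other data that may already be loaded in the endpoint.

```python
import re

# A double-quoted literal (with backslash escapes) followed by @ and a language tag
LANG_LITERAL = re.compile(r'"((?:[^"\\]|\\.)*)"@([A-Za-z]+(?:-[A-Za-z0-9]+)*)')


def find_lang_tagged(turtle_text):
    """Return (text, language) pairs for every language-tagged literal found."""
    return [(m.group(1), m.group(2)) for m in LANG_LITERAL.finditer(turtle_text)]


# Hypothetical sample data, for illustration only
sample = '''
ex:cso_ds rdfs:label "CSO"@en ;
    dcterms:title "Census" .
'''

tagged = find_lang_tagged(sample)
print(tagged)
```

An empty result for the input file would suggest that the tagged "CSO" label comes from somewhere else.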
The qb:codeList of the cube dimensions should be a skos:ConceptScheme that includes all the URIs that are used as values of the dimension. In the file you sent I see you use: qb:codeList "<http://purl.org/linked-data/sdmx/2009/subject#>", which is a string literal rather than the URI of a skos:ConceptScheme.
Additionally, it is preferable to use URIs instead of strings for the values of the dimensions, e.g. use http://reference.data.gov.uk/id/year/2016 instead of "2016"^^xsd:string
ogi:observations_009097c4-45ee-40d2-b405-b82de3963ab7 a qb:Observation ;
    qb:dataSet ogi:cso_ds ;
    qb:measureType ogi:Value ;
    ogi:CensusYear "2016"^^xsd:string ;
    ogi:Nationality "Not stated, including no nationality"^^xsd:string ;
    ogi:Sex "Male"^^xsd:string ;
    ogi:SingleYearofAge "63 years"^^xsd:string ;
    ogi:Statistic "Population Usually Resident and Present in the State 2011 to 2016 (Number)"^^xsd:string ;
    ogi:Value "268"^^xsd:string .
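For the year dimension, the conversion from a string value to a URI could look like the Python sketch below. It assumes the reference.data.gov.uk year pattern mentioned above; the function name is hypothetical, and the other dimensions (sex, nationality, age) would each need a mapping to their own code list URIs, which is not shown here.

```python
import re

# URI pattern taken from the example above
YEAR_URI_BASE = "http://reference.data.gov.uk/id/year/"


def year_literal_to_uri(value):
    """Map a year string such as "2016" to its reference.data.gov.uk URI."""
    year = value.strip()
    # Reject anything that is not exactly four digits, e.g. "63 years"
    if not re.fullmatch(r"\d{4}", year):
        raise ValueError(f"not a four-digit year: {value!r}")
    return YEAR_URI_BASE + year


uri = year_literal_to_uri("2016")
print(uri)
```

In the observation above, the resulting URI would replace the "2016"^^xsd:string value of ogi:CensusYear.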