xarray-schema
Schema validation for Xarray objects
installation
Install xarray-schema from PyPI:
pip install xarray-schema
Conda:
conda install -c conda-forge xarray-schema
Or install it from source:
pip install git+https://github.com/xarray-contrib/xarray-schema
usage
Xarray-schema's API is modeled after Pandera. The DataArraySchema and DatasetSchema objects both have .validate() methods.
The basic usage is as follows:
import numpy as np
import xarray as xr
from xarray_schema import DataArraySchema, DatasetSchema, CoordsSchema
da = xr.DataArray(np.ones(4, dtype='i4'), dims=['x'], name='foo')
schema = DataArraySchema(dtype=np.integer, name='foo', shape=(4, ), dims=['x'])
schema.validate(da)
You can also use it to validate a Dataset like so:
schema_ds = DatasetSchema({'foo': schema})
schema_ds.validate(da.to_dataset())
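The schema above asks for dtype=np.integer while the array is stored as 'i4'. The sketch below uses only NumPy to show why an abstract dtype can accept a concrete one. It assumes the dtype check follows NumPy's dtype hierarchy, which is an assumption about the library's internals rather than something stated here.

```python
import numpy as np

# The concrete dtype of the example array.
concrete = np.dtype('i4')

# np.issubdtype asks whether the first dtype sits at or below the second
# in NumPy's type hierarchy.
is_integer = np.issubdtype(concrete, np.integer)    # int32 is a kind of integer
is_floating = np.issubdtype(concrete, np.floating)  # int32 is not a float
is_same = np.issubdtype(concrete, np.dtype('i4'))   # a dtype matches itself

print(is_integer, is_floating, is_same)
```

Under that assumption, a schema declared with np.integer would accept 'i4' and 'i8' arrays alike, while a schema declared with 'i4' would accept only 32-bit integers.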
Each component of the Xarray data model is implemented as a standalone class:
from xarray_schema.components import (
    DTypeSchema,
    DimsSchema,
    ShapeSchema,
    NameSchema,
    ChunksSchema,
    ArrayTypeSchema,
    AttrSchema,
    AttrsSchema
)
# example constructions
dtype_schema = DTypeSchema('i4')
dims_schema = DimsSchema(('x', 'y', None)) # None is used as a wildcard
shape_schema = ShapeSchema((5, 10, None)) # None is used as a wildcard
name_schema = NameSchema('foo')
chunk_schema = ChunksSchema({'x': None, 'y': -1}) # None is used as a wildcard, -1 is used as
array_type_schema = ArrayTypeSchema(np.ndarray)  # lowercase name avoids shadowing the ArrayTypeSchema class
# Example usage
dtype_schema.validate(da.dtype)
# Each object schema can be exported to JSON format
dtype_json = dtype_schema.to_json()
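The comments above say that None acts as a wildcard in dims and shape schemas. The following is a minimal pure-Python sketch of that idea, not the library's implementation. The helper name matches_with_wildcards is hypothetical, and treating a length mismatch as a failure is an assumption.

```python
from typing import Optional, Sequence


def matches_with_wildcards(
    expected: Sequence[Optional[object]], actual: Sequence[object]
) -> bool:
    """Return True when `actual` matches `expected`, with None matching anything."""
    # Assumption: a different number of entries is always a mismatch.
    if len(expected) != len(actual):
        return False
    # Each position must be equal unless the expected entry is the wildcard.
    return all(e is None or e == a for e, a in zip(expected, actual))


# Shape check: the last axis may have any length.
print(matches_with_wildcards((5, 10, None), (5, 10, 3)))
# Shape check: the second axis differs, so the match fails.
print(matches_with_wildcards((5, 10, None), (5, 11, 3)))
# Dims check: the third dimension name is unconstrained.
print(matches_with_wildcards(('x', 'y', None), ('x', 'y', 'time')))
```

The same positional comparison covers both dimension names and axis lengths, since a wildcard only relaxes the entry it replaces.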
roadmap
This is a very early prototype of a library. Some key things are missing:
- Exceptions: Pandera accumulates schema exceptions and reports them all at once. Currently, we eagerly raise SchemaErrors as soon as they are found.
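To illustrate the difference between raising eagerly and accumulating errors, here is a small self-contained sketch. It does not use xarray-schema: the SchemaError class, the check functions, and validate_eager / validate_lazy are all hypothetical stand-ins defined in the block, and plain dictionaries stand in for Xarray objects.

```python
from typing import Callable, List


class SchemaError(Exception):
    """Stand-in error type defined here for illustration only."""


def check_name(obj: dict) -> None:
    if obj.get("name") != "foo":
        raise SchemaError(f"name mismatch: got {obj.get('name')!r}, expected 'foo'")


def check_dims(obj: dict) -> None:
    if tuple(obj.get("dims", ())) != ("x",):
        raise SchemaError(f"dims mismatch: got {obj.get('dims')!r}, expected ('x',)")


Check = Callable[[dict], None]


def validate_eager(obj: dict, checks: List[Check]) -> None:
    """Stop at the first failing check (the behavior described above)."""
    for check in checks:
        check(obj)


def validate_lazy(obj: dict, checks: List[Check]) -> List[SchemaError]:
    """Run every check and collect all failures instead of stopping early."""
    errors: List[SchemaError] = []
    for check in checks:
        try:
            check(obj)
        except SchemaError as err:
            errors.append(err)
    return errors


checks = [check_name, check_dims]
bad = {"name": "bar", "dims": ("y",)}

try:
    validate_eager(bad, checks)
except SchemaError as err:
    print("eager reports one problem:", err)

problems = validate_lazy(bad, checks)
print(f"lazy reports {len(problems)} problems")
```

With eager raising, the dims problem stays hidden until the name problem is fixed. The accumulating variant surfaces both in a single pass, at the cost of running checks that may depend on earlier ones having succeeded.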
license
All the code in this repository is MIT licensed.
history
This project was originally developed at CarbonPlan. It was transferred to the xarray-contrib organization in August 2022.