earthkit-data
Implement tensor object
A tensor object would represent multidimensional labelled data and provide coordinate-based data access and slicing.
The current scope is rather limited: the aim is only to replicate some xarray functionality.
Proposed features:
- convert a fieldlist into a tensor using `to_tensor()`
  - the users have to specify the metadata keys to form the tensor on
  - these will define the `coords`
  - at the moment the `coords` are extended with the following additional coords:
    - `latitude`, `longitude` for regular grids in lat and lon
    - `x`, `y` for other regular grids
    - `values` for irregular grids
  - the tensor can only be formed if all the fields have the same grid and for each metadata combination there is exactly one field in the fieldlist. No holes allowed
- no concept of a variable or dimension as in xarray
- slicing methods: bracket, `sel()`, `isel()`
- lat-lon access: `latitudes`, `longitudes()`
- no computation methods
- creating a new object with `copy(data=my_data)` (see the notebook example)
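
The "exactly one field per metadata combination, no holes" rule can be sketched in plain Python. This is only an illustration under stated assumptions, not the earthkit-data implementation: `check_complete` and the list-of-dicts `records` input are hypothetical stand-ins for the metadata of a fieldlist.

```python
from itertools import product


def check_complete(records, keys):
    """Return the coords for `keys`, or raise if the hypercube has holes or duplicates.

    `records` is a hypothetical list of metadata dicts, one per field.
    """
    # Distinct values per key, in first-seen order
    coords = {k: list(dict.fromkeys(r[k] for r in records)) for k in keys}

    # Count how many fields fall on each metadata combination
    counts = {}
    for r in records:
        combo = tuple(r[k] for k in keys)
        counts[combo] = counts.get(combo, 0) + 1

    missing = [c for c in product(*coords.values()) if c not in counts]
    duplicated = [c for c, n in counts.items() if n > 1]
    if missing or duplicated:
        raise ValueError(f"missing={missing}, duplicated={duplicated}")
    return coords


# 3 params on 6 pressure levels, one field for each combination
records = [
    {"param": p, "level": lev}
    for p in ("t", "u", "v")
    for lev in (300, 400, 500, 700, 850, 1000)
]
coords = check_complete(records, ("param", "level"))
print(coords)
```

Dropping any single record from `records` leaves a hole, and the check then raises `ValueError`.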
Questions:
- allow attaching attributes?
- how to use it in an easy way in computations? E.g. computing the average along a given dimension
- in a fieldlist the equivalent of `coords` is called `indices`
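
For the computation question, one possible approach is to resolve the dimension name to an axis and let NumPy do the work. The sketch below assumes that the array returned by `to_numpy()` has its axes in the same order as the keys of `coords`. `mean_along` is a hypothetical helper, not part of earthkit-data, and the arrays here are synthetic.

```python
import numpy as np


def mean_along(values, coords, dim):
    """Average `values` along the axis named `dim` (hypothetical helper).

    Assumes the axes of `values` follow the key order of `coords`.
    """
    axis = list(coords).index(dim)
    reduced = values.mean(axis=axis)
    # The averaged dimension no longer exists in the result
    new_coords = {k: v for k, v in coords.items() if k != dim}
    return reduced, new_coords


# Synthetic stand-ins for t.coords and t.to_numpy()
coords = {
    "param": ["t", "u", "v"],
    "level": [300, 400, 500, 700, 850, 1000],
    "latitude": list(range(90, -91, -30)),
    "longitude": list(range(0, 360, 30)),
}
values = np.arange(3 * 6 * 7 * 12, dtype=float).reshape(3, 6, 7, 12)

avg, avg_coords = mean_along(values, coords, "level")
print(avg.shape, list(avg_coords))
```

Whether such a helper belongs on the tensor object itself, or whether users should work on the NumPy array directly, is part of the open question.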
# 3 params on 6 pressure levels
>>> ds = from_source("file", "tuv_pl.grib")
>>> t = ds.to_tensor("param", "level")
>>> t.coords.keys()
dict_keys(['param', 'level', 'latitude', 'longitude'])
>>> t.coords
Coordinates:
param [str] t, u, v
level [int] 300, 400, 500, 700, 850, 1000
latitude [float64] 90.0, 60.0, 30.0, 0.0, -30.0, -60.0, -90.0
longitude [float64] 0.0, 30.0, 60.0, 90.0, 120.0, 150.0, 180.0, 210.0, 240.0 ,..., 330.0
>>> t.shape
(3, 6, 7, 12)
>>> t.to_numpy().shape
(3, 6, 7, 12)
# slicing
>>> r = t[1:3,0]
>>> r.shape
(2, 1, 7, 12)
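
In the example above `t[1:3,0]` has shape `(2, 1, 7, 12)`, so an integer index appears to keep its dimension with length 1 rather than dropping it as NumPy would. Below is a minimal sketch of that shape rule, assuming this reading of the output is correct. `sliced_shape` is a hypothetical helper, not part of earthkit-data.

```python
def sliced_shape(shape, key):
    """Shape after bracket indexing when integer indices keep their dimension."""
    if not isinstance(key, tuple):
        key = (key,)
    # Pad missing trailing indices with full slices
    key = key + (slice(None),) * (len(shape) - len(key))

    result = []
    for size, index in zip(shape, key):
        if isinstance(index, int):
            if not -size <= index < size:
                raise IndexError(f"index {index} out of range for size {size}")
            # Treat an integer as a length-1 slice so the dimension is kept
            start = index % size
            index = slice(start, start + 1)
        result.append(len(range(*index.indices(size))))
    return tuple(result)


print(sliced_shape((3, 6, 7, 12), (slice(1, 3), 0)))  # (2, 1, 7, 12)
```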
For more details see the example: https://earthkit-data.readthedocs.io/en/feature-tensor/examples/grib_cube.html