Cube.aggregated_by() fails when derived coordinates are present

Open schlunma opened this issue 5 years ago • 4 comments

Hi, I've encountered the following bug(?) while working with files that contain derived coordinates:

import iris
import numpy as np
path = iris.sample_data_path('hybrid_height.nc')
cube = iris.load(path)[0]
aux_coord = iris.coords.AuxCoord(np.arange(cube.shape[1]) % 7,
                                 long_name='random aux coord')
cube.add_aux_coord(aux_coord, 1)
cube.aggregated_by('random aux coord', iris.analysis.MEAN)  # fails

fails with

Traceback (most recent call last):
  File "bug_derived_coord.py", line 9, in <module>
    cube.aggregated_by('random aux coord', iris.analysis.MEAN)
  File "miniconda3/envs/test/lib/python3.7/site-packages/iris/cube.py", line 3506, in aggregated_by
    self.coord_dims(coord))
  File "miniconda3/envs/test/lib/python3.7/site-packages/iris/cube.py", line 965, in add_aux_coord
    raise ValueError('Duplicate coordinates are not permitted.')
ValueError: Duplicate coordinates are not permitted.

If the derived coordinate and the coordinate used for aggregation do not share a common dimension, it works:

import iris
import numpy as np
path = iris.sample_data_path('hybrid_height.nc')
cube = iris.load(path)[0]
cube = iris.util.new_axis(cube)
cube.add_aux_coord(iris.coords.AuxCoord(0, long_name='random aux coord'), 0)
cube = cube.aggregated_by('random aux coord', iris.analysis.MEAN)  # does not fail

If I remove the derived coordinate before the aggregation, it doesn't fail either:

import iris
import numpy as np
path = iris.sample_data_path('hybrid_height.nc')
cube = iris.load(path)[0]
aux_coord = iris.coords.AuxCoord(np.arange(cube.shape[1]) % 7,
                                 long_name='random aux coord')
cube.add_aux_coord(aux_coord, 1)
for aux_factory in cube.aux_factories:
    cube.remove_aux_factory(aux_factory)
cube = cube.aggregated_by('random aux coord', iris.analysis.MEAN)  # does not fail
print(cube)

correctly prints

air_potential_temperature / (K)     (model_level_number: 15; -- : 7; grid_longitude: 100)
     Dimension coordinates:
          model_level_number                           x        -                  -
          grid_longitude                               -        -                  x
     Auxiliary coordinates:
          atmosphere_hybrid_height_coordinate          x        -                  -
          sigma                                        x        -                  -
          grid_latitude                                -        x                  -
          random aux coord                             -        x                  -
          surface_altitude                             -        x                  x
     Scalar coordinates:
          forecast_period: 0.0 hours
          forecast_reference_time: 2009-09-09 17:10:00
          time: 2009-09-09 17:10:00
     Attributes:
          Conventions: CF-1.5
          STASH: m01s00i004
          source: Data from Met Office Unified Model 7.04
     Cell methods:
          mean: unknown

Thus, regular multidimensional coordinates that share dimensions with the aggregated coordinate (in this case surface_altitude) are also not a problem.

If this is not a bug, a more precise error message would be nice. Thanks for your help!

schlunma avatar Jan 14 '20 19:01 schlunma

Hi, thanks for highlighting this.

I believe this is a bug. From what I can tell, this is ultimately caused by iris failing to remove the derived coordinate at this line: https://github.com/SciTools/iris/blob/e54e2ffce114d4f6575dbcf9c21557b3aeff61c7/lib/iris/cube.py#L4135-L4136 When the coordinate is added again later, this causes the error you are seeing.

stephenworsley avatar Jan 15 '20 15:01 stephenworsley
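
The mechanism described above can be sketched with a toy model. This is not the iris implementation: `ToyCube` and `toy_aggregated_by` are hypothetical names, and the only assumption carried over from the thread is that removal silently skips derived coordinates while the duplicate check on re-adding still sees them.

```python
class ToyCube:
    """Hypothetical stand-in for a cube; NOT the iris implementation."""

    def __init__(self):
        self.aux_coords = []      # names of concrete auxiliary coordinates
        self.derived_coords = []  # names of coordinates built by factories

    def coords(self):
        # Returns a new list covering both concrete and derived coordinates.
        return self.aux_coords + self.derived_coords

    def remove_coord(self, name):
        # Assumption: only concrete coordinates are removed;
        # derived ones are silently left in place.
        if name in self.aux_coords:
            self.aux_coords.remove(name)

    def add_aux_coord(self, name):
        # The duplicate check considers derived coordinates as well.
        if name in self.coords():
            raise ValueError('Duplicate coordinates are not permitted.')
        self.aux_coords.append(name)


def toy_aggregated_by(cube):
    # Mimics a remove-then-re-add pattern for every coordinate.
    for name in cube.coords():
        cube.remove_coord(name)
        cube.add_aux_coord(name)  # stand-in for adding the aggregated coordinate


cube = ToyCube()
cube.aux_coords = ['random aux coord']
toy_aggregated_by(cube)  # works: the coordinate is removed, then re-added

cube.derived_coords = ['altitude']
try:
    toy_aggregated_by(cube)
except ValueError as err:
    print(err)
```

In this sketch the second call raises because 'altitude' was never actually removed before being added again, which matches the shape of the traceback in the report.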

I also noticed that Cube.remove_coord() does not work for derived coordinates, e.g.

cube.remove_coord('altitude')
cube.remove_coord('altitude')
print(cube)

with the cube above does not fail and still prints the derived coordinate. This is probably related.

schlunma avatar Jan 15 '20 15:01 schlunma

@schlunma I can confirm that allowing Cube.remove_coord() to remove derived coords does in fact fix the problem, though it's worth noting that the derived coordinates will then become regular coordinates in the process.

stephenworsley avatar Jan 16 '20 11:01 stephenworsley

Although we are accepting #3641, which avoids the error here, I'm not convinced that its new behaviour is actually a correct resolution of this problem. (see this comment on the #3641 fix)

I suspect that aggregated_by should be handling derived coordinates more carefully. At present (since #3641), it produces a concrete, aggregated version of the original derived coordinate. I suspect it would be better if the result were either (a) correctly derived from the aggregated dependencies, or (b) discarded.

But it's not immediately clear (to me!) what is possible or desirable here.

pp-mo avatar Jun 15 '20 10:06 pp-mo
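
To make option (a) concrete, here is a minimal pure-Python sketch. It assumes the hybrid height formula altitude = delta + sigma * orography, and it uses a plain group mean as a stand-in for whatever aggregated_by does to coordinate points. The helper names (`group_mean`, `hybrid_height`) and all values are made up for illustration; none of this is iris code.

```python
from collections import defaultdict


def group_mean(values, keys):
    """Mean of `values` within each group given by `keys` (illustrative helper)."""
    groups = defaultdict(list)
    for value, key in zip(values, keys):
        groups[key].append(value)
    return {key: sum(vals) / len(vals) for key, vals in groups.items()}


def hybrid_height(delta, sigma, orography):
    # Hybrid height formula: altitude = delta + sigma * orography
    return delta + sigma * orography


# Made-up values for a single model level.
delta, sigma = 10.0, 0.5
orography = [100.0, 200.0, 300.0, 400.0]
keys = [0, 1, 0, 1]  # grouping coordinate, like 'random aux coord'

# Option (a): aggregate the dependency, then re-derive the altitude.
orography_mean = group_mean(orography, keys)
altitude_rederived = {k: hybrid_height(delta, sigma, v)
                      for k, v in orography_mean.items()}

# "Concrete" approach: derive the altitude first, then aggregate its points.
altitude_points = [hybrid_height(delta, sigma, o) for o in orography]
altitude_concrete = group_mean(altitude_points, keys)

print(altitude_rederived)
print(altitude_concrete)
```

The two results agree here only because the formula is linear in the orography and the reduction is a mean; for other reductions they need not coincide, which is one reason the choice between re-deriving and keeping a concrete copy matters.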

Closed by #4947

bjlittle avatar Nov 09 '22 07:11 bjlittle