iris
Cube.aggregated_by() fails when derived coordinates are present
Hi, I've encountered the following bug(?) while working with files that contain derived coordinates:
import iris
import numpy as np
path = iris.sample_data_path('hybrid_height.nc')
cube = iris.load(path)[0]
aux_coord = iris.coords.AuxCoord(np.arange(cube.shape[1]) % 7,
                                 long_name='random aux coord')
cube.add_aux_coord(aux_coord, 1)
cube.aggregated_by('random aux coord', iris.analysis.MEAN)  # fails
fails with
Traceback (most recent call last):
  File "bug_derived_coord.py", line 9, in <module>
    cube.aggregated_by('random aux coord', iris.analysis.MEAN)
  File "miniconda3/envs/test/lib/python3.7/site-packages/iris/cube.py", line 3506, in aggregated_by
    self.coord_dims(coord))
  File "miniconda3/envs/test/lib/python3.7/site-packages/iris/cube.py", line 965, in add_aux_coord
    raise ValueError('Duplicate coordinates are not permitted.')
ValueError: Duplicate coordinates are not permitted.
If the derived coordinate and the coordinate used for aggregation do not share a common dimension, it works:
import iris
import numpy as np
path = iris.sample_data_path('hybrid_height.nc')
cube = iris.load(path)[0]
cube = iris.util.new_axis(cube)
cube.add_aux_coord(iris.coords.AuxCoord(0, long_name='random aux coord'), 0)
cube = cube.aggregated_by('random aux coord', iris.analysis.MEAN)  # does not fail
If I remove the derived coordinate before the aggregation, it doesn't fail either:
import iris
import numpy as np
path = iris.sample_data_path('hybrid_height.nc')
cube = iris.load(path)[0]
aux_coord = iris.coords.AuxCoord(np.arange(cube.shape[1]) % 7,
                                 long_name='random aux coord')
cube.add_aux_coord(aux_coord, 1)
for aux_factory in cube.aux_factories:
    cube.remove_aux_factory(aux_factory)
cube = cube.aggregated_by('random aux coord', iris.analysis.MEAN)  # does not fail
print(cube)
correctly prints
air_potential_temperature / (K)     (model_level_number: 15; -- : 7; grid_longitude: 100)
     Dimension coordinates:
          model_level_number                           x        -                  -
          grid_longitude                               -        -                  x
     Auxiliary coordinates:
          atmosphere_hybrid_height_coordinate          x        -                  -
          sigma                                        x        -                  -
          grid_latitude                                -        x                  -
          random aux coord                             -        x                  -
          surface_altitude                             -        x                  x
     Scalar coordinates:
          forecast_period: 0.0 hours
          forecast_reference_time: 2009-09-09 17:10:00
          time: 2009-09-09 17:10:00
     Attributes:
          Conventions: CF-1.5
          STASH: m01s00i004
          source: Data from Met Office Unified Model 7.04
     Cell methods:
          mean: unknown
Thus, regular multidimensional coordinates that share dimensions with the aggregated coordinate (in this case surface_altitude) are also not a problem.
If this is not a bug, a more precise error message would be nice. Thanks for your help!
Hi, thanks for highlighting this.
I believe this is a bug. From what I can tell, it is ultimately caused by iris failing to remove the derived coordinate here: https://github.com/SciTools/iris/blob/e54e2ffce114d4f6575dbcf9c21557b3aeff61c7/lib/iris/cube.py#L4135-L4136. When the coordinate is added again later, this causes the error you are seeing.
I also noticed that Cube.remove_coord() does not work for derived coordinates, e.g.
cube.remove_coord('altitude')
cube.remove_coord('altitude')
print(cube)
with the cube above does not fail and still prints the derived coordinate. This is probably related.
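To make the suspected mechanism concrete, here is a toy model in plain Python. It is not the iris implementation: `ToyCube` and `toy_aggregated_by` are hypothetical names invented for this sketch, and the only behaviour borrowed from the thread is that removing a derived coordinate silently does nothing while adding a duplicate raises.

```python
class ToyCube:
    """Toy stand-in for a cube. Hypothetical; NOT how iris is implemented."""

    def __init__(self, stored, derived):
        self.stored = list(stored)    # coordinates held directly
        self.derived = list(derived)  # coordinates computed on demand

    def coord_names(self):
        return self.stored + self.derived

    def remove_coord(self, name):
        # Only stored coordinates are removed. A derived coordinate is
        # silently left in place, as observed in the thread.
        if name in self.stored:
            self.stored.remove(name)

    def add_aux_coord(self, name):
        if name in self.coord_names():
            raise ValueError('Duplicate coordinates are not permitted.')
        self.stored.append(name)


def toy_aggregated_by(cube, shared):
    """Swap each coordinate on the grouped dimension for an aggregated copy."""
    for name in shared:
        cube.remove_coord(name)   # no effect for a derived coordinate
    for name in shared:
        cube.add_aux_coord(name)  # so re-adding it collides


cube = ToyCube(stored=['grid_latitude', 'surface_altitude'],
               derived=['altitude'])
try:
    toy_aggregated_by(cube, ['grid_latitude', 'surface_altitude', 'altitude'])
except ValueError as err:
    print(err)  # Duplicate coordinates are not permitted.
```

With `derived=[]` the same call succeeds, which mirrors the workaround of removing the aux factory before aggregating.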
@schlunma I can confirm that allowing Cube.remove_coord() to remove derived coords does in fact fix the problem, though it's worth noting that the derived coordinates will then become regular coordinates in the process.
Although we are accepting #3641, which avoids the error here, I'm not convinced that its new effect is actually a correct resolution of this problem (see this comment on the #3641 fix).
I suspect that aggregated_by should be handling derived coordinates more carefully.
At present (since #3641), it will result in a concrete, aggregated version of the original.
I suspect it would be better if that were either (a) correctly derived from the aggregated dependencies, or (b) discarded.
But it's not immediately clear (to me!) what is possible or desirable here.
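As a rough illustration of option (a) versus the current behaviour, the sketch below uses plain Python lists rather than iris objects. It assumes the hybrid height relation altitude = delta + sigma * orography and a grouped mean over the horizontal points; all function names and values are made up for the example.

```python
def mean(values):
    return sum(values) / len(values)


def grouped_mean(values, groups):
    """Mean of `values` per group label, in sorted label order."""
    labels = sorted(set(groups))
    return [mean([v for v, g in zip(values, groups) if g == label])
            for label in labels]


def derive_altitude(delta, sigma, orography):
    """altitude[level][point] = delta[level] + sigma[level] * orography[point]"""
    return [[d + s * o for o in orography] for d, s in zip(delta, sigma)]


# Illustrative values only: two levels, four horizontal points, two groups.
delta = [10.0, 20.0]
sigma = [1.0, 0.5]
orography = [100.0, 200.0, 300.0, 400.0]
groups = [0, 1, 0, 1]

# Current behaviour: derive first, then aggregate the concrete values.
concrete = [grouped_mean(row, groups)
            for row in derive_altitude(delta, sigma, orography)]

# Option (a): aggregate the dependency, then derive from the result.
rederived = derive_altitude(delta, sigma, grouped_mean(orography, groups))

print(concrete)   # [[210.0, 310.0], [120.0, 170.0]]
print(rederived)  # [[210.0, 310.0], [120.0, 170.0]]
```

The two routes agree here only because the formula is linear in the orography and the aggregator is a mean. For other aggregators, or when a grouped dependency enters the formula non-linearly, they need not match, which is part of why the right behaviour is unclear.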
Closed by #4947