wrf-python icon indicating copy to clipboard operation
wrf-python copied to clipboard

Getvar funcion for a small region

Open fedecutraro opened this issue 7 years ago • 4 comments

Hello, I have a large wrfout and I need to process only a small region. Is there any way to call the getvar function only for that region?

Thanks

fedecutraro avatar Aug 16 '18 23:08 fedecutraro

Not currently, but in the future the front end will be rewritten to support xarray input (and an xarray extension) and this will be a lot easier to deal with. I don't know when this will happen though.

In the mean time, I might be able to come up with a workaround. WRF-Python supports using custom iterable classes to getvar that you can use to reduce your netcdf file sizes on the fly. Although this may not be particularly fast, if you're running out of memory, it should still be faster than crunching the entire data set. Below is a rough skeleton of what needs to be done. If I have time, I'll try to supply a real example that you can modify.

class FileReduce(object):
    def __init__(self, filenames, geobounds):
        self._filenames = filenames
        self._i = 0
        self._geobounds = geobounds

       # Use ll_to_xy to calculate the start_x, end_x, start_y, end_y locations
       ...

    def __iter__(self):
        return self
    
    def reduce(self, ncfileob, start_x, end_x, start_y, end_y)"
        # Create a new netcdf file object by slicing each original variable to the desired domain
        # Remember to update the dimensions and don't forget to add 1 element for the staggered vars
        ...

    def next(self):
        if self._i >= len(self._filenames):
            raise StopIteration
        else:
            f = netCDF4.Dataset(self._filenames[self._i])
            reduced_f = self.reduce(f, self._start_x, self._end_x, self._start_y, self._end_y)
            self._i += 1
            return reduced
    
    # Python 3
    def __next__(self):
        return self.next()

bladwig1 avatar Aug 17 '18 17:08 bladwig1

Below is an example that may or may not be useful for you. This FileReduce iterable object crops out a domain of interest and stores the files in a temporary directory (either specified or generated), and returns the cropped Dataset object. If the delete parameter is set to False, then you can use this to keep the cropped files. If delete is set to True, the temporary directory and files are deleted (so be careful not to set this to something you don't want deleted). If reuse is set to True, then you can use the already cropped files again without having to generate the cropped Dataset files again. Obviously, the xarray solution is much better and coming in the future, but for now, you might be able to do something with this.

import tempfile
import glob
import shutil
import os
from netCDF4 import Dataset
from wrf import getvar, ll_to_xy, CoordPair, GeoBounds


class FileReduce(object):
    def __init__(self, filenames, geobounds, tempdir=None, delete=True, reuse=False):
        """An iterable object for cutting out geographic domains.
        
        Args:
        
            filenames (sequence): A sequence of file paths to the WRF files
            
            geobounds (GeoBounds): A GeoBounds object defining the region of interest
            
            tempdir (str): The location to store the temporary cropped data files. If None, tempfile.mkdtemp is used.
            
            delete (bool): Set to True to delete the temporary directory when FileReduce is garbage collected.
            
            reuse (bool): Set to True when you want to resuse the files that were previously converted. *tempdir* 
                must be set to a specific directory that contains the converted files and *delete* must be False.
        
        """
        self._filenames = filenames
        self._i = 0
        self._geobounds = geobounds
        self._delete = delete
        self._cache = set()
        self._own_data = True
        self._reuse = reuse
        
        if tempdir is not None:
            if not os.path.exists(tempdir):
                os.makedirs(tempdir)
            self._tempdir = tempdir
            if self._reuse:
                self._cache = set((os.path.join(self._tempdir, name) 
                                   for name in os.listdir(self._tempdir)))
        else:
            self._tempdir = tempfile.mkdtemp()

        print ("temporary directory is: {}".format(self._tempdir))
        self._prev = None
        self._set_extents()
    
    def _set_extents(self):
        fname = list(self._filenames)[0]
        with Dataset(fname) as ncfile:
            lons = [self._geobounds.bottom_left.lon, self._geobounds.top_right.lon]
            lats = [self._geobounds.bottom_left.lat, self._geobounds.top_right.lat]
            orig_west_east = len(ncfile.dimensions["west_east"])
            orig_south_north = len(ncfile.dimensions["south_north"])
            
            
            # Note: Not handling the moving nest here
            # Extra points included around the boundaries to ensure domain is fully included
            x_y = ll_to_xy(ncfile, lats, lons, meta=False)
            self._start_x = 0 if x_y[0,0] == 0 else x_y[0,0] - 1
            self._end_x = orig_west_east - 1 if x_y[0,1] >= orig_west_east - 1 else x_y[0,1] + 1
            self._start_y = 0 if x_y[1,0] == 0 else x_y[1,0] - 1
            self._end_y = orig_south_north - 1 if x_y[1,1] >= orig_south_north - 1 else x_y[1,1] + 1
            
            self._west_east = self._end_x - self._start_x + 1
            self._west_east_stag = self._west_east + 1
            self._south_north = self._end_y - self._start_y + 1
            self._south_north_stag = self._south_north + 1
            
        
    def __iter__(self):
        return self
    
    def __copy__(self):
        cp = type(self).__new__(self.__class__)
        cp.__dict__.update(self.__dict__)
        cp._own_data = False
        cp._delete = False
        
        return cp
    
    def __del__(self):
        if self._delete:
            shutil.rmtree(self._tempdir)
    
    def reduce(self, fname):
        outfilename = os.path.join(self._tempdir, "reduced_" + os.path.basename(fname))
        
        # WRF-Python can iterate over sequences several times during a 'getvar', so a cache is used to 
        if outfilename in self._cache:
            return Dataset(outfilename)
        
        # New dimension sizes
        dim_d = {"west_east" : self._west_east,
                 "west_east_stag" : self._west_east_stag,
                 "south_north" : self._south_north,
                 "south_north_stag" : self._south_north_stag
                }
        
        # Data slice sizes for the 2D dimensions
        slice_d = {"west_east" : slice(self._start_x, self._end_x + 1),
                   "west_east_stag" : slice(self._start_x, self._end_x + 2),
                   "south_north" : slice(self._start_y, self._end_y + 1),
                   "south_north_stag" : slice(self._start_y, self._end_y + 2)
                  }
        
        with Dataset(fname) as infile, Dataset(outfilename, mode="w") as outfile:
            print ("reduce getting called!")
            
            # Copy the global attributes
            outfile.setncatts(infile.__dict__)

            # Copy Dimensions, limiting south_north and west_east to desired domain
            for name, dimension in infile.dimensions.items():
                dimsize = dim_d.get(name, len(dimension))
                outfile.createDimension(name, dimsize)

            # Copy Variables  
            for name, variable in infile.variables.iteritems():
                
                new_slices = tuple((slice_d.get(dimname, slice(None)) for dimname in variable.dimensions))

                outvar = outfile.createVariable(name, variable.datatype, variable.dimensions)

                outvar[:] = variable[new_slices]

                outvar.setncatts(variable.__dict__)
                
        
        result = Dataset(outfilename)
            
        self._cache.add(outfilename)
            
        return result
            
    
    def next(self):
        if self._i >= len(self._filenames):
            if self._prev is not None:
                self._prev.close()
            raise StopIteration
        else:
            fname = self._filenames[self._i]
            reduced_file = self.reduce(fname)
            if self._prev is not None:
                self._prev.close()
            self._prev = reduced_file
            
            self._i += 1
            
            return reduced_file
    
    # Python 3
    def __next__(self):
        return self.next()

# How to use with getvar
# Set lower left and upper right to your desired domain
ll = CoordPair(lat=24.0, lon=-87.)
ur = CoordPair(lat=27.0, lon=-84)
bounds = GeoBounds(ll, ur)
reduced_files = FileReduce(glob.glob("wrfout_d02*"),
                    bounds, tempdir="mytemp", delete=False, reuse=True)

slp = getvar(reduced_files, "slp")
print(slp)

bladwig1 avatar Aug 21 '18 21:08 bladwig1

Thank you Bill, I will try that

fedecutraro avatar Aug 21 '18 22:08 fedecutraro

I want to run WRF for 3 domains (one outer and 2 inners (not nested))

I want to put I and J coordinates in tslist file for getting the time-series of different location For converting longitude and latitude to coordinate I am using wrf.ll_to_xy() function from the wrf-python library. This requires a netCDF file, lat, long as the input, Previously When I was using only two domains (one inner and one outer) I was using the netCDF file of inner domain to get the I and j coordinate values. But now I am confused about which domain netCDF file I should use to get the I and J coordinates. outer D1? inner D2? inner D3?

beast252 avatar Nov 07 '22 02:11 beast252