xBOUT icon indicating copy to clipboard operation
xBOUT copied to clipboard

Applying boutcore operations

Open TomNicholas opened this issue 6 years ago • 7 comments

It may be possible to apply the BoutCore compiled C++ operators which David Schworer wrote to xBOUT datasets using xr.apply_ufunc().

This could be very powerful as then we could have a set of methods (perhaps accessed as da.boutcore.method()?) which apply the BoutCore operations directly.

This should definitely be @dschwoerer's job, because I have no idea how BoutCore works!

TomNicholas avatar Dec 08 '18 16:12 TomNicholas

I don't know how to use apply_ufunc How do I ensure that I get always full 3D data sets? Is there a way to get several fields?

Generally it works:

import xbout
import boutcore as bc
import numpy as np

ds=xbout.open_boutdataset("delta_1/BOUT.dmp.*.nc")
n10=ds["n"].isel(t=10)

bc.init("-d delta_1 -f BOUT.settings -o BOUT.post -l POST.log output=false output:enabled=false")
f=bc.Field3D.fromMesh();

for t in range(ds.dims['t']):
    f.setAll(ds["n"].isel(t=t))
    print(np.max(bc.DDX(f).getAll()))

dschwoerer avatar Dec 12 '19 15:12 dschwoerer

I think a first step into nicer integration, would be adding a wrapper of some sort, that lets you easily get a Field3D from the dataset.

What would be a nice interface?

ds.boutcore['n'][10] to get the t=10 density field as a boutcore field?

dschwoerer avatar Dec 12 '19 15:12 dschwoerer

So a boutcore Field3D is basically a single variable? So you could aim for something like:

from xbout import open_boutdataset
from xbout import boutcore

ds = open_boutdataset(datapath=..., inputfilepath=...)

# Choose a single variable, so it corresponds to a single Field3D
n = ds['n']

# Access the boutcore methods through a (short) accessor
dndx = n.bc.DDX()

# Result would be an xarray DataArray which has has the boutcore operation applied

You could do this with a new accessor:

from xarray import register_dataset_accessor
import boutcore as bc

@register_dataarray_accessor('bc')
class BoutCoreAccessor:
    def __init__(self, da):
        super().__init__(da)

        # Initialise boutcore (get the filepaths from the da.attrs)
        bc.init("-d delta_1 -f BOUT.settings -o BOUT.post -l POST.log output=false output:enabled=false")
        
        # Create the Field3D
        f = bc.Field3D.fromMesh()
        f.setAll(da)
        self.f = f

    def DDX(self):
        result = bc.DDX(self.f).getAll()
        return xr.DataArray(result)  # or similar using apply_ufunc

This way you just do any slicing on the xarray object before giving it to boutcore:

dndx = n.isel(t=10).bc.DDX()

Accessor methods can take multiple arguments so you could have n.bc.advect(phi, B)

(It's also possible to set up the accessor so that just n.DDX is valid syntax)

An alternative (functional) syntax could just be

dndx = bc.DDX(n)

which is basically what you already have but we could convert the xarray dataarray -> boutcore -> dataarray.

TomNicholas avatar Dec 12 '19 16:12 TomNicholas

Sorry I forgot you would need to initialise boutcore - I updated my example with that

TomNicholas avatar Dec 12 '19 16:12 TomNicholas

How would that imply for the advect example? would that return advect(n,phi,B)? I think that is a bit confusing, and I like to keep it close to the C++ interface, so copy-pasting of C++ code to python works with minimal changes. Because I want to do in python exactly what I did in C++ - I assume that is the standard use case.

Your init doesn't work as is. init must be called ONCE. After that you need to create a mesh with Mesh.fromOptions() - but that is sufficiently easy to work around.

dschwoerer avatar Dec 12 '19 16:12 dschwoerer

would that return advect(n,phi,B)? I think that is a bit confusing

Fair enough - that was just one option. I was only trying to demonstrate that you could have an accessor method called whatever you like that accepts however many arguments you like.

Your init doesn't work as is. init must be called ONCE.

Once for each set of data files, or once ever? If it's the former you could do something like this

from xbout import open_boutdataset
from xbout.boutcore import boutcoreinit
import boutcore as bc

ds = open_boutdataset(datapath=..., inputfilepath=...)


def boutcoreinit(ds, datapath=, inputfilepath=, ...)
    """This calls bc.init, and stores bc in the ds.attrs (and in every da.attrs too)"""
    bc.init(...)

    ds.attrs['boutcore'] = bc
    for var in ds:
        ds[var].attrs['boutcore'] = bc


init_boutcore(ds, inputfilepath=...)  

Now you can access the pointer to the same bc object whenever the accessor is used

@register_dataarray_accessor('bc')
class BoutCoreAccessor:
    def __init__(self, da):
        super().__init__(da)

        # Find the initialised boutcore instance for this set of data
        self.bc = da.attrs['boutcore']
        
        # Create the Field3D
        f = bc.Field3D.fromMesh()
        f.setAll(da)
        self.f = f

    def DDX(self):
        result = bc.DDX(self.f).getAll()
        return xr.DataArray(result)  # or similar using apply_ufunc

TomNicholas avatar Dec 12 '19 16:12 TomNicholas

Once for each set of data files, or once ever?

Once per run. MPI needs to be initialised exactly once, and only finalised at the end (I think this is done automatically). I guess the most forward way is to have a global variable, or ask boutcore (not sure there is an interface). Then we need to create a mesh for this dataset, and store it, and always use this for creating fields.

What I am less clear about is slicing - the fields musn't be sliced in x, y or z (otherwise the mesh needs to be ajusted). Slicing in time isn't an issue, and if it isn't done, it needs to be calculated one time slice at a time ...

From the example above, should a field be created for each time slice? I think that would interfere with xbout's lazy loading - but I guess that will not work anyway?

dschwoerer avatar Jun 19 '20 14:06 dschwoerer