xBOUT
xBOUT copied to clipboard
Applying boutcore operations
It may be possible to apply the BoutCore compiled C++ operators which David Schworer wrote to xBOUT datasets using xr.apply_ufunc().
This could be very powerful as then we could have a set of methods (perhaps accessed as da.boutcore.method()?) which apply the BoutCore operations directly.
This should definitely be @dschwoerer's job, because I have no idea how BoutCore works!
I don't know how to use apply_ufunc
How do I ensure that I get always full 3D data sets?
Is there a way to get several fields?
Generally it works:
import xbout
import boutcore as bc
import numpy as np
ds=xbout.open_boutdataset("delta_1/BOUT.dmp.*.nc")
n10=ds["n"].isel(t=10)
bc.init("-d delta_1 -f BOUT.settings -o BOUT.post -l POST.log output=false output:enabled=false")
f=bc.Field3D.fromMesh();
for t in range(ds.dims['t']):
f.setAll(ds["n"].isel(t=t))
print(np.max(bc.DDX(f).getAll()))
I think a first step into nicer integration, would be adding a wrapper of some sort, that lets you easily get a Field3D from the dataset.
What would be a nice interface?
ds.boutcore['n'][10] to get the t=10 density field as a boutcore field?
So a boutcore Field3D is basically a single variable? So you could aim for something like:
from xbout import open_boutdataset
from xbout import boutcore
ds = open_boutdataset(datapath=..., inputfilepath=...)
# Choose a single variable, so it corresponds to a single Field3D
n = ds['n']
# Access the boutcore methods through a (short) accessor
dndx = n.bc.DDX()
# Result would be an xarray DataArray which has has the boutcore operation applied
You could do this with a new accessor:
from xarray import register_dataset_accessor
import boutcore as bc
@register_dataarray_accessor('bc')
class BoutCoreAccessor:
def __init__(self, da):
super().__init__(da)
# Initialise boutcore (get the filepaths from the da.attrs)
bc.init("-d delta_1 -f BOUT.settings -o BOUT.post -l POST.log output=false output:enabled=false")
# Create the Field3D
f = bc.Field3D.fromMesh()
f.setAll(da)
self.f = f
def DDX(self):
result = bc.DDX(self.f).getAll()
return xr.DataArray(result) # or similar using apply_ufunc
This way you just do any slicing on the xarray object before giving it to boutcore:
dndx = n.isel(t=10).bc.DDX()
Accessor methods can take multiple arguments so you could have n.bc.advect(phi, B)
(It's also possible to set up the accessor so that just n.DDX is valid syntax)
An alternative (functional) syntax could just be
dndx = bc.DDX(n)
which is basically what you already have but we could convert the xarray dataarray -> boutcore -> dataarray.
Sorry I forgot you would need to initialise boutcore - I updated my example with that
How would that imply for the advect example? would that return advect(n,phi,B)? I think that is a bit confusing, and I like to keep it close to the C++ interface, so copy-pasting of C++ code to python works with minimal changes. Because I want to do in python exactly what I did in C++ - I assume that is the standard use case.
Your init doesn't work as is. init must be called ONCE. After that you need to create a mesh with Mesh.fromOptions() - but that is sufficiently easy to work around.
would that return advect(n,phi,B)? I think that is a bit confusing
Fair enough - that was just one option. I was only trying to demonstrate that you could have an accessor method called whatever you like that accepts however many arguments you like.
Your init doesn't work as is. init must be called ONCE.
Once for each set of data files, or once ever? If it's the former you could do something like this
from xbout import open_boutdataset
from xbout.boutcore import boutcoreinit
import boutcore as bc
ds = open_boutdataset(datapath=..., inputfilepath=...)
def boutcoreinit(ds, datapath=, inputfilepath=, ...)
"""This calls bc.init, and stores bc in the ds.attrs (and in every da.attrs too)"""
bc.init(...)
ds.attrs['boutcore'] = bc
for var in ds:
ds[var].attrs['boutcore'] = bc
init_boutcore(ds, inputfilepath=...)
Now you can access the pointer to the same bc object whenever the accessor is used
@register_dataarray_accessor('bc')
class BoutCoreAccessor:
def __init__(self, da):
super().__init__(da)
# Find the initialised boutcore instance for this set of data
self.bc = da.attrs['boutcore']
# Create the Field3D
f = bc.Field3D.fromMesh()
f.setAll(da)
self.f = f
def DDX(self):
result = bc.DDX(self.f).getAll()
return xr.DataArray(result) # or similar using apply_ufunc
Once for each set of data files, or once ever?
Once per run. MPI needs to be initialised exactly once, and only finalised at the end (I think this is done automatically). I guess the most forward way is to have a global variable, or ask boutcore (not sure there is an interface). Then we need to create a mesh for this dataset, and store it, and always use this for creating fields.
What I am less clear about is slicing - the fields musn't be sliced in x, y or z (otherwise the mesh needs to be ajusted). Slicing in time isn't an issue, and if it isn't done, it needs to be calculated one time slice at a time ...
From the example above, should a field be created for each time slice? I think that would interfere with xbout's lazy loading - but I guess that will not work anyway?