oceanspy
cKDTree gets lost on the Arctic Crown grid
- OceanSpy version: 0.2.0
Description
I want to compute the derivative of some property at a point with known lat-lon-z on the LLC grid. The conclusion of issue #225 is that taking derivatives over complex topography is rather hard without changing a lot of the code. To get around this, I applied the transformation:
gr = ospy.subsample.cutout(od,sampMethod = 'snapshot',timeFreq= None,timeRange = ['1994-11', '1994-12'],
varList = ['SALT','UVELMASS','VVELMASS'],transformation = "arctic_crown",centered = 'Atlantic')
Using the resulting object, I want to find the index of a given lat-lon point:
utree = gr.create_tree('U')
x, y, z = utils.spherical2cartesian(Y=10, X=10, R=6371.0)
distance,index1d = utree.query(
(x,y,z)
# this is (10N,10E)
)
ind = np.unravel_index(int(index1d), gr._ds['YU'].shape)
print('lon:%f, lat%f'%(gr._ds.XU[ind],gr._ds.YU[ind]))
print('distance(km):',distance)
which gives me: lon:10.000000, lat-11.438564 distance(km): 2369.977474747008. This is not at all close to the original point. Simply changing 'U' to 'C' gives the correct result for center-point indexes, which suggests my use of the code is not too wrong.
Also, I think this is not exactly an OceanSpy issue, because when I do the following:
R = gr.parameters["rSphere"]
X = gr._ds["XG"][:-1,:-1]
Y = gr._ds['YC']
# convert to cartesian
x, y, z = utils.spherical2cartesian(Y=Y.values, X=X.values, R=R)
x_stack = x.ravel()
y_stack = y.ravel()
z_stack = z.ravel()
# construct a tree using XG,YC
utree = spatial.cKDTree(np.column_stack((x_stack, y_stack, z_stack)))
distance,index1d = utree.query(
(6178.890843513511, 1089.5051665639178, 1106.312539916013)
# this is (10N,10E)
)
np.unravel_index(int(index1d), gr._ds['YC'].shape),distance
The result is also incorrect. A very likely cause is that the Arctic Crown configuration contains many NaNs, which may confuse the cKDTree. Even if we incorporate the refactoring in #224, the underlying mechanism is still a cKDTree, so the issue remains.
As I said, we can choose whether or not to transform the dataset. It would be nice to have a full set of functionality on (at least) one side. We would not need to fix this if we could get everything, including the derivative functions, working on the untransformed side.
Something we can try: temporarily replace the NaNs with coordinates that are far from the Earth's surface when creating the tree.
I am not sure if that will work though.
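For reference, here is a minimal sketch of the "far away" idea on a toy grid, not on the real Arctic Crown dataset. The helper `build_filled_tree` and the constant `FAR_AWAY` are hypothetical names made up for illustration. The fill is applied to the Cartesian coordinates because a filled lat/lon would be mapped back onto the sphere by the sine/cosine conversion.

```python
import numpy as np
from scipy import spatial

FAR_AWAY = 1.0e9  # km; hypothetical fill value, far larger than Earth's radius


def build_filled_tree(x, y, z, fill_value=FAR_AWAY):
    """Build a cKDTree after moving non-finite points far from the sphere."""
    points = np.column_stack((np.ravel(x), np.ravel(y), np.ravel(z)))
    bad = ~np.isfinite(points).all(axis=1)  # rows with any NaN/inf
    points[bad] = fill_value  # these rows can no longer be the nearest neighbour
    return spatial.cKDTree(points), bad


# Toy 5x5 lat/lon grid (0 to 20 degrees) with one row of NaNs.
R = 6371.0
lat, lon = np.meshgrid(
    np.arange(0.0, 21.0, 5.0), np.arange(0.0, 21.0, 5.0), indexing="ij"
)
lat[0, :] = np.nan
lat_r, lon_r = np.deg2rad(lat), np.deg2rad(lon)
x = R * np.cos(lat_r) * np.cos(lon_r)
y = R * np.cos(lat_r) * np.sin(lon_r)
z = R * np.sin(lat_r)

tree, bad = build_filled_tree(x, y, z)

# Cartesian coordinates of (10N, 10E), as quoted earlier in this issue.
target = (6178.890843513511, 1089.5051665639178, 1106.312539916013)
distance, index1d = tree.query(target)
ind = np.unravel_index(int(index1d), lat.shape)
print(ind, distance)
```

Because the filled rows stay in the tree, the returned index still refers to the flattened original grid, so `np.unravel_index` can be used unchanged.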
Simply masking the NaNs when creating the tree solved this issue. I made some minor changes to the create_tree function in the _oceandataset.py file, and it passes all the existing tests. I will commit it along with other changes.
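The masking approach could look roughly like the sketch below. This is not the actual change to create_tree, only an illustration on a toy grid; `spherical_to_cartesian`, `build_masked_tree`, and `nearest_index` are hypothetical helpers. The key point is that dropping NaN points shifts the tree's indices, so a map back to the original flattened grid has to be kept.

```python
import numpy as np
from scipy import spatial


def spherical_to_cartesian(lat_deg, lon_deg, radius=6371.0):
    """Convert latitude/longitude in degrees to Cartesian x, y, z."""
    lat = np.deg2rad(lat_deg)
    lon = np.deg2rad(lon_deg)
    x = radius * np.cos(lat) * np.cos(lon)
    y = radius * np.cos(lat) * np.sin(lon)
    z = radius * np.sin(lat)
    return x, y, z


def build_masked_tree(lat2d, lon2d, radius=6371.0):
    """Build a cKDTree from finite points only and keep their flat indices."""
    x, y, z = spherical_to_cartesian(lat2d, lon2d, radius)
    points = np.column_stack((x.ravel(), y.ravel(), z.ravel()))
    valid = np.isfinite(points).all(axis=1)
    flat_index = np.flatnonzero(valid)  # tree index -> flattened grid index
    return spatial.cKDTree(points[valid]), flat_index


def nearest_index(tree, flat_index, shape, lat, lon, radius=6371.0):
    """Return the 2D grid index nearest to (lat, lon) and the distance."""
    distance, i = tree.query(spherical_to_cartesian(lat, lon, radius))
    return np.unravel_index(flat_index[i], shape), distance


# Toy 5x5 grid (0 to 20 degrees) with one row of NaNs in both coordinates.
lat2d, lon2d = np.meshgrid(
    np.arange(0.0, 21.0, 5.0), np.arange(0.0, 21.0, 5.0), indexing="ij"
)
lat2d[0, :] = np.nan
lon2d[0, :] = np.nan

tree, flat_index = build_masked_tree(lat2d, lon2d)
ind, distance = nearest_index(tree, flat_index, lat2d.shape, lat=10.0, lon=10.0)
print(ind, distance)
```

Compared with filling NaNs with distant coordinates, masking keeps the tree smaller but requires carrying `flat_index` alongside it.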
Is this still a live issue? Or have you made a PR?
I haven't made the pull request yet. I actually forgot to back up the code, and it was deleted by SciServer. Fortunately, it's a really small change and I remember how to do it.
It is potentially representative of a whole class of issues; you could call it the tip of an iceberg. The fear is that transformed datasets, or any datasets that have a lot of NaNs (say, a polygon-shaped dataset selected from the viewer), are going to pose a challenge to a lot of OceanSpy functions. Since the transformation is relatively new, we have zero tests on those datasets (there are some tests on creating such datasets in subsample, but there is no guarantee that OceanSpy functions will work on the cutouts).
There is also a legitimate reason not to deal with this issue. One can argue that transformation and cut-out should be the last step of your computation. Although it would be nice to give users more freedom, whether it is worth the effort is debatable.
I do think it is necessary that we at least attempt to include those datasets in the tests. The best-case scenario is that it may not need as much effort as I expect.
@MaceKuailv Is the issue with NaNs solvable by replacing the NaNs with a number (say -10000)? This is how some models deal with NaNs.
OK, @MaceKuailv, make some cutout tests and then label this as a possible future enhancement.
@ThomasHaine, I can make a cutout dataset and modify existing tests of functions that use the tree (say subsample.mooring).
However, in order to merge those tests, I need to upload the new cutout dataset to the cloud (https://zenodo.org/record/5832607#.Ysai6ZYpDb0). This is because you don't want to rearrange the dataset in every test_...py file.
OK, sounds like a good plan.