
Does xMCA work on gridded data too?

Open gkb999 opened this issue 2 years ago • 4 comments

Hi, My dataset looks like:

xarray.Dataset
Dimensions:
lon: 171, lat: 128, time: 3285, bnds: 2
Coordinates:
lon (lon) float64 -5.344e+05 -5.281e+05 ... 5.281e+05
lat (lat) float64 -1.959e+06 ... -1.166e+06
time (time) datetime64[ns] 2013-01-01 ... 2021-12-31

Lat and lon are in meters (see below): [image]

When I applied EOFs to this, it did not fail, but the result looked strange. Please check the eof[0] pattern: [image]

The eof[0] looks like:

xarray.DataArray 'EOFs' (lon: 171, mode: 6)
array([[-1.76054379e-22, -3.28724969e-19, -1.11229007e-16,
        -1.27920154e-17, -2.97548278e-16,  1.83119557e-16],
       [-6.15561168e-21,  3.36886857e-18, -8.68632332e-17,
        -1.61311084e-16, -2.85747382e-17, -1.29165757e-16],
       [-1.66259177e-20,  1.44649751e-17, -4.54359525e-16,
        -2.18833121e-16, -3.33146482e-16,  2.95736462e-16],
       ...,
       [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
        -0.00000000e+00, -0.00000000e+00,  0.00000000e+00],
       [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
        -0.00000000e+00, -0.00000000e+00,  0.00000000e+00],
       [ 0.00000000e+00,  0.00000000e+00,  0.00000000e+00,
        -0.00000000e+00, -0.00000000e+00,  0.00000000e+00]])
Coordinates:
lon (lon) float64 -5.344e+05 -5.281e+05 ... 5.281e+05
array([-534375., -528125., -521875., -515625., -509375., -503125., -496875.,
       -490625., -484375., -478125., -471875., -465625., -459375., -453125.,
       -446875., -440625., -434375., -428125., -421875., -415625., -409375.,
       -403125., -396875., -390625., -384375., -378125., -371875., -365625.,
       -359375., -353125., -346875., -340625., -334375., -328125., -321875.,
       -315625., -309375., -303125., -296875., -290625., -284375., -278125.,
       -271875., -265625., -259375., -253125., -246875., -240625., -234375.,
       -228125., -221875., -215625., -209375., -203125., -196875., -190625.,
       -184375., -178125., -171875., -165625., -159375., -153125., -146875.,
       -140625., -134375., -128125., -121875., -115625., -109375., -103125.,
        -96875.,  -90625.,  -84375.,  -78125.,  -71875.,  -65625.,  -59375.,
        -53125.,  -46875.,  -40625.,  -34375.,  -28125.,  -21875.,  -15625.,
         -9375.,   -3125.,    3125.,    9375.,   15625.,   21875.,   28125.,
         34375.,   40625.,   46875.,   53125.,   59375.,   65625.,   71875.,
         78125.,   84375.,   90625.,   96875.,  103125.,  109375.,  115625.,
        121875.,  128125.,  134375.,  140625.,  146875.,  153125.,  159375.,
        165625.,  171875.,  178125.,  184375.,  190625.,  196875.,  203125.,
        209375.,  215625.,  221875.,  228125.,  234375.,  240625.,  246875.,
        253125.,  259375.,  265625.,  271875.,  278125.,  284375.,  290625.,
        296875.,  303125.,  309375.,  315625.,  321875.,  328125.,  334375.,
        340625.,  346875.,  353125.,  359375.,  365625.,  371875.,  378125.,
        384375.,  390625.,  396875.,  403125.,  409375.,  415625.,  421875.,
        428125.,  434375.,  440625.,  446875.,  453125.,  459375.,  465625.,
        471875.,  478125.,  484375.,  490625.,  496875.,  503125.,  509375.,
        515625.,  521875.,  528125.])
lat () float64 -1.959e+06
standard_name : projection_y_coordinate
long_name : y coordinate of projection
units : metre
axis : Y
array( -1959375.)
mode (mode) int32 1 2 3 4 5 6
array([1, 2, 3, 4, 5, 6])
Indexes:
lon PandasIndex
PandasIndex(Float64Index([-534375.0, -528125.0, -521875.0, -515625.0, -509375.0, -503125.0,
              -496875.0, -490625.0, -484375.0, -478125.0,              ...
               471875.0,  478125.0,  484375.0,  490625.0,  496875.0,  503125.0,
               509375.0,  515625.0,  521875.0,  528125.0],
             dtype='float64', name='lon', length=171))
mode PandasIndex
PandasIndex(Int64Index([1, 2, 3, 4, 5, 6], dtype='int64', name='mode'))

Here, lat () is empty. Why is that? Should I reproject the data, since xMCA may not support data in this format?
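A note on the empty lat (): the output above has dimensions (lon, mode) plus a scalar lat of -1959375, which is what positional indexing with [0] produces if lat is the leading dimension of the EOF array. The sketch below reproduces that behaviour on a synthetic array; the (lat, lon, mode) ordering and all values are assumptions for illustration, not taken from xMCA itself.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for an EOF result; the (lat, lon, mode) layout is assumed.
eofs = xr.DataArray(
    np.random.default_rng(0).normal(size=(4, 5, 6)),
    dims=("lat", "lon", "mode"),
    coords={
        "lat": [-4.0, -3.0, -2.0, -1.0],
        "lon": np.arange(5.0),
        "mode": np.arange(1, 7),
    },
    name="EOFs",
)

# Positional indexing acts on the FIRST dimension (here lat), not on mode:
# the result keeps (lon, mode) and lat collapses to a scalar coordinate.
first_row = eofs[0]

# Label-based selection picks the first mode and keeps the full (lat, lon) map.
first_mode = eofs.sel(mode=1)

print(first_row.dims, first_mode.dims)
```

If that is what happened here, eof[0] is one row of the map (the first lat value) across all six modes rather than the spatial pattern of the first mode.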

Any assistance on this would be helpful.

Thanks in advance.

gkb999 commented May 22 '23 03:05

Just to be sure, xMCA only works with xr.DataArray (in xeofs Dataset will be allowed soon), so be sure to convert your Dataset before the analysis.

Also, please ensure that the dimensions are time, lon and lat, and remove any other dimension/coordinate, i.e. the bnds dimension.

Finally, have you applied latitude correction via .apply_coslat? Unfortunately, this won't work in the current version, as it assumes latitudes in degrees to compute the weights. Since your latitudes are in meters, the weights will be wrong, and so will the final result.
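For reference, a minimal sketch of the preparation step described above: pull a single variable out of the Dataset so that only time, lat and lon remain. The variable names field and time_bnds and all values are hypothetical; substitute the names from your own file.

```python
import numpy as np
import xarray as xr

# Tiny synthetic Dataset with an extra "bnds" dimension (all names are hypothetical).
time = np.arange("2013-01-01", "2013-01-05", dtype="datetime64[D]")
ds = xr.Dataset(
    {
        "field": (("time", "lat", "lon"), np.random.default_rng(0).normal(size=(4, 3, 5))),
        "time_bnds": (("time", "bnds"), np.zeros((4, 2))),
    },
    coords={
        "time": time,
        "lat": [-1.9e6, -1.8e6, -1.7e6],
        "lon": np.linspace(-5e5, 5e5, 5),
    },
)

# Selecting one variable yields a DataArray; "bnds" disappears because
# "field" never used that dimension.
da = ds["field"]

# Drop any leftover non-index coordinates and fix the dimension order.
da = da.reset_coords(drop=True).transpose("time", "lat", "lon")

print(da.dims)
```

The resulting da is a plain xr.DataArray with exactly three dimensions, which is the input shape asked for above.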

nicrie commented May 22 '23 08:05

Just to be sure, xMCA only works with xr.DataArray (in xeofs Dataset will be allowed soon), so be sure to convert your Dataset before the analysis.

I converted to data array, so this shouldn't be a problem.

Also, please ensure that the dimensions are time, lon and lat, and remove any other dimension/coordinate, i.e. the bnds dimension.

Well, I did not drop the extra bnds dimension. I will do that and try again.

Finally, have you applied latitude correction via .apply_coslat? Unfortunately, this won't work in the current version, as it assumes latitudes in degrees to compute the weights. Since your latitudes are in meters, the weights will be wrong, and so will the final result.

I'm pretty sure this is the major problem. As my data array is in meters, do you know of any packages or links that would let me apply a latitude correction?

Thanks a lot for providing these insights. Always helpful. :)

gkb999 commented May 22 '23 08:05

I must admit I'm not sure of a package that can automatically apply this kind of weighting for you, but that's not to say it doesn't exist!

However, I'm a bit skeptical that simply providing coordinates in kilometers would enable accurate calculation of weights. What you'd actually need is a more detailed understanding of the projection used to represent the data. Climate data is typically portrayed on a rectangular lon/lat grid, which, while convenient and familiar to us as observers, does distort the area represented by a grid point towards the poles, making it appear larger.

The coslat correction is a handy workaround for this specific distortion, but keep in mind that this effect arises from the classic (PlateCarree) projection that's commonly used.
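To make the coslat idea concrete, here is a minimal sketch of the convention commonly used in EOF analysis: each grid point is scaled by sqrt(cos(lat)), with lat in degrees. Whether xMCA computes its weights in exactly this way is an assumption here; the second half only illustrates why feeding meters into a degree-based formula gives meaningless weights.

```python
import numpy as np

# Latitudes in degrees on a regular lon/lat grid.
lat_deg = np.array([0.0, 30.0, 60.0, 90.0])

# Common EOF convention: weight = sqrt(cos(latitude)); clip guards against
# tiny negative values from floating-point round-off near the poles.
weights = np.sqrt(np.clip(np.cos(np.deg2rad(lat_deg)), 0.0, None))

# Apply the weights to a (time, lat, lon) anomaly array by broadcasting over lat.
anomalies = np.ones((2, lat_deg.size, 3))
weighted = anomalies * weights[None, :, None]

# Projected y coordinates in meters, wrongly treated as degrees:
# the cosine just oscillates, so the resulting "weights" carry no meaning.
y_m = np.array([-1_959_375.0, -1_165_625.0])
bogus = np.cos(np.deg2rad(y_m))

print(weights)
```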

Given that you're working with a different projection, you should be careful about determining the necessary correction, if any is needed at all. It might be unavoidable to delve deeper into your data set to understand the projection used and how it may distort the data.

That said, you can certainly carry out the analysis without any corrections. However, you should be cognizant that in such a case, your results are likely to reflect EOFs of what could be described as "inflated" areas.
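As a rough illustration of what a projection-aware correction could look like, the sketch below computes area weights for a polar stereographic grid. It assumes a spherical Earth of radius 6371 km and a standard parallel of 70 degrees; real products are usually defined on an ellipsoid, so treat this as an approximation, not as the projection's exact definition. The function names are hypothetical. On such a grid the map scale factor is k = (1 + sin(lat_ts)) / (1 + sin(lat)), a cell's true area is its nominal area divided by k squared, and the EOF weight (square root of area, up to a constant) is 1 / k.

```python
import numpy as np

R = 6_371_000.0             # sphere radius in meters (assumption)
LAT_TS = np.deg2rad(70.0)   # latitude of true scale (assumption)


def scale_factor(x, y):
    """Map scale factor k of a spherical polar stereographic projection."""
    rho = np.hypot(x, y)  # distance from the pole in projected meters
    # Absolute latitude recovered from rho; valid for either hemisphere.
    lat = np.pi / 2 - 2 * np.arctan(rho / (R * (1 + np.sin(LAT_TS))))
    return (1 + np.sin(LAT_TS)) / (1 + np.sin(lat))


def area_weights(x, y):
    """sqrt(true cell area) up to a constant factor, on a (y, x) grid."""
    xx, yy = np.meshgrid(x, y)
    return 1.0 / scale_factor(xx, yy)


# Grid mimicking the coordinates in this thread (6250 m spacing).
x = np.linspace(-534_375.0, 528_125.0, 171)
y = np.linspace(-1_959_375.0, -1_165_625.0, 128)
weights = area_weights(x, y)

print(weights.shape, weights.min(), weights.max())
```

Under these assumptions the weights for a domain like this stay within a few percent of 1, which fits the caveat above that little or no correction may be needed. The weights would be multiplied onto the (lat, lon) dimensions of the anomalies before the decomposition.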

nicrie commented May 22 '23 09:05

you can certainly carry out the analysis without any corrections.

Many thanks for the detailed explanation, @nicrie. I'm so glad you're here to provide detailed information on this. I'm working with this projection (https://nsidc.org/data/user-resources/help-center/guide-nsidcs-polar-stereographic-projection); the units are meters. Also, the data is in the antimeridian zone, so the classic projections don't apply well using raster-xarray packages.

I now understand the root cause. I will work on it and report back on whether the reprojection and coslat correction work, and whether I'm able to compute EOFs using either xMCA or xeofs. Meanwhile, please let me know if you happen to come across any information on this.

Thanks a ton again :)

gkb999 commented May 22 '23 22:05