harmonica icon indicating copy to clipboard operation
harmonica copied to clipboard

Read Oasis Montaj© grd files

Open santisoler opened this issue 1 year ago • 7 comments

Create a new _io submodule and move the load_icgem_gdf function in there. Add new load_oasis_montaj_grid function to _io that reads Geosoft GRD files and returns an xarray.DataArray with grid values, its coordinates and header information.

santisoler avatar Aug 29 '22 19:08 santisoler

Nice! This will be very useful 😄

mdtanker avatar Aug 29 '22 20:08 mdtanker

@leouieda What would be a good way to test this function? We were discussing this in the last meeting and we think a good idea would be to have sample sample .grd files that we can pass to the function. I can make synthetic ones, but it would defeat the purpose.

santisoler avatar Sep 02 '22 21:09 santisoler

@leouieda What would be a good way to test this function? We were discussing this in the last meeting and we think a good idea would be to have sample sample .grd files that we can pass to the function. I can make synthetic ones, but it would defeat the purpose.

I know a coleague that works with Oasis Montaj (I think 6). Maybe I could ask him for an small sample grid. Which size should it be? I think that there is no problem with the version.

Esteban82 avatar Sep 02 '22 21:09 Esteban82

That would be great @Esteban82!

We would need small grids so they don't take too much space. Something around 50x50 elements would be good. Nevertheless, it would be nice if they have different number of points per direction, like: 49x51.

On top of that, it would be nice to have a grid with different data types:

  • doubles
  • floats
  • ints
  • unsigned ints

And with different features:

  • rotation
  • ordering of 1 and -1 (see https://help.seequent.com/Oasis-montaj/9.9/en/Content/ss/glossary/grid_file_format__grd.htm for more information on the ordering variable)
  • uncompressed grid and compressed grid (specially using zlib compression)

For the content of the grid, it can be any garbage. There's no need for it to be actual data. In fact, if it's synthetic data it's better (something like a chessboard or a gaussian well, etc).

The other requirement is that the grids should be under Creative Commons CC-BY license. Either that or that the author of the grids pushes them directly to the repo, so we guarantee that we are not falling in any form of copyright infringement.

santisoler avatar Sep 02 '22 22:09 santisoler

I get one test dataset, haha... image

LL-Geo avatar Sep 29 '22 07:09 LL-Geo

A quick update the test works fine for un-compressed grd. image

But the default output from oasis montaj looks like a compressed grid, which means we can't read most data shared in .grd format atm... image image

We might need to figure out how compressed grid worked...

LL-Geo avatar Oct 12 '22 10:10 LL-Geo

Awesome @LL-Geo!

So, I would also want to support compressed grids. From what I read in GRD File specification sheet Oasis Montaj uses one of two compression algorithms: zlib and LZRW1. The first one is fairly easy to support using the zlib built-in module. But for LZRW1 things are trickier. I've only found python-lzrw1-kh, but it's a very slow pure Python implementation, not intended to be used on large files. Therefore I would like to support zlib compression for now. I'm curious if that's the default compression algorithm that Oasis uses.

Would you like to save these GRD files in harmonica/tests/data/ and push them? So we can play with them, write tests and see if we can extend the capabilities of this function. Check my previous comment to see what type of grids might be nice to have.

santisoler avatar Oct 14 '22 16:10 santisoler

@santisoler I upload some grd files for testing (the unsigned grd and -1 order grd are a little bit tricky, I think I need some reverse-engineering ways to do that. But it won't be a problem).

It seems the default compress method is LZRW1. I checked different types of compressed grd, they all show COMP_TYPE as 2 (512+4 : 512+8). I also tried the python-lzrw1-kh, and the result does not match the uncompressed file (I didn't dig into that package.. 😆).
image

I upload a small grid (5*5) om_small_compress.grd. I think it will help us to understand how it works. The header start at 512. image

As for the rotation, I think we could just remove the rotation warning and store the unrotated grd. Maybe the grd rotation is more related to Verde?

image

LL-Geo avatar Oct 19 '22 09:10 LL-Geo

@LL-Geo the rotation is probably best handled by multidimensional coordinates in xarray. For example, the main dims of the grid would be the non-rotated x and y coordinates (1D arrays) and then longitude/latitude or easting/northing can be stored as 2D arrays of x, y with the actual rotated coordinate values. xarray knows what to do for indexing and plotting in these cases. See https://docs.xarray.dev/en/stable/examples/multidimensional-coords.html

leouieda avatar Oct 21 '22 13:10 leouieda

This is amazing @LL-Geo! Those hexdumps look so hacky! 😎

I was afraid that the default compression was the LZRW1,so sad. I wouldn't support that kind of compression in this PR. It seems too much of a hassle and there are some useful stuff in here already to block this PR just because of that. So let's leave compression for a future PR.

I think we need to wrap this PR before it sits too long here without being merge. As I said, there are useful stuff here already, and we can always add more later.

For this PR I would do the following:

  • [x] Add tests for the grids we are currently supporting:
    • no-rotated
    • different data types: double, float, long, short
  • [x] Extend support for rotated grids and add tests for them

For future PRs:

  • [x] Add tests for grids with ordering equal to -1
  • [x] Support compressed grids with gzlib
  • ~Support compressed grids with LZRW1~ (there's no need, since compressed grids use gzip even if the header says LZRW1)
  • ~Support other orderings (different than +/- 1)~ (even the docs says they support orderings =! +/-1, Oasis Montaj don't open those grids)

Makes sense? Feel free to tackle any of the first items and push here!

santisoler avatar Oct 26 '22 23:10 santisoler

@LL-Geo today I wrote some tests for this function using the grids you uploaded. Thank you very much!

I think we don't have a grid with ordering = -1. Is that true? If not, could you upload just one (double or floats will be ok) so we can test if my implementation for those is correct?

I was also trying to support the rotated grid and I was facing some issues. The plot you added in your previous comment shows the grid already rotated, but I'm worried that the easting and northing coordinates are not accurate. There's a figure in the GEOSOFT specification sheet (https://help.seequent.com/Oasis-montaj/9.9/en/Content/ss/glossary/grid_file_format__grd.htm) that shows how the rotated grid are encoded. It seems that the x_origin and y_origin correspond to the coordinates of the "left-bottom corner" (would be the left bottom if we consider the unrotated grid). In your plot it has the coordinates (20, -10), while it should have (-21, -12) according to the header of the grid file. How does this rotate grid look in Oasis? Is the figure in the specification sheet accurate or is it wrong?

BTW, the grid is stored already rotated in the file. That's curious because all the nans take a lot of space. I thought that they were storing the unrotated grid values and then the rotation was done in the coordinates, but apparently that's not the case. In that scenario, what @leouieda shared about multidimensional grids with xarray is not needed here: the grid is already registered in the easting and northing coordinates.

I would love to talk about this is a call because it's hard to make sense out of it. So, what if we leave this PR without the support for rotated grids and leave that for a different PR? In that case we would only need to add a test for ordering=-1 in order to merge this one.

santisoler avatar Nov 02 '22 23:11 santisoler

That looks awesome @santisoler ! I make an order=-1 file with a little trick (it's quite uncommon for the grid file with order=-1), and update the test. For the rotation, I think I put a different file from the uploaded data with the picture I showed before. The actual data is not rotated in the GRD file, but the coordinate. The grey is unrotated grd, and the color one is rotated by -30 (om_rodate.grd). I update this file as well.

I guess we just need to project the coordination to make it works. image

For the other goals, Support other orderings (different than +/- 1) I think there is no other ordering that exists. I change the head in the binary file to 2, it can't be loaded by oasis montaj. So we can just remove it. image

LL-Geo avatar Nov 03 '22 05:11 LL-Geo

Thanks @LL-Geo!

For the rotation, I think I put a different file from the uploaded data with the picture I showed before. The actual data is not rotated in the GRD file, but the coordinate. The grey is unrotated grd, and the color one is rotated by -30 (om_rodate.grd). I update this file as well.

I see! So we were right about how to give support for the rotated grids with xarray and multidimensional coordinates. I'll try to upload a few changes to support rotated grids as well.

I make an order=-1 file with a little trick (it's quite uncommon for the grid file with order=-1), and update the test.

Now I'm curious: what was the trick? Isn't an option in Oasis to just change the ordering?

For the other goals, Support other orderings (different than +/- 1) I think there is no other ordering that exists. I change the head in the binary file to 2, it can't be loaded by oasis montaj. So we can just remove it.

Thanks for testing this out! One more thing that we won't have to deal with then! Let's forget about orderings different than +/- 1.

santisoler avatar Nov 11 '22 01:11 santisoler

Now I'm curious: what was the trick? Isn't an option in Oasis to just change the ordering?

Haha, Oasis doesn't have a function to change order. What I did is read the binary file, transpose the main data, and change the head file according to the document file. And I read the file back to Oasis to make sure it was the same.

It has a data table to show the gird orientation: image

I also update the function to read compressed grd files. Inspired by Richard and Mark! https://github.com/Loop3D/geosoft_grid

It turns out the compressed method is Zlib (not LZRW1!) and there is a Unexplained 16 byte header in the compressed data!

I also add and delete some test for ignore the compressed grd, feel free to add them back (like make sure it support 1,2,4,8 + 1024).

I attached one toy data as well. It has 125 compressed blocks. The test data only have 1 block. ADMAP_2_R5.zip

image

@santisoler I'm a bit lazy😆, so I'll leave the rotation coordinate to you, haha. After that I think the function is perfect to go. 😄

LL-Geo avatar Nov 14 '22 13:11 LL-Geo

Good job!

RichardScottOZ avatar Nov 15 '22 08:11 RichardScottOZ

This is amazing @LL-Geo! Huge shout out to @markjessell for noticing that LZRW1 wasn't being use and for those 16 unexplained bytes.

I will work on the coordinates for the rotated grid then. I just pushed a couple of minor changes (mostly refactors).

This is already much better than the first version I drafted! Thanks for your contributions @LL-Geo!

santisoler avatar Nov 15 '22 22:11 santisoler

Actually thanks go to Evren Pakyuz-Charrier at Oslandia who did the detective work on this once I reached a dead end.

Mark Jessell Professor/ Western Australian Fellow Centre for Exploration Targeting/ SES • M006, Perth WA 6009 Australia T +61 8 6488 5803 • E @.***


From: Santiago Soler @.> Sent: Wednesday, November 16, 2022 6:42:18 AM To: fatiando/harmonica @.> Cc: Mark Jessell @.>; Mention @.> Subject: Re: [fatiando/harmonica] Read Oasis Montaj© grd files (PR #348)

This is amazing @LL-Geohttps://aus01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2FLL-Geo&data=05%7C01%7Cmark.jessell%40uwa.edu.au%7Ce30429f2ada34e562e6008dac75aa851%7C05894af0cb2846d8871674cdb46e2226%7C0%7C0%7C638041489430954337%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=0LGsjrt5CPb67oGQt5%2FXaQy%2Bbi2UT%2B5DTC%2F0Wxo58oA%3D&reserved=0! Huge shout out to @markjessellhttps://aus01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fmarkjessell&data=05%7C01%7Cmark.jessell%40uwa.edu.au%7Ce30429f2ada34e562e6008dac75aa851%7C05894af0cb2846d8871674cdb46e2226%7C0%7C0%7C638041489430954337%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=rSkd6U0c7JV67n%2BokGvckdzoNRgWKnz1I6%2F2o8DJgG8%3D&reserved=0 for noticing that LZRW1 wasn't being use and for those 16 unexplained bits.

I will work on the coordinates for the rotated grid then. I just pushed a couple of minor changes (mostly refactors).

This is already much better than the first version I drafted! Thanks for your contributions @LL-Geohttps://aus01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2FLL-Geo&data=05%7C01%7Cmark.jessell%40uwa.edu.au%7Ce30429f2ada34e562e6008dac75aa851%7C05894af0cb2846d8871674cdb46e2226%7C0%7C0%7C638041489430954337%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=0LGsjrt5CPb67oGQt5%2FXaQy%2Bbi2UT%2B5DTC%2F0Wxo58oA%3D&reserved=0!

— Reply to this email directly, view it on GitHubhttps://aus01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Ffatiando%2Fharmonica%2Fpull%2F348%23issuecomment-1315965084&data=05%7C01%7Cmark.jessell%40uwa.edu.au%7Ce30429f2ada34e562e6008dac75aa851%7C05894af0cb2846d8871674cdb46e2226%7C0%7C0%7C638041489431110579%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=t1MVKD8RZ6p9ZdNEldWOla1A7waP8crxTy5yWc1AbTU%3D&reserved=0, or unsubscribehttps://aus01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgithub.com%2Fnotifications%2Funsubscribe-auth%2FABUBSYBTS7AWJ6L2A4E7GFDWIQGUVANCNFSM577AKHXQ&data=05%7C01%7Cmark.jessell%40uwa.edu.au%7Ce30429f2ada34e562e6008dac75aa851%7C05894af0cb2846d8871674cdb46e2226%7C0%7C0%7C638041489431110579%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=gQIcQ0mhpDvnn5iuc5BtlBm5d%2BeIurVaQtF7r3qJM5M%3D&reserved=0. You are receiving this because you were mentioned.Message ID: @.***>

markjessell avatar Nov 15 '22 23:11 markjessell

We should send them a xmas hamper!

RichardScottOZ avatar Nov 16 '22 00:11 RichardScottOZ

@LL-Geo I added support for rotated grids and refactor a little bit your code. I also removed all the sample files that weren't being used in the tests. Did you have a particular test for those? If so, feel free to revert that commit if you want them back.

I think this is ready to be merged. Let me know what do you think and if you agree we can merge it right away 🚀

This was really fun to work on and thanks a lot to every one that contributed to it!

santisoler avatar Nov 17 '22 20:11 santisoler

For a 2D case anyway, could use rioxarray possibly to utilise this?

RichardScottOZ avatar Nov 24 '22 23:11 RichardScottOZ

Looks great! Mostly over-my-head, but I read through what I could understand, and tested with a whole bunch of grids I had, all of which worked great. I noticed one parameter of one function didn't have a docstring, so I added a comment for that.

That sounds great! Thanks for the effort of field testing this function.

One additional question/comment. What does the map_projection attribute mean? Most of my geosoft grids have a EPSG (3031) code defining their projection, but this doesn't seem to be preserved in the dataarray attributes. The attribute map_projection for all the grids just has a value of 0.

I noticed it before. The function is just reading the value from the header and it doesn't seem to be always related to the actual projection of the grid. I left it because it wasn't a problem to include it and sounds better to have it than loosing information while reading the file.

I noticed that the associated .xml files for my grids contain the EPSG codes. I wrote the below function to pull the code out. Would it be helpful to have this as part of the load_oasis_montaj_grid function?

def extract_proj_str(fname):
    with open(fname, "r") as f:
        for line in f:
            if "projection=" in line:
                # print lines with projection info
                print(line)

                # extract projection string
                proj = line.split('projection="')[1].split('" ')[0]
                print(proj)

                # remove non-alphanumeric characters if present
                if proj.isalnum() is False:
                    proj = ''.join(filter(str.isalnum, proj))

                assert proj.isalnum
    try:
        proj
    except:
        raise NameError("string 'projection=' not found in file.")
        proj = None

    return proj
fname = "example.grd"
proj = extract_proj_str(fname+".xml")

Here is the portion of the .xml with the EPSG code:

<projection type="POLAR_STEREOGRAPHIC" name="WGS 84 / *EPSG3031" projection="*EPSG3031" ellipsoid="WGS 84" datum="WGS 84" datumtrf="WGS 84" datumtrf_description="[WGS 84] World" radius="6378137" eccentricity="0.0818191908426215">

Thanks for this! I haven't got into the .xml file because I didn't find a reason, but the projection is something important to have in the generated DataArray.

I think we should totally include this into our function. But I'm hesitant to do it in this PR: it already has quite a bunch of commits and I would like to see it merged for the next release.

So I extend the invitation to create a new PR including the capabilities of reading the xml file. Go for it!

santisoler avatar Nov 25 '22 19:11 santisoler

For a 2D case anyway, could use rioxarray possibly to utilise this?

I assume that you are talking about the projection codes, right @RichardScottOZ ? If so, yes, you can use rioxarray to apply projections. You can also use pyproj to define the projection and the verde.project_grid function as an alternative. Checkout this example: Projection of gridded data

santisoler avatar Nov 25 '22 19:11 santisoler

Since we had a couple of positive reviews thanks to @LL-Geo and @mdtanker I'm merging this 🚀 . We have included some untested lines with this PR, but they are not critical. If anyone runs into a problem we can always solve it with more information on what is failing.

Thanks again to everyone that contributed to this PR! I'd love to see more capabilities to it added in the future like the projection information that @mdtanker mentioned.

santisoler avatar Nov 25 '22 19:11 santisoler