PROJ-data
PROJ-data copied to clipboard
Add NGA EGM 2008 1x1 grid_alternative entry
In order to account for the missing Und_min1x1_egm2008_isw=82_WGS84_TideFree.gz file in the CDN, we need to add a grid_alternative entry to point to the 1' geotiff file.
('Und_min1x1_egm2008_isw=82_WGS84_TideFree.gz','us_nga_egm2008_1.tif','egm08_1.gtx','GTiff','geoid_like',0,NULL,'https://cdn.proj.org/us_nga_egm2008_1.tif',1,1,NULL)
1' file can be found at https://grid-partner-share.s3.amazonaws.com/egm2008/us_nga_egm2008_1.tif
Hi @msmitherdc Who created / what was the creation process for that file ?
This file is 427 MB large. I don't think we want to include that directly in this github repo, nor in the PROJ-data tarball. We can upload it to cdn.proj.org, but should probably have a hint somewhere to keep track of such files that are CDN only. Probably a cdn_only_files.geojson stored in this repository that can be ingested by https://github.com/OSGeo/PROJ-data/blob/master/regenerate_index_html.py to include it in the generated https://github.com/OSGeo/PROJ-data/blob/master/files.geojson . Otherwise people using projsync wouldn't get it. And a mention of it in https://github.com/OSGeo/PROJ-data/blob/master/us_nga/us_nga_README.txt
CC @hobu @kbevers @jjimenezshaw
Looking at the metadata of the 1' file, I see:
AREA_OR_POINT=Point
target_crs_epsg_code=3855
TIFFTAG_COPYRIGHT=Derived from work by USA NGA
TIFFTAG_IMAGEDESCRIPTION=WGS 84 (EPSG::4979) to EGM2008 height (EPSG::3855)
TYPE=VERTICAL_OFFSET_GEOGRAPHIC_TO_VERTICAL
compared to the existing us_nga_egm08_25.tif:
area_of_use=World
AREA_OR_POINT=Point
target_crs_epsg_code=3855
TIFFTAG_COPYRIGHT=Derived from work by NGA. Public Domain
TIFFTAG_DATETIME=2019:12:27 00:00:00
TIFFTAG_IMAGEDESCRIPTION=WGS 84 (EPSG:4979) to EGM2008 height (EPSG:3855). Converted from egm08_25.gtx (last modified at 2018/10/08)
TYPE=VERTICAL_OFFSET_GEOGRAPHIC_TO_VERTICAL
So the optional "area_of_use=World" is missing. And the mention of the license (Public Domain?) in the TIFFTAG_COPYRIGHT too
I believe it was created from the .gtx file and yes, I just meant it to go into the cdn. Its referenced here https://earth-info.nga.mil/php/download.php?file=egm-08interpolation but not included. This is a file we've had for a while and not sure if it came direct from NGA or not. I can work with NGA to see if I can get something official
Hmm looks like the file came from https://s3-eu-west-1.amazonaws.com/download.agisoft.com/gtg/us_nga_egm2008_1.tif https://www.agisoft.com/downloads/geoids/ and it says Converted from [USA NGA](http://earth-info.nga.mil/) data under Public Domain license.
ok, we could potentially recreate the file from the data and hints at https://geographiclib.sourceforge.io/C++/doc/geoid.html#geoidinst since @cffk is a trusted source for that topic!
At first glance, agisoft file seems OK:
gdallocationinfo -wgs84 us_nga_egm2008_1.tif 2 49
Report:
Location: (10920P,2460L)
Band 1:
Value: 44.6771507263184
compared to GeographicLib PGM dataset which has a scale of 0.003 and offset of -108:
gdallocationinfo -wgs84 geoids/egm2008-1.pgm 2 49
Report:
Location: (120P,2460L)
Band 1:
Value: 50892
$ python -c "print 50892 * 0.003 - 108"
44.676000000000016
I’ve reached out to NGA to get any updates direct from them.
Procedure to generate a GeoTIFF file ranging from ~ -180 to ~ 180 deg longitude from the PGM file which ranges from ~ 0 to ~ 360:
gdal_translate geoids/egm2008-1.pgm west.vrt -srcwin 10800 0 10800 10801 -a_ullr -180.00833333333333333 90.008333333333333333 -0.0083333333333333 -90.008333333333333
gdal_translate geoids/egm2008-1.pgm east.vrt -srcwin 0 0 10800 10801
gdalbuildvrt out.vrt west.vrt east.vrt
gdal_translate out.vrt us_nga_egm08_1min.tif -co compress=deflate -co tiled=yes -co num_threads=12 -a_offset -108 -a_scale 0.003 -co predictor=2 -mo AREA_OR_POINT=Point -mo target_crs_epsg_code=3855 -a_srs EPSG:4979 # some missing metadata to add
echo 2 49 0 | ~/proj/proj/build_cmake/bin/cct +proj=vgridshift +grids=us_nga_egm08_1min.tif +multiplier=1
2.0000000000 49.0000000000 44.6760 inf
The resulting file is actually 78 MB large due to using integer storage !
The resulting file is actually 78 MB large due to using integer storage !
What?! the file us_nga_egm08_25.tif
(with 2.5 min step) is 80 MB!
I remember that 3 years ago the file was available in the NGA webpage, and I downloaded it. Unfortunately I deleted it when I changed my laptop. I did compile the fortran code to produce an ascii file from the binary provided.
If the final output is really only 78 MB we could include it. I was expecting something more than 6 times the 2.5' version, that would be around 500 MB! If it is finally ~500 MB, definitely we should do something special for it. (BTW, is it really needed at that 1 minute accuracy?)
If we can compress it that much, should we do the same to existing files? What are the drawbacks?
Having it in the list of alternatives but not in the standard package could be a problem with the option ONLY_BEST
?
the file
us_nga_egm08_25.tif
(with 2.5 min step) is 80 MB!
yes, because if uses Float32 encoding, whereas the PGM file used for the 1' file uses scaled UInt16 data type. And the PREDICTOR=3 mode that applies to floating-point numbers is less efficient than PREDICTOR=2 on UInt16 data. The 2.5'' could also be reduced in a similar manner. The scaled encoding "only" allows a 0.003 m precision on the data, but the GeographicLib page mentions "However, the resulting quantization error (up to 1.5 mm) is typically smaller than the linear interpolation errors."
Having it in the list of alternatives but not in the standard package could be a problem with the option ONLY_BEST?
Well, if the 1' grid is listed in grid_alternatives but not available locally or PROJ_NETWORK is not enabled, then yes when enabling ONLY_BEST, it would logically complain
Here is the official file from NGA s3://grid-partner-share/egm2008/Und_min1x1_egm2008_isw=82_WGS84_TideFree.gz and a presentation about EGM s3://grid-partner-share/egm2008/NPavlis&al_EGU2008.ppt
And I'm sorry that the file is 800Mb.
Thanks @msmitherdc
How can I download that s3
url?
https://grid-partner-share.s3.amazonaws.com/egm2008/Und_min1x1_egm2008_isw%3D82_WGS84_TideFree.gz but there's indeed a permission issue. Anyway I don't think that file is directly usable as it must contain some "harmonics coefficients" or things like that, and that's GeographicLib that has done the heavy work of converting that to a grid
Permissions should be fixed
If I understood the format correctly, this script should convert to gtx (actually the data is pretty similar)
import struct
def main(src, dst):
cols = 21600
rows = 10801
with open(dst, 'wb') as gtx, open(src, 'rb') as infile:
gtx.write(struct.pack('>ddddii', -90, 0, 1.0/60, 1.0/60, rows, cols))
rs = []
for count in range(rows):
infile.read(4)
line = infile.read(cols*4)
if not line:
return 1
rs.append(line)
infile.read(4)
for row in reversed(rs):
gtx.write(row)
return 0
if __name__ == "__main__":
src = 'Und_min1x1_egm2008_isw=82_WGS84_TideFree'
dst = 'egm2008.gtx'
exit(main(src, dst))
The file does not contain the harmonics, but the geoid undulations directly. Before and after each row it has 4 bites with the integer value 86400.
If I understood the format correctly, this script should convert to gtx (actually the data is pretty similar)
cool! I assumed it wasn't a grid... and do you get the same values as https://grid-partner-share.s3.amazonaws.com/egm2008/us_nga_egm2008_1.tif / https://s3-eu-west-1.amazonaws.com/download.agisoft.com/gtg/us_nga_egm2008_1.tif ?
Those tif files that you mention @rouault have 21601 columns, while the original has 21600.
In the file README_WGS84_2.pdf
from https://earth-info.nga.mil/php/download.php?file=egm-08interpolation it says:
(3) Und_min1x1_egm2008_isw=82_WGS84_TideFree.gz
This file holds the 1'x1' geoid undulations in sequential, unformatted, binary format, with each geoid undulation stored as a single, REAL*4 value. Each record contains all of the geoid undulations for a single parallel band. Each geoid undulation is situated at the corner of its respective cell, such that the top-left value in the 1'x1' grid has a longitude of 0° East and a latitude of 90° North. The first record in the file contains the northern-most parallel, and the first value in each record is the western-most value for that parallel. Note that each 1'x1' value situated on the zero meridian appears only once, as the first value in its respective record, at a longitude of 0° East. These values are NOT repeated at the end of their respective records, at longitude of 360° East. As such, this file contains 10801 rows x 21600 columns of geoid undulations.
And some stats:
Global statistics (meters) for this file are: Number of Values: 233301600 Percentage of Area: 100.000 Minimum Value: -106.910 Latitude of Minimum: 4.683 Longitude of Minimum: 78.767 Maximum Value: 85.840 Latitude of Maximum: -8.400 Longitude of Maximum: 147.367 Arithmetic Mean: -1.317 Area-Weighted Mean: -0.463 Arithmetic RMS: 29.274 Area-Weighted RMS: 30.596 Arithmetic S.Dev.: 29.244 Area-Weighted S.Dev.: 30.593
Apart from that, comparing with QGIS they have the same values (in the common columns).
Comparing with the file egm2008-1.pgm
the maximum difference is 1.5 mm
(I'm always unsure about center or corner of the pixel. Also in the python script I did to convert to gtx. Anybody with more knowledge should check that, given the description I pasted above)
I'm always unsure about center or corner of the pixel.
This data is "pixel is point". I.e., the data give the heights at the corners of the cells (at lat + lon = integer multiples of 1' in this case).
Is there some agreement that the 1.5 mm max difference due to quantization if using a UInt16 data type + offset/scale is acceptable ?
For me 1.5 mm is fine. I asked a fried from the university and agrees. The model is worse than that. Another question is... do we need the version with 1' when we already have it with 2.5'?
do we need the version with 1' when we already have it with 2.5'?
Taking a few "random" points, when doing bilinear interpolation, the difference between both versions is about within +/- 1.5 mm quantization error... It might be that I'm just out of luck, and it might be possible that in some areas, there is a significant difference... @msmitherdc Did you find that the 1' model was leading to more accurate results than the 2.5' one ?
($ echo 2.5 49.5 0 | ~/proj/proj/build_cmake/bin/cct +proj=vgridshift +grids=us_nga_egm08_1min.tif +multiplier=1
2.5000000000 49.5000000000 44.7390 inf
($ echo 2.5 49.5 0 | PROJ_NETWORK=ON ~/proj/proj/build_cmake/bin/cct +proj=vgridshift +grids=us_nga_egm08_25.tif +multiplier=1
2.5000000000 49.5000000000 44.7397 inf
($ echo -100.1 40.1 0 | ~/proj/proj/build_cmake/bin/cct +proj=vgridshift +grids=us_nga_egm08_1min.tif +multiplier=1
-100.1000000000 40.1000000000 -24.7800 inf
($ echo -100.1 40.1 0 | PROJ_NETWORK=ON ~/proj/proj/build_cmake/bin/cct +proj=vgridshift +grids=us_nga_egm08_25.tif +multiplier=1
-100.1000000000 40.1000000000 -24.7784 inf
($ echo -110.1 45.1 0 | ~/proj/proj/build_cmake/bin/cct +proj=vgridshift +grids=us_nga_egm08_1min.tif +multiplier=1
-110.1000000000 45.1000000000 -7.5630 inf
($ echo -110.1 45.1 0 | PROJ_NETWORK=ON ~/proj/proj/build_cmake/bin/cct +proj=vgridshift +grids=us_nga_egm08_25.tif +multiplier=1
-110.1000000000 45.1000000000 -7.5642 inf
($ echo 130.1 -50.1 0 | ~/proj/proj/build_cmake/bin/cct +proj=vgridshift +grids=us_nga_egm08_1min.tif +multiplier=1
130.1000000000 -50.1000000000 -22.0530 inf
($ echo 130.1 -50.1 0 | PROJ_NETWORK=ON ~/proj/proj/build_cmake/bin/cct +proj=vgridshift +grids=us_nga_egm08_25.tif +multiplier=1
130.1000000000 -50.1000000000 -22.0523 inf
$ echo 130.1 50.1 0 | ~/proj/proj/build_cmake/bin/cct +proj=vgridshift +grids=us_nga_egm08_1min.tif +multiplier=1
130.1000000000 50.1000000000 14.9640 inf
$ echo 130.1 50.1 0 | PROJ_NETWORK=ON ~/proj/proj/build_cmake/bin/cct +proj=vgridshift +grids=us_nga_egm08_25.tif +multiplier=1
130.1000000000 50.1000000000 14.9657 inf
actually, my issue was that reading a EGM08 vertical datum file via PDAL was returning this message
(pdal info readers.las Error) GDAL failure (1) PROJ: Cannot open https://cdn.proj.org/Und_min1x1_egm2008_isw=82_WGS84_TideFree.gz: HTTP error 404: <?xml version="1.0" encoding="UTF-8"?>
<Error><Code>NoSuchKey</Code><Message>The specified key does not exist.</Message><Key>Und_min1x1_egm2008_isw=82_WGS84_TideFree.gz</Key><RequestId>JA240W92M02FW4GK</RequestId><HostId>DCOiZUnwWIQdxL8bK9wkprDQiq7kvM1q9ujL2qaCbinfCZ6fbcDN4C6glQQDTtKO7nE1OgYvUEo=</HostId></Error>
(pdal info readers.las Error) GDAL failure (1) PROJ: Cannot open https://cdn.proj.org/Und_min1x1_egm2008_isw=82_WGS84_TideFree.gz: HTTP error 404: <?xml version="1.0" encoding="UTF-8"?>
<Error><Code>NoSuchKey</Code><Message>The specified key does not exist.</Message><Key>Und_min1x1_egm2008_isw=82_WGS84_TideFree.gz</Key><RequestId>JA240W92M02FW4GK</RequestId><HostId>DCOiZUnwWIQdxL8bK9wkprDQiq7kvM1q9ujL2qaCbinfCZ6fbcDN4C6glQQDTtKO7nE1OgYvUEo=</HostId></Error>
actually, my issue was that reading a EGM08 vertical datum file via PDAL was returning this message
ah, then it is something else... This error is a bit surprising as PROJ shouldn't try to fetch from the CDN a grid it doesn't know in grid_alternatives. I can't reproduce with:
echo 49 2 0 | PROJ_NETWORK=ON PROJ_DEBUG=ON bin/cs2cs EPSG:4326+3855 EPSG:4979
It tries to access a local Und_min1x1_egm2008_isw=82_WGS84_TideFree.gz file, but doesn't try to get it from the CDN
the geotiff keys on the file that triggers the lookup look like
Geotiff_Information:
Version: 1
Key_Revision: 1.0
Tagged_Information:
End_Of_Tags.
Keyed_Information:
GTModelTypeGeoKey (Short,1): ModelTypeProjected
GTRasterTypeGeoKey (Short,1): RasterPixelIsPoint
ProjectedCSTypeGeoKey (Short,1): PCS_WGS84_UTM_zone_18N
ProjLinearUnitsGeoKey (Short,1): Linear_Meter
VerticalCSTypeGeoKey (Short,1): Code-3855 (EGM2008 height)
VerticalDatumGeoKey (Short,1): Code-1027 (EGM2008 geoid)
VerticalUnitsGeoKey (Short,1): Linear_Meter
Unknown-32777 (Ascii,6): "G1674"
End_Of_Keys.
End_Of_Geotiff.
the geotiff keys on the file that triggers the lookup look like
This should be reported as a PDAL issue with precise instructions. It might be a PROJ issue, or a GDAL issue or a PDAL issue, but that's unclear at that point.
My hunch is it is a libgeotiff issue, since we're just handing it all off to that in this case.