Proposed Recipes for GOES-16 and GOES-17 from AWS

Open rabernat opened this issue 2 years ago • 4 comments

Source Dataset

GOES satellites (GOES-16 & GOES-17) provide continuous weather imagery and monitoring of meteorological and space environment data across North America. GOES satellites provide the kind of continuous monitoring necessary for intensive data analysis. They hover continuously over one position on the surface. The satellites orbit high enough to allow for a full-disc view of the Earth. Because they stay above a fixed spot on the surface, they provide a constant vigil for the atmospheric "triggers" for severe weather conditions such as tornadoes, flash floods, hailstorms, and hurricanes. When these conditions develop, the GOES satellites are able to monitor storm development and track their movements.

  • Link to the website / online documentation for the data
    • https://registry.opendata.aws/noaa-goes/
    • https://github.com/awslabs/open-data-docs/tree/main/docs/noaa/noaa-goes16
  • The file format (e.g. netCDF, csv): netCDF
  • How are the source files organized? (e.g. one file per day): many files per day
  • How are the source files accessed: S3 and HTTPS
    • provide an example link if possible: https://noaa-goes17.s3.amazonaws.com/ABI-L1b-RadC/2018/240/00/OR_ABI-L1b-RadC-M3C03_G17_s20182400027156_e20182400029527_c20182400029559.nc
  • Any special steps required to access the data (e.g. password required): No
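The example link above already shows the bucket layout (product / year / day of year / hour) and packs most of the metadata into the file name. A minimal sketch of pulling that apart, assuming the name encodes scan mode (`M3`), band (`C03`), satellite (`G17`), and start/end/creation stamps as year, day of year, hour, minute, second, and tenths of a second. The helper names `describe` and `hour_prefix` are made up for illustration:

```python
import re
from datetime import datetime, timedelta

EXAMPLE_URL = (
    "https://noaa-goes17.s3.amazonaws.com/ABI-L1b-RadC/2018/240/00/"
    "OR_ABI-L1b-RadC-M3C03_G17_s20182400027156_e20182400029527_c20182400029559.nc"
)

# Field meanings are an assumption based on the example file name above.
ABI_NAME = re.compile(
    r"OR_(?P<product>ABI-L1b-Rad\w+)-M(?P<mode>\d)C(?P<band>\d{2})"
    r"_G(?P<sat>\d{2})_s(?P<start>\d{14})_e(?P<end>\d{14})_c(?P<created>\d{14})\.nc$"
)


def parse_stamp(stamp: str) -> datetime:
    """Convert a 14-digit stamp (YYYYJJJHHMMSS + tenths of a second)."""
    base = datetime.strptime(stamp[:13], "%Y%j%H%M%S")
    return base + timedelta(milliseconds=100 * int(stamp[13]))


def describe(url: str) -> dict:
    """Extract product, mode, band, satellite, and scan times from a URL."""
    name = url.rsplit("/", 1)[-1]
    match = ABI_NAME.match(name)
    if match is None:
        raise ValueError(f"unrecognized ABI file name: {name}")
    return {
        "product": match["product"],
        "mode": int(match["mode"]),
        "band": int(match["band"]),
        "satellite": f"GOES-{match['sat']}",
        "start": parse_stamp(match["start"]),
        "end": parse_stamp(match["end"]),
    }


def hour_prefix(bucket: str, product: str, when: datetime) -> str:
    """Build the HTTPS prefix holding one hour of files for a product."""
    return f"https://{bucket}.s3.amazonaws.com/{product}/{when:%Y/%j/%H}/"


info = describe(EXAMPLE_URL)
print(info["satellite"], "band", info["band"], "scan start", info["start"])
print(hour_prefix("noaa-goes17", "ABI-L1b-RadC", info["start"]))
```

A recipe would presumably generate its input list by walking these hourly prefixes rather than by hard-coding file names, since the stamps in each name are not predictable in advance.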

Transformation / Alignment / Merging

I believe everything can be stacked into a single massive datacube.

Output Dataset

Zarr or Kerchunk-Zarr

cc @darothen

rabernat, May 09 '22 17:05

Some additional details:

  1. @blaylockbk has a nice download page here
  2. @blaylockbk also maintains a package at blaylockbk/goes2go/ with programmatic access to the AWS/NOAA servers for this data

I believe everything can be stacked into a single massive datacube.

There's one caveat here: the red and blue channels in the visible spectrum are actually double the resolution of the other bands, so you need to account for this in the underlying coordinate system(s) for any catalog which "stacks" the data across time.

The data has a high temporal refresh rate: roughly every 10 minutes for the CONUS and Full Sector imagery.
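To put the refresh rate in terms of file counts, here is a rough back-of-the-envelope sketch. It takes the ~10 minute cadence quoted above at face value and assumes one file per band per scan (as the `C03` in the example file name suggests) across the 16 ABI bands. The function name is purely illustrative:

```python
MINUTES_PER_DAY = 24 * 60
ABI_BANDS = 16  # assumption: one L1b file per band per scan


def files_per_day(cadence_minutes: int, bands: int = ABI_BANDS) -> int:
    """Estimate the number of files one sector of one satellite produces daily."""
    scans = MINUTES_PER_DAY // cadence_minutes
    return scans * bands


per_sector = files_per_day(10)
print("files per sector per day:", per_sector)
print("files per sector per year:", per_sector * 365)
print("two sectors, two satellites, per day:", per_sector * 2 * 2)
```

Even as an estimate, this is the scale that any stacking or indexing step has to cope with.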

There are a lot of L2 derivative products, but the L1b radiances are the low-hanging fruit here and have significant utility across many, many use cases. Happy to write a few user stories if someone is looking for justification for spending time on this.

It may be worthwhile carving out smaller geographical sectors from the CONUS or Full Sector imagery, given the size of the raw data and downstream use cases.

darothen, May 09 '22 18:05

More thoughts:

  • The very high refresh rate is a perfect use case for the appending capability discussed in https://github.com/pangeo-forge/user-stories/issues/5. It would be awesome to make this a near-real-time recipe. But for the shorter term, simply getting a static recipe working would be best.
  • Given the massive size of the dataset, we definitely don't want to copy the data. We need a kerchunk recipe.

There's one caveat here: the red and blue channels in the visible spectrum are actually double the resolution of the other bands, so you need to account for this in the underlying coordinate system(s) for any catalog which "stacks" the data across time.

The way we can handle this today is by simply having different recipes for the different resolution products, and building them to separate datasets. Would that be acceptable?
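A rough sketch of what that split could look like on the input side: group file names by the band encoded in the name, then by nominal resolution, so each group feeds its own recipe. The resolution table is an assumption taken from my understanding of the ABI bands rather than from anything in this thread, so please verify it. The second and third file names are hypothetical variations on the real example:

```python
import re
from collections import defaultdict

# Assumed nominal resolutions in km; bands not listed default to 2 km.
NOMINAL_KM = {1: 1.0, 2: 0.5, 3: 1.0, 5: 1.0}
DEFAULT_KM = 2.0


def band_of(name: str) -> int:
    """Read the band number from the 'M<mode>C<band>' part of a file name."""
    match = re.search(r"-M\dC(\d{2})_", name)
    if match is None:
        raise ValueError(f"no band found in: {name}")
    return int(match.group(1))


def split_by_resolution(names: list[str]) -> dict[float, list[str]]:
    """Group file names so that each group shares one x/y grid."""
    groups = defaultdict(list)
    for name in names:
        groups[NOMINAL_KM.get(band_of(name), DEFAULT_KM)].append(name)
    return dict(groups)


names = [
    # Real example from the issue description (band 3).
    "OR_ABI-L1b-RadC-M3C03_G17_s20182400027156_e20182400029527_c20182400029559.nc",
    # Hypothetical names: same scan with the band field changed.
    "OR_ABI-L1b-RadC-M3C02_G17_s20182400027156_e20182400029527_c20182400029559.nc",
    "OR_ABI-L1b-RadC-M3C13_G17_s20182400027156_e20182400029527_c20182400029559.nc",
]

for km, group in sorted(split_by_resolution(names).items()):
    print(f"{km} km recipe: {len(group)} file(s)")
```

Each group can then be concatenated along time on its own, without reconciling coordinate systems across resolutions.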

rabernat, May 09 '22 20:05

The way we can handle this today is by simply having different recipes for the different resolution products, and building them to separate datasets. Would that be acceptable?

Sounds good! Simple solutions are always best.

darothen, May 09 '22 22:05

There may be interest in tackling this on @GoogleCloudPlatform, too. Tagging @shanecglass and @alxmrs for visibility.

darothen, May 11 '22 17:05