Mask to specific waterbody extent

Open NickSievert opened this issue 11 months ago • 5 comments

If I'm interpreting the documentation correctly, the estimates are derived from all water pixels within the 2000 meter bounding box. In many cases (when working with small waterbodies), multiple distinct waterbodies may fall within the bounding box associated with a single point. Is it possible to apply a mask so that, rather than returning results for all cells within the bounding box, only values associated with a specific, defined waterbody are returned?

NickSievert avatar Feb 14 '25 14:02 NickSievert

I believe the estimates are derived from all water pixels within the 2000 meter bounding box

@NickSievert yep, you are correct. This is a current limitation, and it is especially impactful for smaller water bodies near other water bodies. One option is to limit to contiguous water pixels, though this gets tricky because some neighboring water pixels may have cloud cover. I think a more robust option is to mask to pixels within the water body boundary, as you suggest. It would be useful if we had a fixed asset that defined water body boundaries across the U.S. (since that's where all the training data is from). From a cursory search, ArcGIS has a USA Detailed Water Bodies shapefile, but I'm not sure if that will be accurate down to the river and reservoir level.
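To illustrate why the contiguous-pixel option gets tricky, here is a minimal sketch in plain Python. It is not CyFi code: `contiguous_water` and the list-of-lists mask layout are hypothetical names chosen for this example, and it assumes cloud-covered pixels have already been marked as non-water in the mask. It flood-fills outward from the pixel containing the sampling point and keeps only the water pixels reachable through 4-connected neighbors.

```python
from collections import deque


def contiguous_water(water_mask, start):
    """Keep only water pixels 4-connected to the pixel at `start` (row, col).

    `water_mask` is a list of rows of booleans (True = usable water pixel).
    Cloud-covered pixels are assumed to be False already.
    """
    rows, cols = len(water_mask), len(water_mask[0])
    keep = [[False] * cols for _ in range(rows)]
    r0, c0 = start
    if not water_mask[r0][c0]:
        return keep  # the sampling pixel itself is not water: nothing to keep

    keep[r0][c0] = True
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            in_bounds = 0 <= nr < rows and 0 <= nc < cols
            if in_bounds and water_mask[nr][nc] and not keep[nr][nc]:
                keep[nr][nc] = True
                queue.append((nr, nc))
    return keep


# One lake, five pixels wide, with a cloud hiding the middle column.
lake_with_cloud = [[True, True, False, True, True]]
kept = contiguous_water(lake_with_cloud, start=(0, 0))
```

In this toy case the fill stops at the cloud, so the two pixels on the far side of the same lake are dropped even though they belong to it. A boundary-based mask does not have this failure mode.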

I'm curious where you're looking to use CyFi and if you have boundaries for your water bodies of interest?

ejm714 avatar Feb 14 '25 18:02 ejm714

Thanks for your prompt response. I've been using CyFi to develop a historical dataset and monitoring program for waterbodies managed by the Missouri Department of Conservation (my employer). As you mentioned in your response, many of the waterbodies we manage are quite small (but large enough to be resolved with Sentinel-2 imagery) and near enough to one another that multiple waterbodies fall within a single 2000 meter bounding box placed at a given waterbody centroid.

We have boundaries for all of our waterbodies of interest; however, the dataset we use is specific to the state of Missouri. At a national level, the National Hydrography Dataset has a waterbody layer that would provide a useful, universal (USA only) waterbody polygon layer. (Unfortunately, this wouldn't capture rivers/streams, but that could potentially be addressed by incorporating the NHDArea layer.)

https://hydro.nationalmap.gov/arcgis/rest/services/nhd/MapServer/10

The layer should be viewable on the national map: https://apps.nationalmap.gov/viewer/

Really appreciate your help!

NickSievert avatar Feb 14 '25 18:02 NickSievert

@NickSievert Thanks for the links! We don't currently have funded development time for this but I'll try to take a quick pass at a low lift version. That said, I would love to hear more about the Missouri Department of Conservation use case and see if this is something we could collaborate on. For example, we (DrivenData) offer custom model training and model pipeline productionization services so let me know if that's of interest!

ejm714 avatar Feb 28 '25 22:02 ejm714

@ejm714 Thanks for reaching out. Completely understand the development time challenge.

Given the issues associated with including all surface waters within the bounding box rather than tying predictions to defined waterbodies, CyFi won't work for our purposes (and this is an issue I worry may be overlooked by other users of your tool).

I'm working on developing a solution for our agency which will likely use the Harmonized Landsat/Sentinel-2 product to develop waterbody-specific bloom histories for 1000+ waterbodies in our state, along with a pipeline that will monitor newly collected imagery for potential active blooms. I appreciate your offer to collaborate and would certainly appreciate any expertise or services you all could provide, but at this time we do not have any funds available to support this effort (aside from my time).

NickSievert avatar Mar 19 '25 12:03 NickSievert

@chrisjkuch here are a couple of rough brain-dump notes on this potential feature:

  • as a first pass, we can use a fixed shapefile asset that contains US water bodies (similar to what we do for land cover)
    • we should look to see what minimum water body size it captures (and if it captures rivers and reservoirs)
  • should have a try/except for looking up the water body in which the sampling point is located (we may want to add some margin of error since sample points can be on docks, meaning right on the water boundary)
  • here is where we limit just to the water pixels in the image; this is where we would need the additional mask based on the water boundary
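
A rough sketch of how these notes could fit together is below. It is an illustration under stated assumptions, not CyFi's implementation: every function name (`point_in_polygon`, `distance_to_polygon`, `find_water_body`, `water_body_mask`), the 10 m margin, and the toy pond polygons are hypothetical. It uses plain Python on simple polygons in projected (meter) coordinates; a real version would more likely read the shapefile and rasterize the boundary with a geospatial library.

```python
from math import hypot, inf

Point = tuple[float, float]
Polygon = list[Point]  # simple ring without holes, projected coordinates in meters


def point_in_polygon(pt: Point, poly: Polygon) -> bool:
    """Ray-casting test. Points exactly on the boundary may land on either side."""
    x, y = pt
    inside = False
    for i in range(len(poly)):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % len(poly)]
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def distance_to_polygon(pt: Point, poly: Polygon) -> float:
    """Distance from pt to the polygon; 0.0 if pt is inside it."""
    if point_in_polygon(pt, poly):
        return 0.0
    x, y = pt
    best = inf
    for i in range(len(poly)):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % len(poly)]
        dx, dy = x2 - x1, y2 - y1
        seg_len_sq = dx * dx + dy * dy
        # Project pt onto the edge and clamp to the segment's end points.
        t = 0.0 if seg_len_sq == 0 else ((x - x1) * dx + (y - y1) * dy) / seg_len_sq
        t = max(0.0, min(1.0, t))
        best = min(best, hypot(x - (x1 + t * dx), y - (y1 + t * dy)))
    return best


def find_water_body(pt: Point, water_bodies: dict[str, Polygon], margin_m: float = 10.0) -> str:
    """Return the closest water body within margin_m of pt, else raise LookupError.

    The margin covers sample points taken from docks or the shoreline.
    """
    candidates = [(distance_to_polygon(pt, poly), name) for name, poly in water_bodies.items()]
    in_range = [c for c in candidates if c[0] <= margin_m]
    if not in_range:
        raise LookupError(f"no water body within {margin_m} m of {pt}")
    return min(in_range)[1]


def water_body_mask(water_mask, poly: Polygon, origin: Point, pixel_size: float):
    """Keep water pixels whose center falls inside poly.

    origin is the (x, y) of the grid's top-left corner; rows run north to south.
    """
    x0, y0 = origin
    return [
        [
            is_water
            and point_in_polygon((x0 + (c + 0.5) * pixel_size, y0 - (r + 0.5) * pixel_size), poly)
            for c, is_water in enumerate(row)
        ]
        for r, row in enumerate(water_mask)
    ]


# Two toy ponds inside one bounding box, on a 2 x 5 grid of 10 m pixels.
ponds = {
    "pond_a": [(0.0, 0.0), (20.0, 0.0), (20.0, 20.0), (0.0, 20.0)],
    "pond_b": [(30.0, 0.0), (50.0, 0.0), (50.0, 20.0), (30.0, 20.0)],
}
water = [
    [True, True, False, True, True],
    [True, False, False, True, True],
]
dock_point = (-5.0, 10.0)  # 5 m outside pond_a, as if sampled from a dock

try:
    name = find_water_body(dock_point, ponds, margin_m=10.0)
    masked = water_body_mask(water, ponds[name], origin=(0.0, 20.0), pixel_size=10.0)
except LookupError:
    name = None
    masked = water  # fall back to all water pixels in the bounding box
```

Falling back to the unmasked water pixels when the lookup fails keeps the current behavior for points that are not covered by the boundary asset; whether that is the right fallback is an open design question.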

ejm714 avatar Apr 29 '25 03:04 ejm714