ml-workflow-examples
Supervised Classification of Sentinel-1 data
This is test code that uses Sentinel-1 data (already available as a GeoTIFF file for the area of interest) together with labels from Global Forest Watch (available in another GeoTIFF) for forest/no-forest classification.
Do we have a general preference on whether notebooks should include their output, or be stripped? https://github.com/pangeo-data/ml-workflow-examples/pull/2/files seems to have included outputs.
@HamedAlemo for my own understanding, this bit
This code is written as a test, and ideally there shouldn't be a need to write these data to disk and read them again. Being able to read the source Sentinel-1 imagery (in its native projection), quickly reproject it to the labels' grid, and then generate image chips on the fly is a base requirement for scaling this training to regional- and continental-scale data.
Is the key challenge you're facing with scaling this workflow right now? Any others I'm missing?
Yes, I should have included the outputs. I'll update them.
Correct, the main challenge is having to read the source GeoTIFF, write everything out as NumPy arrays, and then read them back in for training. One practical hurdle is the location of the input orthorectified Sentinel-1 data. In this case I used Google Earth Engine to extract the data and save them as GeoTIFF. After that I use the code here to process the data, generate training pairs (image + label), and then train the model.
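For reference, the "generate chips on the fly" step discussed above could look roughly like the sketch below. It is not the code from this PR: it assumes the Sentinel-1 image and the forest labels have already been brought onto the same grid and loaded as NumPy arrays, and the function name `iter_chips`, the chip size, and the synthetic arrays are all hypothetical placeholders.

```python
import numpy as np


def iter_chips(image, labels, chip_size=64, stride=None):
    """Yield (image_chip, label_chip) pairs without writing anything to disk.

    image:  array of shape (bands, height, width)
    labels: array of shape (height, width), on the same grid as `image`
    """
    if image.shape[1:] != labels.shape:
        raise ValueError("image and labels must share the same grid")
    stride = stride or chip_size  # non-overlapping chips by default
    _, height, width = image.shape
    # Walk the grid in steps of `stride`; partial chips at the edges are skipped.
    for row in range(0, height - chip_size + 1, stride):
        for col in range(0, width - chip_size + 1, stride):
            yield (
                image[:, row:row + chip_size, col:col + chip_size],
                labels[row:row + chip_size, col:col + chip_size],
            )


# Synthetic stand-ins for a two-band Sentinel-1 scene and a binary forest mask.
rng = np.random.default_rng(0)
image = rng.normal(size=(2, 256, 192)).astype("float32")
labels = (rng.random((256, 192)) > 0.5).astype("uint8")

chips = list(iter_chips(image, labels, chip_size=64))
```

Because the slices are views into the source arrays, nothing is copied until a chip is actually consumed by the training loop. The reprojection step is deliberately left out here, since it depends on how the source imagery is accessed.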
Any reason this has not been merged yet?