geo-deep-learning inference.py: keep raw heatmap to set confidence level for extracted features after post-processing

Open remtav opened this issue 3 years ago • 3 comments

Per-class inference values, taken before the argmax() function is applied, would be useful for technicians to sort extracted features by confidence level. Our inference script should let the user decide to output the raw heatmap (after softmax), not just the final one where argmax() has been applied.
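To illustrate the difference between the two outputs, here is a minimal NumPy sketch. The `softmax` helper, the `logits` array and its (classes, height, width) layout are assumptions made for the example, not the actual code or tensor layout of inference.py.

```python
import numpy as np

def softmax(logits: np.ndarray, axis: int = 0) -> np.ndarray:
    """Numerically stable softmax along the class axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

# Hypothetical model output: 3 classes over a 2x2 tile, shape (classes, height, width).
logits = np.array([
    [[2.0, 0.1], [0.3, 0.2]],
    [[0.5, 1.5], [0.3, 0.1]],
    [[0.1, 0.2], [0.3, 3.0]],
])

heatmap = softmax(logits, axis=0)   # raw heatmap: per-class probabilities for every pixel
labels = heatmap.argmax(axis=0)     # final output: one class index per pixel
confidence = heatmap.max(axis=0)    # probability of the winning class, lost once only labels are kept
```

The bottom-left pixel shows why the heatmap matters: all three classes tie there, so argmax() still returns class 0, but the heatmap reveals a confidence of only one third.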

Dec 15 '21 20:12 remtav

This looks like a debug-level setting: info, verbose, etc. Using that kind of terminology for command line arguments, as opposed to something like heatmap or probability map, would likely make the script easier to use. What the "output level" settings map to technically would need to be specified somewhere, e.g. verbose means heatmaps are saved together with the final segmentation results.
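As a rough sketch of that idea, assuming an argparse-style command line (which may not match how inference.py is actually configured), the mapping could look like this. The `--output-level` flag, the `OUTPUT_LEVELS` table and the `outputs_to_keep` helper are all hypothetical names.

```python
import argparse

# Hypothetical mapping from "output level" to the artifacts kept after inference.
OUTPUT_LEVELS = {
    "info": {"segmentation"},
    "verbose": {"segmentation", "heatmap"},
}

def build_parser() -> argparse.ArgumentParser:
    """Build an illustrative parser exposing a single output-level option."""
    parser = argparse.ArgumentParser(description="Inference output options (illustrative only)")
    parser.add_argument(
        "--output-level",
        choices=sorted(OUTPUT_LEVELS),
        default="info",
        help="info: final segmentation only; verbose: also keep the raw heatmap",
    )
    return parser

def outputs_to_keep(argv: list) -> set:
    """Return the set of outputs to keep for the given command line arguments."""
    args = build_parser().parse_args(argv)
    return OUTPUT_LEVELS[args.output_level]
```

Keeping the level-to-output mapping in one table gives a single place to document what each level means technically.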

Heatmap values give technicians an indication of potential feature-type commission errors. Would they also need some intermediate result to help them with feature delineation? I suggest we have them document their needs in this ticket.

Dec 16 '21 14:12 ymoisan

Here's an idea for implementing this:

1. Write the raw inference as a raster (.tif), before argmax but after sigmoid/softmax. Currently it is written as a .dat file with numpy memmap and is deleted at the end of inference.
2. Write the final inference as gpkg.
3. Read the final inference as a geopandas dataframe.
4. For each feature saved in the gpkg:

4.1. Get the bounds of the feature and, from those bounds, read a rasterio window of the raw inference (with per-class, per-pixel confidence levels).
4.2. Rasterize the single feature within its bounds (rectangular area) and create a numpy mask from the output (True where the feature is, False where background).
4.3. Using numpy, calculate the mean confidence for all pixels where the mask is True.
4.4. Write that mean value as an integer attribute value for a "confidence" attribute.
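The numeric core of steps 4.3 and 4.4 can be sketched with NumPy alone. This assumes the window has already been read from the raw inference raster and the feature already rasterized into a boolean mask (steps 4.1 and 4.2), so plain arrays stand in for both here. The `mean_confidence` name and the 0 to 100 integer scaling are assumptions, since the proposal only says the value is stored as an integer.

```python
import numpy as np

def mean_confidence(window: np.ndarray, mask: np.ndarray, class_index: int) -> int:
    """Mean confidence (as an integer percentage) of one class over a feature's pixels.

    window: (classes, height, width) probabilities read for the feature's bounds.
    mask:   (height, width) boolean array, True where the feature is.
    """
    if not mask.any():
        raise ValueError("feature mask covers no pixels")
    values = window[class_index][mask]           # keep only pixels inside the feature
    return int(round(float(values.mean()) * 100))

# Stand-in data: 2 classes over a 2x3 window, feature covering 3 pixels.
class_1 = np.array([[0.9, 0.8, 0.2],
                    [0.7, 0.1, 0.3]])
window = np.stack([1.0 - class_1, class_1])      # class 0 is the complement of class 1
mask = np.array([[True, True, False],
                 [True, False, False]])

feature_confidence = mean_confidence(window, mask, class_index=1)
```

For these stand-in values the three masked pixels of class 1 are 0.9, 0.8 and 0.7, so the stored attribute would be 80.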

Step 4 could be parallelized using Python multiprocessing, which would speed up the whole process. See the example of a multiprocessing implementation in my solaris_tiling branch.

It goes without saying that, in our use case (semantic segmentation), this would be pretty "calculation" and "memory" intensive. Tests need to be done.

Mar 01 '22 19:03 remtav

Could we avoid vectorizing then re-rasterizing by writing a 2-channel raster with the predicted class value in channel 1 and a confidence value in channel 2? We could use a flag (e.g. verbose=true?) as a runtime parameter that would mean the raster output is kept. That parameter would default to false.
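A minimal sketch of that 2-channel layout, using NumPy only. The `to_two_band` name, the uint8 dtype and the 0 to 100 confidence scaling are assumptions chosen so both channels share one dtype. It also assumes fewer than 256 classes, and writing the array to disk is left out.

```python
import numpy as np

def to_two_band(heatmap: np.ndarray) -> np.ndarray:
    """Stack the class map and a 0-100 confidence band into one (2, H, W) uint8 array.

    heatmap: (classes, height, width) probabilities, after softmax.
    """
    labels = heatmap.argmax(axis=0)                   # channel 1: predicted class per pixel
    confidence = np.rint(heatmap.max(axis=0) * 100)   # channel 2: winning probability in percent
    return np.stack([labels, confidence]).astype(np.uint8)

# Stand-in heatmap: 2 classes over a 2x2 tile.
class_0 = np.array([[0.9, 0.4],
                    [0.5, 0.2]])
heatmap = np.stack([class_0, 1.0 - class_0])

two_band = to_two_band(heatmap)
```

Channel 2 holds only the winning class probability, so this is lighter than keeping the full per-class heatmap, but the probabilities of the other classes are no longer available downstream.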

Mar 01 '22 20:03 ymoisan
