
Add runtime parameters for inference outputs

Open ymoisan opened this issue 3 years ago • 2 comments

Geo-deep-learning provides sensible default filenames for inference outputs, e.g. by appending _inference to the input filename. This is reasonable when running inference on local resources, but not in the more general case of a remote runtime environment. This ticket is about providing all the runtime information needed for a remote inference run to communicate with external systems where we may want to persist process artifacts, such as logs.

Tasks include:

  • [ ] querying for pending inference tasks to be run in remote system
  • [ ] updating production status of tasks
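A minimal sketch of the two tasks above against a hypothetical task-tracking system. The task schema (`{"id", "status"}`) and the status values (`"pending"`, `"done"`) are assumptions for illustration, not part of geo-deep-learning:

```python
# Hypothetical sketch: the task records and status values below are
# assumptions, not an existing geo-deep-learning API.

def pending_tasks(tasks):
    """Return the inference tasks still waiting to be run remotely."""
    return [t for t in tasks if t["status"] == "pending"]

def status_update(task, new_status):
    """Build the payload used to update a task's production status."""
    return {"id": task["id"], "status": new_status}

tasks = [
    {"id": 1, "status": "pending"},
    {"id": 2, "status": "done"},
]
print([t["id"] for t in pending_tasks(tasks)])  # [1]
print(status_update(tasks[0], "done"))          # {'id': 1, 'status': 'done'}
```

In a real deployment these helpers would wrap calls to whatever remote system tracks the inference queue.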

Prerequisites

  • [ ] define runtime parameters stored in a YAML configuration file, mainly full path names for a variety of inference outputs including:

In a later version, those scripts should be decoupled from geo-deep-learning.

ymoisan avatar Feb 28 '22 21:02 ymoisan

Cool @ymoisan! Have a look at a first try at inference runtime parameters here. Would this be a good start?

remtav avatar Mar 02 '22 19:03 remtav

@remtav interesting. Except for a few parameters in the post-processing section, those are input-related parameters that drive/tailor the inference process, e.g. tta or whether we want GPUs. We are missing output-related settings. Could those go in an output-params section in your production.yaml file? And would things like docker_img or singularity_img belong in a runtime section, since they don't drive the inference process but merely provide a computing environment?
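A rough sketch of how those sections might be split in production.yaml. Only docker_img and singularity_img are named in this thread; every other key and path below is a hypothetical placeholder:

```yaml
# Hypothetical layout; key names other than docker_img/singularity_img
# are illustrative, not an agreed schema.
runtime:
  docker_img: null          # container image providing the computing environment
  singularity_img: null
output_params:
  raster_inference: /path/to/outputs/raster/
  vector_inference: /path/to/outputs/vector/
  logs: /path/to/logs/
```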

We used to call inference.py with two arguments: a file URL (multiband COG) and a model URL. Now I gather argument #1 will be an image STAC item URL instead. Since we don't have access to the actual filename without opening the STAC item, unless we agree on a "common root" convention, e.g. (img) image.tif -> (stac) image.json, it would be the caller's responsibility to specify the name(s) of the inference outputs as parameters at runtime.

To be easy on arguments, we could agree on another convention: the inference script appends [raster_inference, vector_inference, generalized_vector_inference] or some such to output filenames (before their file extensions, of course). The inference process should also append an ISO-8601 timestamp, to the nearest second in Coordinated Universal Time (UTC), just before the file extension to avoid name collisions. The caller would then pass the image "root name" together with the full path locations for the outputs (products and logs) to the inference script at runtime. How does that sound?
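The naming convention above could be sketched like this. The function name, the output directory, and the exact timestamp format (ISO-8601 basic, UTC, second precision) are assumptions for illustration:

```python
# Hypothetical helper: builds an output filename from a "root name",
# an output kind suffix, and a UTC ISO-8601 timestamp, per the
# convention proposed above.
from datetime import datetime, timezone
from pathlib import Path

def inference_output_name(root_name, kind, out_dir):
    """Insert the output kind and a UTC timestamp before the extension."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    p = Path(root_name)
    return Path(out_dir) / f"{p.stem}_{kind}_{stamp}{p.suffix}"

print(inference_output_name("image.tif", "raster_inference", "/outputs"))
# e.g. /outputs/image_raster_inference_20220302T210000Z.tif
```

Because the timestamp changes on every run, two inferences on the same image never collide.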

ymoisan avatar Mar 02 '22 21:03 ymoisan