openeo-python-client

[EPIC] MultiBackendJobManager improvement

Open HansVRP opened this issue 11 months ago • 4 comments

In this EPIC we sketch out desired features for the MultiBackendJobManager (MBJM).

To make the MBJM more performant, downloading will be split off from status tracking. Apart from that, we will make downloading optional and investigate whether the download needs its own thread in case the download time is longer than the processing time.

Other requested features in terms of robustness include automatic retrying. Care needs to be taken not to rerun erroneous openEO jobs. As a start, we will automatically rerun jobs that failed during creation or starting.
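One possible shape for this split, as a minimal sketch rather than the planned implementation: the polling loop only hands finished job ids to a queue, and a separate worker thread does the downloads. The names `track_and_download`, `download_worker`, and the `download` callable are hypothetical and not part of the openEO client; the `"finished"` status string is assumed here.

```python
import queue
import threading


def download_worker(tasks, download, done):
    """Consume job ids from `tasks` until a None sentinel arrives."""
    while True:
        job_id = tasks.get()
        if job_id is None:
            break
        download(job_id)  # hypothetical callable that fetches one job's results
        done.append(job_id)


def track_and_download(statuses, download):
    """Poll statuses in the caller's thread, download in a worker thread.

    `statuses` is an iterable of (job_id, status) pairs, standing in for
    whatever the status-tracking loop observes.
    """
    tasks = queue.Queue()
    done = []
    worker = threading.Thread(target=download_worker, args=(tasks, download, done))
    worker.start()
    for job_id, status in statuses:
        if status == "finished":
            tasks.put(job_id)  # hand off; the tracking loop does not wait for the download
    tasks.put(None)  # sentinel: no more work
    worker.join()
    return done
```

With a single worker the downloads happen in the order the jobs finished; making the download optional would amount to not starting the worker at all.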
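The retry rule described here could be sketched as follows. This is an illustration under assumptions, not the MBJM implementation: the phase labels `"create"` and `"start"`, the helper names, and the `create_and_start` callable are all hypothetical.

```python
RETRYABLE_PHASES = {"create", "start"}  # hypothetical labels for the launch phases


def should_retry(failed_phase, attempts, max_attempts=3):
    """Retry only failures that happened before the job began processing."""
    return failed_phase in RETRYABLE_PHASES and attempts < max_attempts


def launch_with_retry(create_and_start, max_attempts=3):
    """Call `create_and_start` until it succeeds or the attempts run out.

    `create_and_start` is a hypothetical callable that creates and starts
    one job and raises on failure. Because it covers only creation and
    start, every exception it raises counts as a launch failure.
    """
    attempts = 0
    while True:
        attempts += 1
        try:
            return create_and_start()
        except Exception:
            if not should_retry("start", attempts, max_attempts):
                raise
```

A job that started successfully and later ended in an error state never passes through `launch_with_retry` again, which keeps erroneous jobs from being rerun.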

Desired features:

  • improve performance
  • make the download optional
  • automatic retry of jobs which failed during creation or start
  • [SPIKE] investigate more convenient folder structure for downloaded results

HansVRP, Jan 21 '25 18:01

@soxofaan

HansVRP, Jan 21 '25 18:01

Hi, I was wondering whether it is possible to assign specific names to jobs (and thus to output files) when using MultiBackendJobManager, instead of getting a generic "job_j-2508150854344c8a86f9c6de635cc887" or "openEO.nc". For example, I would like to match the ID of the feature being processed, for easier interpretation later on.

I only recently started using openEO, so I am still not sure whom to address the question to. Apologies and thanks!

milospandzic, Aug 15 '25 09:08

Hi @milospandzic, some file formats have the 'filename_prefix' format option: https://documentation.dataspace.copernicus.eu/APIs/openEO/File_formats.html That option lets you control output file naming. Would that help?

Format options can be backend-specific. The backend I am referring to has a forum for questions like this, in case you need more help: https://forum.dataspace.copernicus.eu/c/openeo/28

jdries, Aug 18 '25 06:08

Hi @jdries, thanks for answering, I'll check it out. I also found a workaround: I set the title when creating the job, with return result.create_job(title=f"Job_{row['id']}"), and then call connection.list_jobs(), which lists every job's title and id together. With that mapping I can easily rename everything as I like.

milospandzic, Aug 18 '25 11:08
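The renaming step of the workaround above might look roughly like this. It is a sketch under assumptions: the job metadata is taken to be a list of dicts with "id" and "title" keys (the shape connection.list_jobs() is expected to return), and the result folders are assumed to be named "job_<id>", as in the example name quoted earlier in the thread. The helper names `build_rename_plan` and `apply_rename_plan` are hypothetical.

```python
from pathlib import Path


def build_rename_plan(jobs, root):
    """Map each default job folder to a folder named after the job title.

    `jobs` is a list of metadata dicts with "id" and optionally "title".
    Jobs without a title are left out of the plan.
    """
    plan = {}
    for job in jobs:
        title = job.get("title")
        if not title:
            continue
        plan[Path(root) / f"job_{job['id']}"] = Path(root) / title
    return plan


def apply_rename_plan(plan):
    """Rename folders on disk, skipping missing sources and existing targets."""
    for src, dst in plan.items():
        if src.is_dir() and not dst.exists():
            src.rename(dst)
```

Building the plan separately from applying it makes it possible to inspect the mapping before anything on disk is touched, for example with `build_rename_plan(connection.list_jobs(), "results")`, where "results" is a placeholder for the manager's output directory.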