
Client Side Processing - ease the "get data" initial part

Open • clausmichele opened this issue 1 year ago • 3 comments

After the meeting we had about the work done so far for Client Side Processing, one suggested improvement was to ease the way a user would get the data from a back-end.

A possibility was to use load_result (or load_stac) to load a pre-computed job result:

  • in this case, calling .execute() should automatically download the result and proceed with the computation locally. Does that make sense? How do you see the integration of this into the client?
  • otherwise, we could think about a utility function wrapping load_collection - download.

@soxofaan let me know what you think about this topic

clausmichele, Mar 30 '23 09:03

Using load_result/load_stac from a localprocessing connection makes sense, I think. However, with that approach the downloading of data (and where it is stored) will be automatic and consequently somewhat hidden from the user, and I suspect that a user of the localprocessing feature actually wants to inspect the data first before continuing with a localprocessing workflow.

Sketching out a user workflow that could make sense:

from openeo.local import LocalCube

# Download job results (preferably with caching to avoid unnecessary re-downloads)
# to a local folder that must be specified 
cube = LocalCube.from_stac(
    url="https://openeo.example/foo/result/bar",
    local="local/path/for/downloads",
)

# User can inspect downloaded data here, e.g. in QGIS

# Further processing
cube.reduce_dimension(...)
...
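
The "caching to avoid unnecessary re-downloads" idea in the sketch above could look roughly like the following. This is only an illustration using the standard library: `download_cached` and its `fetch` parameter are hypothetical names, not part of openeo-python-client, and the file name is naively taken from the last URL path segment (so two URLs ending in the same name would collide).

```python
import tempfile
import urllib.request
from pathlib import Path
from urllib.parse import urlparse


def download_cached(url, folder, fetch=urllib.request.urlretrieve):
    """Download `url` into `folder`, unless a file with that name is already there.

    Hypothetical helper: `fetch(url, target)` does the actual transfer and
    can be swapped out (e.g. for an authenticated download or in tests).
    """
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    # Local file name: last path segment of the URL (query string is ignored).
    name = Path(urlparse(url).path).name or "download"
    target = folder / name
    if not target.exists():
        fetch(url, target)
    return target


# Offline demonstration with a stand-in for the real download.
calls = []


def fake_fetch(url, target):
    calls.append(url)
    Path(target).write_bytes(b"dummy")


with tempfile.TemporaryDirectory() as tmp:
    url = "https://openeo.example/foo/result/bar.tif?sig=abc"
    first = download_cached(url, tmp, fetch=fake_fetch)
    second = download_cached(url, tmp, fetch=fake_fetch)  # served from the local folder
    same_path = first == second
    first_name = first.name
```

Because the files end up in a folder the user chose explicitly, they stay available for inspection (e.g. in QGIS) before any further local processing.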

soxofaan, Mar 31 '23 08:03

Hi @soxofaan, I have a couple of questions:

  • Should the from_stac call be converted into an openEO process graph with load_stac + save_result and executed? Or should we just download the data from the provided link? The second option would also allow downloading data without a connection to an openEO back-end.

  • A job result from VITO looks like this: [image]

and I guess it's the rendered version of what we would receive doing a GET request on: https://openeo.vito.be/openeo/1.0/jobs/vito-j-f70f03eae96e4fc2ae68f9324d50bc08/results right? So is this the kind of URL that we should provide in the method call? How to handle the authentication part? I would need your help for this.

clausmichele, Apr 05 '23 10:04

> The from_stac call should be converted into an openEO process graph with load_stac + save_result and executed? Or should we just download the data from the provided link?

I meant just locally downloading the data directly from the (STAC) url. No need for an additional process graph I think.
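
To illustrate what "downloading directly from the (STAC) url" involves on the metadata side, here is a minimal sketch that collects the asset URLs from a STAC-like document. It assumes the simple case where the document has a top-level "assets" mapping; `asset_hrefs` and the example document are made up for illustration, and results that only link to separate items are not handled.

```python
def asset_hrefs(stac_metadata):
    """Map asset key -> download URL for a STAC document with top-level "assets"."""
    assets = stac_metadata.get("assets", {})
    return {key: asset["href"] for key, asset in assets.items() if "href" in asset}


# Made-up metadata, shaped like a minimal STAC Item.
example = {
    "type": "Feature",
    "stac_version": "1.0.0",
    "id": "bar",
    "assets": {
        "openEO.tif": {
            "href": "https://openeo.example/foo/result/bar/openEO.tif",
            "type": "image/tiff; application=geotiff",
        },
    },
    "links": [],
}

hrefs = asset_hrefs(example)
```

Each URL in `hrefs` can then be fetched into the local folder, with no process graph involved.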

> I guess it's the rendered version of what we would receive doing a GET request on: https://openeo.vito.be/openeo/1.0/jobs/vito-j-f70f03eae96e4fc2ae68f9324d50bc08/results right? So is this the kind of URL that we should provide in the method call? How to handle the authentication part?

Indeed, load_result/load_stac expect a metadata URL like https:/..openeo.../jobs/j-123. However, GET'ting this URL or the listed assets usually requires authentication headers. This is not necessarily a problem if it's your own batch job: you can GET everything with an authenticated Connection. Otherwise (e.g. it's not your own batch job), the workaround is to use signed URLs, which are listed as the "canonical" link: [screenshot]
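
As a rough sketch of the signed-URL workaround: once the job result metadata is available as a dict, the "canonical" link can be looked up in its "links" list. The helper name `canonical_link` and the example URLs (including the signature-like path segment) are hypothetical; only the `rel == "canonical"` lookup reflects what is described above.

```python
def canonical_link(metadata):
    """Return the href of the first link with rel "canonical", or None if absent."""
    for link in metadata.get("links", []):
        if link.get("rel") == "canonical":
            return link.get("href")
    return None


# Made-up job result metadata; the signed URL is a placeholder.
job_results = {
    "links": [
        {"rel": "self", "href": "https://openeo.example/jobs/j-123/results"},
        {"rel": "canonical", "href": "https://openeo.example/jobs/j-123/results/SIGNATURE"},
    ]
}

signed_url = canonical_link(job_results)
missing = canonical_link({"links": [{"rel": "self", "href": "https://openeo.example/x"}]})
```

If `canonical_link` returns None, the metadata and assets would presumably have to be fetched with authentication headers instead.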

soxofaan, Apr 11 '23 14:04