pyvo icon indicating copy to clipboard operation
pyvo copied to clipboard

expose timeout parameter in highest level search interfaces

Open RayPlante opened this issue 11 years ago • 10 comments

The low level query execution methods in the *Query classes support a timeout parameter which will kill a query that takes longer than a certain time. However, this control is not exposed at the higher-level and more often-used query interfaces. This parameter should be supported all the way up.

RayPlante avatar Jul 02 '13 15:07 RayPlante

(I dont know if here is the best place for this kind of question...) Hi Ray. I have to deal with the timeout length in my code, and since it is listed here as an issue, probably I could contribute with a solution. So far I am querying conesearch service servers; in pyvo the code for the searching goes down to the query.DALQuery.execute_stream() method, where it than uses the urllib2 package. I implemented a very simple solution for this particular case which gives the user an interface at the query.py level. What I did is commited in the fork a have: https://github.com/chbrandt/pyvo (970dde3b01e1518093c9ac2c62623b169c451d8d). I would like to know if this is a viable solution for pyvo? If it is not, what would be a better solution? []

chbrandt avatar Jun 10 '14 12:06 chbrandt

(I dont know if here is the best place for this kind of question...)

(Perfect place.)

I implemented a very simple solution for this particular case which gives the user an interface at the query.py level. What I did is commited in the fork a have: https://github.com/chbrandt/pyvo (970dde3).

I do like this, particularly if you think this works well for your use case.

My original notion was to provide a timeout argument to the search functions; however, I think your global approach would probably be the more commonly useful. (That is, it would be rarer to want to set the timeout on a per-service basis, but not inconceivable.) Both approaches can certainly co-exist, and I suggest we do that for integrating getdataset() and cachedataset(): if timeout is not provided as an argument, the value provided by the global parameter applies.

Go ahead and do a pull request, either as-is or with changes to getdataset() and cachedataset(). We should also document in get/setparam() what parameter names are supported and what they do.

I need to do a mental sticky to think about the phase of development for tighter integration with Astropy, and whether/how we should engage the Astropy configuration framework.

RayPlante avatar Jun 10 '14 13:06 RayPlante

@chbrandt (Hey--I just connected that we met at ESAC. thanks for the code!)

RayPlante avatar Jun 10 '14 13:06 RayPlante

Indeed! My pleasure to contribute! I just submitted the pull request. I added the modification to cachedataset() (the global timeout is an alternative) and put a few words on the documentation to point out the ('timeout') parameters accepted. I added a long comment on the request trying to be clear regarding the implementations and doubts that remain. []

chbrandt avatar Jun 11 '14 10:06 chbrandt

My idea was to expose the **kwargs from the requests library in the higher level interfaces, but this interferes with the additional query parameters in some of the interfaces (e.g. SIA)

funbaker avatar Feb 14 '17 14:02 funbaker

@funbaker I saw there is a timeout parameter at query.py::672 where eventually every request will pass through, afaiu. I couldn't figure out a way to give such a value when (for instance) calling the regsearch function.

chbrandt avatar Dec 08 '17 12:12 chbrandt

Adding to the discussion

I found time_decorator to be extremelly handy here. Since I am interested in setting a general timeout for the whole "get" process (accessing a server plus downloading its data), I solved my issue by decorating the function where I call pyvo (search).

I first tried to set a global timeout for requests, but that is not yet possible. They have that issue open so far, with milestone for version 3.0.

chbrandt avatar Dec 08 '17 14:12 chbrandt

@chbrandt getdataset is for downloading e.g. FITS files from a single result row.

Astropy uses a parameter + config variable for http requests. https://astropy.readthedocs.io/en/stable/api/astropy.utils.data.get_pkg_data_filename.html#astropy.utils.data.get_pkg_data_filename

remote_timeout should be sufficient enough not to interfere with possible VO parameters.

funbaker avatar Dec 08 '17 19:12 funbaker

Thanks @funbaker. But I'm a bit lost now... Taking the regsearch function as an use case: where do I give remote_timeout (in this case) to pyvo?

chbrandt avatar Dec 08 '17 20:12 chbrandt

Not yet implemented. Excuse me if this was unclear.

funbaker avatar Dec 08 '17 20:12 funbaker