gs-quant
Add support for pagination or scrolling in `Dataset.get_data`
Describe the problem: When a request is apparently too large, the API returns a timeout error. It isn't clear beforehand which requests will be too large, and a timeout error on its own is not particularly helpful.
I opened a ticket with Marquee support asking for the best way to handle this or for a fix, but haven't heard back in a couple of days.
Describe the solution you'd like: If the API already supports pagination or scrolling, it could be exposed through the method. Then I could write a wrapper that iterates over chunks and combines the results.
Describe alternatives you've considered: I have some code that iterates over years, but that sometimes fails. I could use smaller date ranges, but that would be overkill for smaller requests. The biggest issue with the alternatives is that I don't want to chunk before I know whether a request will fail, because each extra call adds latency to my code.
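To illustrate the trade-off described above, here is a minimal sketch of one possible workaround: try the whole range in a single call, and only split it in half when that call times out. Everything in it is an assumption for illustration. `fetch_adaptive`, `fetch`, `is_timeout`, and `min_days` are hypothetical names, not part of gs-quant, and the exception that a timed-out request actually raises has to be identified by the caller through `is_timeout`.

```python
from datetime import date, timedelta
from typing import Callable, List, TypeVar

T = TypeVar("T")


def fetch_adaptive(
    fetch: Callable[[date, date], T],
    start: date,
    end: date,
    is_timeout: Callable[[Exception], bool] = lambda exc: isinstance(exc, TimeoutError),
    min_days: int = 1,
) -> List[T]:
    """Fetch the inclusive range [start, end], bisecting only after a timeout.

    `fetch` is a hypothetical caller-supplied function that requests one
    date range. `is_timeout` decides whether an exception means "too large";
    the default (built-in TimeoutError) is an assumption, not the library's
    documented behaviour.
    """
    try:
        # Fast path: small requests cost exactly one call.
        return [fetch(start, end)]
    except Exception as exc:
        span_days = (end - start).days + 1
        # Re-raise anything that is not a timeout, or a range too small to split.
        if not is_timeout(exc) or span_days <= min_days:
            raise

    # Split into two inclusive halves and fetch each one recursively.
    mid = start + timedelta(days=(end - start).days // 2)
    left = fetch_adaptive(fetch, start, mid, is_timeout, min_days)
    right = fetch_adaptive(fetch, mid + timedelta(days=1), end, is_timeout, min_days)
    return left + right
```

With gs-quant, `fetch` would presumably wrap a call such as `dataset.get_data(start, end)`, and the returned list of results could then be combined, for example with `pandas.concat`. The cost is that a range that is too large pays for one failed call per split level before it succeeds.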
Are you willing to contribute? Yes
Additional context: I can provide examples of requests that timed out if that's helpful, though running the examples might require access to our paid datasets.
Hello @theavey, I would like to contribute to this project by solving this issue. Can I?
I am not an admin of this repo, but that would be great. I've had to implement other workarounds, but a more "native" solution within the package would be helpful.
Hey @theavey and @Dhavin, we will look into this request. Currently, our Data APIs don't have a scroll/pagination API. If you are seeing timeouts for larger range queries, we currently recommend making smaller date/time range requests. These queries can be parallelized via threads for potentially significant speed improvements. We also have a utility class (https://github.com/goldmansachs/gs-quant/blob/967e8dd450b07e9e2b8fc0c9b2eec61916a5c179/gs_quant/api/utils.py#L46) that helps manage the threads, sessions, and contexts.
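The recommendation above (smaller date ranges, fetched in parallel threads) could look roughly like the following sketch. It uses only the Python standard library and does not use the gs-quant utility class linked above. `date_chunks`, `fetch_parallel`, `fetch`, `days`, and `max_workers` are hypothetical names chosen for illustration, and the sketch does not deal with how gs-quant sessions or contexts behave inside worker threads.

```python
from concurrent.futures import ThreadPoolExecutor
from datetime import date, timedelta
from typing import Callable, Iterator, List, Tuple, TypeVar

T = TypeVar("T")


def date_chunks(start: date, end: date, days: int) -> Iterator[Tuple[date, date]]:
    """Yield consecutive inclusive (chunk_start, chunk_end) pairs covering [start, end]."""
    if days < 1:
        raise ValueError("days must be at least 1")
    current = start
    while current <= end:
        chunk_end = min(current + timedelta(days=days - 1), end)
        yield current, chunk_end
        current = chunk_end + timedelta(days=1)


def fetch_parallel(
    fetch: Callable[[date, date], T],
    start: date,
    end: date,
    days: int = 90,
    max_workers: int = 4,
) -> List[T]:
    """Call `fetch` once per chunk on a thread pool; results are in chronological order.

    `fetch` is a hypothetical caller-supplied function that requests one
    date range. Any exception it raises propagates to the caller.
    """
    chunks = list(date_chunks(start, end, days))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map preserves input order, so no re-sorting is needed.
        return list(pool.map(lambda chunk: fetch(*chunk), chunks))
```

Here `fetch` would presumably wrap something like `dataset.get_data(start, end)`, with the per-chunk results combined afterwards. The chunk size and worker count are guesses to tune per dataset; a chunk that still times out would need a smaller `days` value.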