trino-python-client icon indicating copy to clipboard operation
trino-python-client copied to clipboard

Add support for polling.

Open matthewwardrop opened this issue 4 years ago • 9 comments

This PR was originally put up against the original prestodb version of this library, which now seems to be defunct; and adds support for polling the status of presto queries, decoupling it from collection of results (fixing #63 on that repository). This is important especially for long-running queries.

Changes:

  • Removed PrestoResult and merged its functionality with PrestoQuery. Keeping both led to some weird logic loops as each tried to keep the other up to date once polling was added.
  • Made PrestoQuery.fetch a private method PrestoQuery._fetch, since calling it directly leads to the iterator not being in sync.
  • Made the PrestoQuery object cache the query results, otherwise there is some weird behaviour whereby if you stop iterating over the rows, you throw away all of the rows left over in the last retrieved chunk.
  • Switched the fetchone / fetchmany relationship so that fetchone is a special case of fetchmany (because the changes above allow multiple iterators to be created over time, so long as there is only one at time, and in this way we don't create a new generator for every row).

matthewwardrop avatar Mar 31 '20 18:03 matthewwardrop

Thank you for your pull request and welcome to our community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. In order for us to review and merge your code, please submit the signed CLA to [email protected]. For more information, see https://github.com/prestosql/cla.

cla-bot[bot] avatar Mar 31 '20 18:03 cla-bot[bot]

Thank you for your pull request and welcome to our community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. In order for us to review and merge your code, please submit the signed CLA to [email protected]. For more information, see https://github.com/prestosql/cla.

cla-bot[bot] avatar Apr 01 '20 16:04 cla-bot[bot]

@ebyhr There was a trivial oversight in keeping track of progress when requesting one row at a time. I've pushed the fix, and signed the CLA.

matthewwardrop avatar Apr 01 '20 17:04 matthewwardrop

@cla-bot check

martint avatar Apr 01 '20 22:04 martint

The cla-bot has been summoned, and re-checked this pull request!

cla-bot[bot] avatar Apr 01 '20 22:04 cla-bot[bot]

@ebyhr @martint Any updates here?

matthewwardrop avatar May 23 '20 17:05 matthewwardrop

We're running into the same issue with long pulling queries.

Any chance we could see this being merged in a near future?

mathieuseguin avatar Jul 09 '20 13:07 mathieuseguin

@matthewwardrop Are you planning to work on this soon?

amitds1997 avatar Apr 15 '21 15:04 amitds1997

@amitds1997 I'm not sure. My original PRs to the presto-python-client and then this one have "enjoyed" very slow review times, making it challenging to make much headway (by the time a review occurs, I've not thought much about the patch for many months). @findepi's comments were valid ones, but as far as I can tell I already answered these concerns in the original PR description. I'm happy to restage the PR as a series of commits if that is helpful, but note that it won't be entirely straightforward: these changes hang together, and each standalone commit won't make a tonne of sense without the context of the others. I'm also happy if someone else closer to the project wants to reimplement things in a different way, but after a review of this patch I still feel this is a pretty clean implementation. It does make several "unnecessary" changes, but only in the sense that you although you could achieve polling without the other changes, the extra changes prevent weird epicycles in the code and the risk of losing data depending on the order in which you call methods on the class instances.

I originally wrote this patch so I could migrate to this more official client from pyhive, since I thought things would wind down there.... but things on that side have continued kicking along, and their Presto client already features this polling behaviour, so I haven't been too motivated to push hard on this.

matthewwardrop avatar Apr 18 '21 03:04 matthewwardrop