azure-sdk-for-python icon indicating copy to clipboard operation
azure-sdk-for-python copied to clipboard

Result of `query_entities` of `TableClient` cannot be returned in `with` keyword successfully

Open matt-mi opened this issue 2 years ago • 9 comments

  • Package Name: azure-data-tables
  • Package Version: 12.4.0
  • Operating System: macos
  • Python Version: 3.9.13

Describe the bug Result of query_entities of TableClient cannot be returned in with keyword

To Reproduce Steps to reproduce the behavior:

  1. create a record in table store. E.g., a row has value cluster4 under RowKey.
  2. Query the record from table store by following
def get_cluster():
    cluster_name_filter = "RowKey eq 'cluster4'"
    with TableClient(endpoint=sta_endpoint, table_name=table_name, credential=credential) as table_client:
        return table_client.query_entities(query_filter=cluster_name_filter)

results = get_cluster()
for i in results:
    print(i['RowKey'])
  1. Execute code snippet above and got exception below
(.venv) ➜  /cronjob/.venv/bin/python scheduler.py
Traceback (most recent call last):
  File "/cronjob/scheduler/scheduler.py", line 135, in <module>
    for i in results:
  File "/cronjob/.venv/lib/python3.9/site-packages/azure/core/paging.py", line 128, in __next__
    return next(self._page_iterator)
  File "/cronjob/.venv/lib/python3.9/site-packages/azure/core/paging.py", line 76, in __next__
    self._response = self._get_next(self.continuation_token)
  File "/cronjob/.venv/lib/python3.9/site-packages/azure/data/tables/_models.py", line 363, in _get_next_cb
    return self._command(
  File "/cronjob/.venv/lib/python3.9/site-packages/azure/data/tables/_generated/operations/_table_operations.py", line 380, in query_entities
    pipeline_response = self._client._pipeline.run(request, stream=False, **kwargs)
  File "/cronjob/.venv/lib/python3.9/site-packages/azure/core/pipeline/_base.py", line 211, in run
    return first_node.send(pipeline_request)  # type: ignore
  File "/cronjob/.venv/lib/python3.9/site-packages/azure/core/pipeline/_base.py", line 71, in send
    response = self.next.send(request)
  File "/cronjob/.venv/lib/python3.9/site-packages/azure/core/pipeline/_base.py", line 71, in send
    response = self.next.send(request)
  File "/cronjob/.venv/lib/python3.9/site-packages/azure/core/pipeline/_base.py", line 71, in send
    response = self.next.send(request)
  [Previous line repeated 3 more times]
  File "/cronjob/.venv/lib/python3.9/site-packages/azure/core/pipeline/policies/_redirect.py", line 158, in send
    response = self.next.send(request)
  File "/cronjob/.venv/lib/python3.9/site-packages/azure/core/pipeline/_base.py", line 71, in send
    response = self.next.send(request)
  File "/cronjob/.venv/lib/python3.9/site-packages/azure/data/tables/_policies.py", line 201, in send
    response = self.next.send(request)
  File "/cronjob/.venv/lib/python3.9/site-packages/azure/core/pipeline/_base.py", line 71, in send
    response = self.next.send(request)
  File "/cronjob/.venv/lib/python3.9/site-packages/azure/core/pipeline/_base.py", line 71, in send
    response = self.next.send(request)
  File "/cronjob/.venv/lib/python3.9/site-packages/azure/core/pipeline/_base.py", line 71, in send
    response = self.next.send(request)
  [Previous line repeated 1 more time]
  File "/cronjob/.venv/lib/python3.9/site-packages/azure/core/pipeline/_base.py", line 103, in send
    self._sender.send(request.http_request, **request.context.options),
  File "/cronjob/.venv/lib/python3.9/site-packages/azure/core/pipeline/transport/_requests_basic.py", line 327, in send
    response = self.session.request(  # type: ignore
AttributeError: 'NoneType' object has no attribute 'request'

Expected behavior Results can be returned successfully from with keyword

Screenshots If applicable, add screenshots to help explain your problem.

Additional context If I'm not using with to automatically create and close TableClient, results can be returned and lopped successfully.

matt-mi avatar Aug 10 '22 03:08 matt-mi

Label prediction was below confidence level 0.6 for Model:ServiceLabels: 'Tables:0.4404969,Storage:0.094060935,Azure.Core:0.06135202'

azure-sdk avatar Aug 10 '22 03:08 azure-sdk

Thanks @matt-mi! This is an interesting scenario :)

The context manager of the TableClient will close the connection - though you are correct that this will leave the auto-iterating query response with a severed connection that will fail when it needs to collect the next page of results. On the one hand, this is the expected behaviour. On the other hand - that is a pretty awful stacktrace/error message, so we should look at fixing that!

There are two ways you could resolve this. The first and easiest is to simply use list() to fully iterate the query before exiting the TableClient context:

def get_cluster():
    cluster_name_filter = "RowKey eq 'cluster4'"
    with TableClient(endpoint=sta_endpoint, table_name=table_name, credential=credential) as table_client:
        return list(table_client.query_entities(query_filter=cluster_name_filter))

results = get_cluster()
for i in results:
    print(i['RowKey'])

However the above solution will lose the incremental nature of the query, and if there's a lot of data to return this could be very slow. So the second option is to turn the entire function into a generator that will yield the results and therefore both iteratively download the data, but also not close the connection until the iteration is complete:

def get_cluster():
    cluster_name_filter = "RowKey eq 'cluster4'"
    with TableClient(endpoint=sta_endpoint, table_name=table_name, credential=credential) as table_client:
        for result in table_client.query_entities(query_filter=cluster_name_filter):
            yield result

results = get_cluster()
for i in results:
    print(i['RowKey'])

Hopefully one of these two options will fix your current error, and we will look at improving the failure experience for when the connection is closed on an orphaned iterator.

@iscai-msft, @xiangyan99 - do you know if we have tests for this scenario in the azure-core tests for ItemPaged?

annatisch avatar Aug 10 '22 04:08 annatisch

Thanks @annatisch for your prompt response on this!

There's one thing I'm kind of confused about, which is for example if I was trying to manually create and close the TableClient connection, it's also working ..

def get_cluster():
    cluster_name_filter = "RowKey eq 'cluster4'"
    table_client = TableClient(endpoint=sta_endpoint, table_name=table_name, credential=credential)
    results = table_client.query_entities(query_filter=cluster_name_filter)
    table_client.close()
    return results

for i in get_cluster():
    print(i['RowKey'])

I'm not quite sure what's the difference and forgive me about my limited knowledge in python ...

matt-mi avatar Aug 10 '22 04:08 matt-mi

Thanks @matt-mi - that is.... very interesting indeed. I've repro-ed the same behaviour and having a dig around now.

annatisch avatar Aug 10 '22 04:08 annatisch

Ah - I have discovered the reason - and there's possibly something we could look at fixing as well... though might need more discussion.

In this case, using the context manager - i.e. the with syntax, you are pre-emptively opening the connection pool, even though no request has yet been made. And in fact, this get_cluster function is not making a single request to the service, which is not done until we attempt to iterate on the results. In other words, when opened in a context manager, that connection is then severed with the return statement.

However in the second example, simply creating the TableClient does not open the connection - this is done lazily on the first request. So the client.close() that's being called, isn't actually closing anything, because it was never opened. This call to close is being completely disregarded when the first actual service call is made outside of the function, and the client is treating this as being opened for the first time. Therefore - at the end of the script, the connection pool is not being cleaned up and the connection remains open.

@xiangyan99 - it looks like we have two questions for core here:

  1. We should improve the error message when we attempt to use a transport that has been closed.
  2. When we call close on a client that hasn't yet been opened - that should probably prevent it from ever being opened so that cases like this don't result in uncleaned up resources. Though we might need to look into whether this could break anyone.

annatisch avatar Aug 10 '22 05:08 annatisch

Thank you for your detailed explanation Anna! I thought the connection was already initiated when a TableClient spawned, turns out when the item is being iterated then connection is initiated ...

So in this case if I don't want to use with keyword then I would need to close the client (table_client.close()) after everything is iterated right?

matt-mi avatar Aug 10 '22 06:08 matt-mi

Yes @matt-mi - that is correct, so the following (adding list() - which forces the full iteration up-front) would close the connection:

def get_cluster():
    cluster_name_filter = "RowKey eq 'cluster4'"
    table_client = TableClient(endpoint=sta_endpoint, table_name=table_name, credential=credential)
    results = list(table_client.query_entities(query_filter=cluster_name_filter))
    table_client.close()
    return results

for i in get_cluster():
    print(i['RowKey'])

Edit: Though I should add that using the with syntax is still preferable - as it means the connection will be closed even in the case of a failure (for example, an intermittent network failure, or, say, a typo in the table name resulting in a ResourceNotFound error etc).

annatisch avatar Aug 10 '22 06:08 annatisch

Awesome, I will try to adapt with syntax to my code. Thanks again @annatisch, you save my day!

matt-mi avatar Aug 10 '22 06:08 matt-mi

You're very welcome @matt-mi! And thank you so much for reporting this!

annatisch avatar Aug 10 '22 06:08 annatisch

I found this issue after I got confused by your documentation.

  • your toplevel readme doesn't use with contexts in the examples.
  • your examples use the with statements.

As a user, I now don't know which way would be correct (with with or without with), and whether there actually is some stuff that needs to be cleaned up - or if the exiting the context has no effect in general. Also, it would be nice to know if above changes when switching from sync to async.

Having to use with statements or not can have quite an impact on the design of our application.

IMHO: If there is stuff that needs to be cleaned up, then one should have to use the with statement - otherwise users will experience weird bugs, unsubmitted requests or whatever that are hard to debug. I would "simply" prevent usage of the clients without a context. Same as with open its just really bad practice to not use a context for that. - On the other hand this will probably break the code of thousands of users xD

martinitus avatar Jul 19 '23 14:07 martinitus