tornado
tornado copied to clipboard
Try to skip getaddrinfo if "host" is already an IP
TCPClient always calls resolver.resolve(
even if "host"
is already an IP
https://github.com/tornadoweb/tornado/blob/00c9e0ae31a5a0d12e09109fb77ffe391bfe1131/tornado/tcpclient.py#L260-L265
asyncio has a nice implementation that skips the getaddrinfo call https://github.com/python/cpython/blob/b5527688aae11d0b5af58176267a9943576e71e5/Lib/asyncio/base_events.py#L1369-L1380 in this case
Supporting code is dozens of lines, which seems like a lot. How much difference does it make? I guess it must be substantial since this got merged into the standard library, but I'm surprised. I wonder if this could be moved from its current location into loop.getaddrinfo
itself?
@bdarnell here's the associated PR: https://github.com/python/asyncio/pull/302 I think it's substantial as a redundant trip around run_in_executor
is pretty slow
Thanks for that link. My question was really about teasing out where exactly the slowness is - is it about using any await
at all, or run_in_executor
, or is it about the performance difference between getaddrinfo
and inet_pton
? From the linked python-tulip@ thread, it looks like the original motivation was about holding a lock in run_in_executor, but it was later found that inet_pton
is indeed much faster than getaddrinfo
in this case (at least on 2015 versions of macOS) that the change is worthwhile.
I asked this question because depending on the answer, it seems like the optimization could be built in to asyncio.loop.getaddrinfo itself instead of being duplicated in Tornado. And indeed, python/asyncio#302 did this. However, it was reversed in python/asyncio#357: inet_pton
only works for AF_INET
and AF_INET6
sockets (and not, for example, AF_BLUETOOTH
). It appears that the conclusion here is that loop.getaddrinfo
needs to support arbitrary address families, but loop.create_connection
is free to assume AF_INET{,6}
, so the inet_pton
optimization is now only used by the latter.
So by the same logic, this optimization should exist in TCPClient and not in Resolver (just as you have it in #3116). The amount of code duplication is unfortunate but I doubt its worth trying to get a public version of _ensure_resolved
/_ipaddr_info
added at this point. I wonder, though, whether another simplification might be possible now that the getaddrinfo
lock has been removed on all (?) major platforms:
async def getaddrinfo(host, port, ..., flags):
try:
return socket.getaddrinfo(host, port, ..., flags | socket.AI_NUMERICHOST)
except SomeSocketException:
return self.run_in_executor(self.getaddrinfo, host, port, ..., flags)
That is, run getaddrinfo once with AI_NUMERICHOST without an executor for the case in which the requested host is already resolved, and only if that fails run regular getaddrinfo in an executor. I'm sure this gets a little more complicated than my 5-line sketch but should still be simpler than completely duplicating _ipaddr_info
. Maybe I'm being optimistic about the state of the various resolver libraries, though.
@ajdavis Anything to add?
It’s been a long time since I did the asyncio patch. I remember thinking at the time it was a surprising amount of code duplication but the Python core team decided it was acceptable. Yes, the getaddrinfo lock is gone on major platforms now, that might or might not affect your decisions here. 😉
Thanks for that link. My question was really about teasing out where exactly the slowness is - is it about using any
await
at all, orrun_in_executor
, or is it about the performance difference betweengetaddrinfo
andinet_pton
? From the linked python-tulip@ thread, it looks like the original motivation was about holding a lock in run_in_executor, but it was later found thatinet_pton
is indeed much faster thangetaddrinfo
in this case (at least on 2015 versions of macOS) that the change is worthwhile.I asked this question because depending on the answer, it seems like the optimization could be built in to asyncio.loop.getaddrinfo itself instead of being duplicated in Tornado. And indeed, python/asyncio#302 did this. However, it was reversed in python/asyncio#357:
inet_pton
only works forAF_INET
andAF_INET6
sockets (and not, for example,AF_BLUETOOTH
). It appears that the conclusion here is thatloop.getaddrinfo
needs to support arbitrary address families, butloop.create_connection
is free to assumeAF_INET{,6}
, so theinet_pton
optimization is now only used by the latter.So by the same logic, this optimization should exist in TCPClient and not in Resolver (just as you have it in #3116). The amount of code duplication is unfortunate but I doubt its worth trying to get a public version of
_ensure_resolved
/_ipaddr_info
added at this point. I wonder, though, whether another simplification might be possible now that thegetaddrinfo
lock has been removed on all (?) major platforms:async def getaddrinfo(host, port, ..., flags): try: return socket.getaddrinfo(host, port, ..., flags | socket.AI_NUMERICHOST) except SomeSocketException: return self.run_in_executor(self.getaddrinfo, host, port, ..., flags)
That is, run getaddrinfo once with AI_NUMERICHOST without an executor for the case in which the requested host is already resolved, and only if that fails run regular getaddrinfo in an executor. I'm sure this gets a little more complicated than my 5-line sketch but should still be simpler than completely duplicating
_ipaddr_info
. Maybe I'm being optimistic about the state of the various resolver libraries, though.@ajdavis Anything to add?
fyi the trio implementation uses this AI_NUMERICHOST trick also: https://github.com/python-trio/trio/blob/4edfd41bd5519a2e626e87f6c6ca9fb32b90a6f4/trio/_socket.py#L125-L192
It’s been a long time since I did the asyncio patch. I remember thinking at the time it was a surprising amount of code duplication but the Python core team decided it was acceptable. Yes, the getaddrinfo lock is gone on major platforms now, that might or might not affect your decisions here. wink
I've made a PR to port the AI_NUMERICHOST | AI_NUMERICSERV shortcut from trio to asyncio here: https://github.com/python/cpython/pull/31497
OK, so if that cpython patch is accepted then it'll be solved in getaddrinfo and we don't have to do anything on the tornado side. (we could do a similar patch in the tornado resolver to get the benefit sooner but then we'd also have to make sure that there's no locking around getaddrinfo on older python versions that tornado still supports)
But I'm not sure that patch should be accepted - Jesse's benchmarking found that getaddrinfo with AI_NUMERICHOST was much slower than inet_pton on macos in 2015. Unless that's changed, the trio patch would be a performance regression, so it might be better to leave it as is. That would mean we'd still need something like #3116 to get this performance benefit for tornado.
Given the amount of subtleties we've uncovered here and the lack of concrete benchmarks showing improvements in Tornado functionality, I'm inclined to keep things as they are and not worry about making this optimization.
@bdarnell I did some benchmarks against a fresh Macos runner and it seems the issues in getaddrinfo are fixed https://github.com/python/cpython/issues/90980#issuecomment-1133141676
OK, so given those performance results I'm in favor of the AI_NUMERICHOST|AI_NUMERICSERV
technique. Ideally python/cpython#31497 would be merged so that could go into asyncio itself, but I'd also accept a similar patch to Tornado in the meantime. I'm going to close #3116 because I don't think we want to duplicate that version of the _ipaddr_info
function here.