requests-html
Help me understand the return order of asession.run
from requests_html import AsyncHTMLSession
import functools

async def get_link(link):
    r = await asession.get(link)
    f = str(r) + link
    return f

asession = AsyncHTMLSession()
links = [
    'https://google.com',
    'https://yahoo.com',
    'https://python.org'
]
links = [functools.partial(get_link, link) for link in links]
print(links)
results = asession.run(*links)
print(results)
What I get is:
[functools.partial(<function get_link at 0x7fa59c438040>, 'https://google.com'), functools.partial(<function get_link at 0x7fa59c438040>, 'https://yahoo.com'), functools.partial(<function get_link at 0x7fa59c438040>, 'https://python.org')]
['<Response [200]>https://python.org', '<Response [200]>https://google.com', '<Response [200]>https://yahoo.com']
So why does asession.run return the list in the wrong order? Is there a way to get the results in the same order the requests were sent?
According to line 838 of requests_html.py, the return order should indeed match the call order. I don't think that is possible the way it is written. The whole point of asyncio is that a slow operation should not block a faster one, so the order of the results depends on which site replies first.
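To illustrate the difference between completion order and call order, here is a minimal standard-library sketch. It does not touch requests-html at all: `fake_fetch` and its delays are hypothetical stand-ins for real requests, chosen so that the last job finishes first.

```python
import asyncio

async def fake_fetch(name, delay):
    # Hypothetical stand-in for a network request; `delay` simulates
    # how long the server takes to reply.
    await asyncio.sleep(delay)
    return name

async def completion_order(jobs):
    # as_completed() yields awaitables in the order they finish,
    # so the fastest job comes out first.
    pending = [fake_fetch(name, delay) for name, delay in jobs]
    return [await fut for fut in asyncio.as_completed(pending)]

async def call_order(jobs):
    # gather() returns results in the order the awaitables were passed in,
    # regardless of which one finished first.
    return await asyncio.gather(*(fake_fetch(name, delay) for name, delay in jobs))

JOBS = [("google", 0.3), ("yahoo", 0.2), ("python", 0.1)]

by_completion = asyncio.run(completion_order(JOBS))
by_call = asyncio.run(call_order(JOBS))
print(by_completion)
print(by_call)
```

With these delays, the first list comes back fastest-first and the second matches the order of `JOBS`, even though both run the jobs concurrently.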
I was using this example to debug a problem of my own, where only one call would ever work and the second would always hang forever. It turned out to be a problem with an old install of Python 3.8 on Windows. Deleting everything and installing it all again fixed it.
Anyway, for completeness, I took your code and updated it to work around the issue you point out. I still think this is a bug, though, as the code should behave as its comments say.
import functools
from requests_html import AsyncHTMLSession

async def get_link(link, results):
    r = await asession.get(link)
    f = {link: r}
    results[link] = (r.url, r)
    #return f

asession = AsyncHTMLSession()
links = [
    'https://google.com',
    'https://yahoo.com',
    'https://python.org'
]
results = {}
links = [
    functools.partial(get_link, link, results) for link in links
]
print(links)
print(results)
asession.run(*links)
print(results)
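If a list in call order is preferred over a dict keyed by link, another option is to gather the coroutines yourself. This is only a sketch under assumptions: `run_in_order` is a hypothetical helper, not part of requests-html, and `fake_get` simulates a request so the example runs without network access. With a real session you would presumably pass the session's own event loop (I believe it is exposed as `asession.loop`, but check your installed version) and the same `functools.partial` objects you would hand to `asession.run`.

```python
import asyncio
import functools

def run_in_order(loop, *coros):
    """Run zero-argument coroutine functions on `loop`; return results in call order.

    Hypothetical helper, not part of requests-html.
    """
    async def _gather():
        # gather() preserves the order of its arguments in the result list.
        return await asyncio.gather(*(coro() for coro in coros))
    return loop.run_until_complete(_gather())

async def fake_get(link, delay):
    # Hypothetical stand-in for `await asession.get(link)`.
    await asyncio.sleep(delay)
    return link

jobs = [
    functools.partial(fake_get, 'https://google.com', 0.3),
    functools.partial(fake_get, 'https://yahoo.com', 0.2),
    functools.partial(fake_get, 'https://python.org', 0.1),
]

loop = asyncio.new_event_loop()
try:
    ordered = run_in_order(loop, *jobs)
finally:
    loop.close()
print(ordered)
```

The requests still run concurrently; only the result list is arranged in call order, so the slowest site no longer decides where each response ends up.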