aiohttp
Receiving a 404 response on a simple GET request that returns 200 using standard requests library
Describe the bug
I'm not sure what is happening here, but it seems that Twitter is somehow detecting and denying access to requests coming from aiohttp. Running a basic GET request using aiohttp returns a 404 page not found error, while running an identical request with the standard requests module produces the expected results.
To Reproduce
- Replace the url in the standard ClientSession example with "https://twitter.com", change the status assertion to 404 and run the following:
import aiohttp
import asyncio

async def fetch(client):
    async with client.get('https://twitter.com/') as resp:
        assert resp.status == 404
        return await resp.text()

async def main():
    async with aiohttp.ClientSession() as client:
        html = await fetch(client)
        print(html)

loop = asyncio.get_event_loop()
loop.run_until_complete(main())
- Run the same request using the standard requests module:
import requests
response = requests.get("https://twitter.com")
print(response.text)
Expected behavior
Both requests should result in a 200 status code, but aiohttp produces a 404 status.
Logs/tracebacks
N/A
Python Version
Python 3.9.6
aiohttp Version
3.8.1
multidict Version
6.0.2
yarl Version
1.7.2
OS
Windows 10
Related component
Client
Additional context
No response
Code of Conduct
- [X] I agree to follow the aio-libs Code of Conduct
I think I remember someone reporting this before. But, probably an issue for Twitter's support. Unless you can figure out what is causing this bizarre response from Twitter, there's not really anything we can do.
Interestingly, /tos and /privacy work fine, but seemingly none of the main Twitter pages do. I'm thinking that static pages are fine, but the application pages have some weird logic behind them.
Hmm, #4926 is the only issue I can find that might be related; maybe I misremembered.
Could be something bizarre, similar to https://github.com/aio-libs/aiohttp/issues/5643.
@sc0ned Try setting SSLKEYLOGFILE while capturing the traffic via tcpdump/Wireshark. Then follow https://hynek.me/til/tls-troubleshooting/#bonus-peeking-into-encrypted-tls-traffic / https://gitlab.com/wireshark/wireshark/-/wikis/TLS#tls-decryption. Finally, compare the HTTP requests both libs send.
If they are the same, maybe the problem is indeed on the transport level.
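For anyone who prefers to enable key logging from code rather than through the environment variable, here is a minimal sketch. It assumes Python 3.8+ built against OpenSSL 1.1.1 or newer (required for `keylog_filename`); the helper names `make_keylog_context` and `fetch_status` and the default log path are made up for illustration and are not part of aiohttp.

```python
import os
import ssl
import tempfile


def make_keylog_context(keylog_path=None):
    """Build a default client SSL context that writes TLS secrets to a file."""
    if keylog_path is None:
        # Hypothetical default location; any writable path works.
        keylog_path = os.path.join(tempfile.gettempdir(), "tls-keys.log")
    ctx = ssl.create_default_context()
    # Requires Python 3.8+ linked against OpenSSL 1.1.1 or newer.
    ctx.keylog_filename = keylog_path
    return ctx


async def fetch_status(url, ctx):
    """Fetch url with aiohttp using the given SSL context; return the status."""
    # Imported here so the helper above stays usable without aiohttp installed.
    import aiohttp

    async with aiohttp.ClientSession() as client:
        async with client.get(url, ssl=ctx) as resp:
            return resp.status
```

Running `asyncio.run(fetch_status("https://twitter.com/", make_keylog_context()))` while a capture is active should leave a key log file that Wireshark can load to decrypt the session, so the request aiohttp actually sent can be compared with the one from requests.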
If anyone encounters this, just use httpx.AsyncClient. The TLS behavior is the same as with requests, and I get a 20x status code as expected.
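A rough sketch of what that workaround might look like, mirroring the aiohttp reproduction above. The function names `fetch_status` and `main` are placeholders, and the status you get back may vary with the httpx version and its redirect settings.

```python
import asyncio


async def fetch_status(url):
    """Return the HTTP status code for a GET request made with httpx."""
    # Imported lazily so this module loads even where httpx is not installed.
    import httpx

    async with httpx.AsyncClient() as client:
        resp = await client.get(url)
        return resp.status_code


def main():
    # Same URL as in the original report.
    print(asyncio.run(fetch_status("https://twitter.com/")))
```

Calling `main()` prints the status code, which makes it easy to compare against the aiohttp result from the same machine.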