Receiving a 404 response on a simple GET request that returns 200 using standard requests library

Open sc0ned opened this issue 2 years ago • 4 comments

Describe the bug

I'm not sure what is happening here, but it seems that Twitter is somehow detecting and denying access to requests coming from aiohttp. Running a basic GET request using aiohttp returns a 404 page not found error, while running an identical request with the standard requests module produces the expected results.

To Reproduce

  1. Replace the url in the standard ClientSession example with "https://twitter.com", change the status assertion to 404 and run the following:
import aiohttp
import asyncio

async def fetch(client):
    async with client.get('https://twitter.com/') as resp:
        assert resp.status == 404
        return await resp.text()

async def main():
    async with aiohttp.ClientSession() as client:
        html = await fetch(client)
        print(html)

loop = asyncio.get_event_loop()
loop.run_until_complete(main())
  2. Run the same request using the standard requests module:
import requests
response = requests.get("https://twitter.com")
print(response.text)

Expected behavior

Both requests should result in a 200 status code, but aiohttp produces a 404 status.

Logs/tracebacks

N/A

Python Version

Python 3.9.6

aiohttp Version

3.8.1

multidict Version

6.0.2

yarl Version

1.7.2

OS

Windows 10

Related component

Client

Additional context

No response

Code of Conduct

  • [X] I agree to follow the aio-libs Code of Conduct

sc0ned, Aug 19 '22 01:08

I think I remember someone reporting this before. But this is probably an issue for Twitter's support. Unless you can figure out what is causing this bizarre response from Twitter, there's not really anything we can do.

Interestingly, /tos and /privacy work fine, but seemingly not any of the main Twitter pages. I'm thinking that static pages are fine, but application pages have some weird logic on them.

Dreamsorcerer, Aug 19 '22 18:08

Hmm, #4926 is the only issue I can find that might be related; maybe I misremembered.

Dreamsorcerer, Aug 19 '22 19:08

Could be something bizarre, similar to https://github.com/aio-libs/aiohttp/issues/5643...

webknjaz, Aug 22 '22 01:08

@sc0ned Try setting SSLKEYLOGFILE while capturing the traffic via tcpdump/wireshark. Then follow https://hynek.me/til/tls-troubleshooting/#bonus-peeking-into-encrypted-tls-traffic / https://gitlab.com/wireshark/wireshark/-/wikis/TLS#tls-decryption. Finally, compare the HTTP requests both libs send. If they are the same, maybe the problem is indeed on the transport level.
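
For readers following the suggestion above, here is a minimal sketch of the two halves of that workflow, using only the standard library. It is not code from this thread. The helper names `make_keylog_context` and `diff_headers` are hypothetical, and the header values in the demo are made-up placeholders, not captured traffic. The first helper builds an SSL context that writes TLS secrets to a key log file (an explicit alternative to setting the SSLKEYLOGFILE environment variable). The second compares two sets of request headers case-insensitively once you have read them out of the decrypted capture.

```python
import os
import ssl
import tempfile


def make_keylog_context(keylog_path):
    """Return a default client SSL context that logs TLS secrets to keylog_path.

    Wireshark can read this file to decrypt a capture of the same session.
    Requires Python 3.8+ built against an OpenSSL version with key log support.
    """
    ctx = ssl.create_default_context()
    ctx.keylog_filename = keylog_path
    return ctx


def diff_headers(a, b):
    """Return {header_name: (value_in_a, value_in_b)} for headers that differ.

    Header names are compared case-insensitively; a missing header is None.
    """
    a_norm = {k.lower(): v for k, v in a.items()}
    b_norm = {k.lower(): v for k, v in b.items()}
    diff = {}
    for name in sorted(a_norm.keys() | b_norm.keys()):
        if a_norm.get(name) != b_norm.get(name):
            diff[name] = (a_norm.get(name), b_norm.get(name))
    return diff


if __name__ == "__main__":
    # Hypothetical location for the key log file.
    path = os.path.join(tempfile.mkdtemp(), "tls-keys.log")
    ctx = make_keylog_context(path)
    print("TLS secrets will be appended to:", ctx.keylog_filename)

    # Placeholder header sets; substitute what the decrypted capture shows.
    headers_a = {"Accept": "*/*", "User-Agent": "client-a/0.0"}
    headers_b = {"accept": "*/*", "User-Agent": "client-b/0.0",
                 "Connection": "keep-alive"}
    for name, (va, vb) in diff_headers(headers_a, headers_b).items():
        print(f"{name}: {va!r} vs {vb!r}")
```

With aiohttp, a context like this can be passed per request, for example `client.get(url, ssl=ctx)`. If the header diff comes back empty, that would point toward the transport-level explanation mentioned above.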

webknjaz, Aug 22 '22 01:08

If anyone encounters this, just use httpx.AsyncClient. Its TLS behavior is the same as with requests, and I get a 20x status code as expected.

dvdblk, Sep 15 '23 21:09