requests icon indicating copy to clipboard operation
requests copied to clipboard

Endless history: the history contains a self-reference

Open CachingFoX opened this issue 2 years ago • 6 comments
trafficstars

The history of a requests contains a self-reference to the history owner. The history will be endless.

Expected Result

If I traverse recursive the complete history of a requests, this will be a finally graph. The history of a request is a tree without cycles.

R1 (history: 2)
     R2 (no history)
     R3 (no history)

Actual Result

If I traverse recursive the complete history of a requests, the program breaks with recursive error. RecursionError: maximum recursion depth exceeded while calling a Python object The history contains a self-reference to the history owner.

The history of a request is a graph with a cycle.

R1 (history: 2)
     R2 (no history)
          R3 (history: 1)
               R3 (history: 1)
                    R3 (history: 1)
                         R3 (history: 1)
                              R3 (history: 1)
                                   ....
id=140537834271072 history=2
	index=0 id=140537834079136
	index=1 id=140537834080960
id=140537834079136 history=0
id=140537834080960 history=1
	index=0 id=140537834080960
id=140537834080960 history=1
	index=0 id=140537834080960
id=140537834080960 history=1
	index=0 id=140537834080960
id=140537834080960 history=1
	index=0 id=140537834080960

....


id=140537834080960 history=1
	index=0 id=140537834080960
id=140537834080960 history=1
	index=0 id=140537834080960
Traceback (most recent call last):
  File "/Users/andreas/PycharmProjects/cce/main.py", line 12, in <module>
    history(requests.get('https://coord.info/GC8T8E8'))
  File "/Users/andreas/PycharmProjects/cce/main.py", line 9, in history
    history(item)
  File "/Users/andreas/PycharmProjects/cce/main.py", line 9, in history
    history(item)
  File "/Users/andreas/PycharmProjects/cce/main.py", line 9, in history
    history(item)
  [Previous line repeated 993 more times]
  File "/Users/andreas/PycharmProjects/cce/main.py", line 5, in history
    print(f"id={id(r)} history={len(r.history)}")
RecursionError: maximum recursion depth exceeded while calling a Python object

Process finished with exit code 1

Reproduction Steps

import requests


def history(r):
    print(f"id={id(r)} history={len(r.history)}")
    for index, item in enumerate(r.history):
        print(f"\tindex={index} id={id(item)}")
    for index, item in enumerate(r.history):
        history(item)


history(requests.get('https://coord.info/GC8T8E8'))


System Information

{
  "chardet": {
    "version": null
  },
  "charset_normalizer": {
    "version": "2.1.1"
  },
  "cryptography": {
    "version": ""
  },
  "idna": {
    "version": "3.4"
  },
  "implementation": {
    "name": "CPython",
    "version": "3.9.2"
  },
  "platform": {
    "release": "21.6.0",
    "system": "Darwin"
  },
  "pyOpenSSL": {
    "openssl_version": "",
    "version": null
  },
  "requests": {
    "version": "2.28.1"
  },
  "system_ssl": {
    "version": "1010109f"
  },
  "urllib3": {
    "version": "1.26.13"
  },
  "using_charset_normalizer": true,
  "using_pyopenssl": false
}

CachingFoX avatar Nov 26 '22 06:11 CachingFoX

We're aware, see https://github.com/psf/requests/issues/2690 which is documented in the README.

sethmlarson avatar Feb 10 '23 21:02 sethmlarson

I'm confused. I do not talk about the GIT history. I talked about the history of an request. Please, can you reopen my issue?

CachingFoX avatar Feb 11 '23 02:02 CachingFoX

Sorry about that!

sethmlarson avatar Feb 11 '23 21:02 sethmlarson

You don't need to traverse it recursively. Each response should have the entire history of the prior response.

In other words,

  • You get r3 back which has a history that looks like [r2, r1].
  • r2 has a history that looks like [r1], etc.

You only need to look at this history of the one you're seeing as the final response. It will contain the whole history.

sigmavirus24 avatar Feb 12 '23 13:02 sigmavirus24

So, this is a consequence of someone years ago arguing that each history response should have it's own history despite the linearity of the history and the fact that the top-level response has the entire context.

What's more helpful (in my view) for looking at this problem is:

>>> r = requests.get('https://httpbin.org/redirect/3')
>>> for resp in r.history:
...   print(id(resp), end='->'); print([(id(_), len(_.history)) for _ in resp.history])
... 
140058437748768->[]
140058437750976->[(140058437750976, 1)]
140058437751840->[(140058437750976, 1), (140058437751840, 2)]

Note that the 2nd response (which should have been the redirect request followed from the first item in the history) points to itself, not in fact to the first response.

While this is wrong, and can be fixed, it hasn't been noticed in years, because folks that care about the history are just looking at the top-level history (which contains everything) and don't actually need to look at each response's history.

I'd be in favor of just wiping those histories altogether instead of trying to make this look more correct. Either way, the place to fix this is in https://github.com/psf/requests/blob/15585909c3dd3014e4083961c8a404709450151c/requests/sessions.py#L182-L183

Namely, the first time through, we are called with the initial response that triggered a redirect, so we're doing

hist.append(resp)
resp.history = hist[1:]

This is correct because that 0th item in the final response's history has no other redirects.

We could also, write this two other ways:

resp.history = hist[:]
hist.append(resp)

This will create a new list based off of the existing history we know about, so that initial request will still be [], then the rest would look like:

... 140058437748768->[] 140058437750976->[(140058437748768, 0)] 140058437751840->[(140058437748768, 0), (140058437750976, 1)]


Alternatively, we could do

```py
hist.append(resp)
resp.history = hist[:-1]

Either way should make this work, but I think then, it has downstream affects on https://github.com/psf/requests/blob/15585909c3dd3014e4083961c8a404709450151c/requests/sessions.py#L190 which we can easily change to

if len(hist) >= self.max_redirects:

sigmavirus24 avatar Feb 18 '23 14:02 sigmavirus24

should I work on this?

rritik772 avatar Jul 04 '24 12:07 rritik772