cachecontrol
no way to set msgpack max_bin_len limits use of cache to small files
When trying to use cachecontrol with very large files (disk images in the case I'm considering), there's no easy way to pass a max_bin_len to msgpack.loads to say "yeah, I really do want to be able to load huge files".
cachecontrol will write the huge files, but then when it comes round to read them, msgpack will produce a ValueError and cachecontrol will return None to the deserialization routines.
It appears that the way to hack around it would be to subclass Serializer and override _loads_v4 so that it passes extra arguments (such as max_bin_len) to msgpack.loads.
Is there a better way? Is this something that you'd be interested in seeing as a kwarg passed down from CacheControl?
It appears that this got fixed somehow.
bob $ pip3 freeze | egrep -i 'requests|msgpack|cache'
CacheControl==0.12.6
msgpack==1.0.2
requests==2.25.1
alice $ (
printf 'HTTP/1.0 200 OK\n'
printf 'Date: '; LC_ALL=C date -u '+%a, %d %b %Y %X %Z'
printf 'Content-Length: 500000000\n'
printf 'Cache-Control: max-age=6000\n\n'
yes | dd iflag=count_bytes count=500MB
) | nc -l 8000
bob $ python3 -c '
import requests
import cachecontrol.caches
s = requests.session()
c = cachecontrol.caches.FileCache("./cache")
a = cachecontrol.CacheControlAdapter(c)
s.mount("http://", a)
print(len(s.get("http://localhost:8000/foo.txt").content))
'
500000000
bob $ python3 -c '
import requests
import cachecontrol.caches
s = requests.session()
c = cachecontrol.caches.FileCache("./cache")
a = cachecontrol.CacheControlAdapter(c)
s.mount("http://", a)
print(len(s.get("http://localhost:8000/foo.txt").content))
'
500000000
The second request is definitely served from cache because nc stops listening after the first client disconnects.
500000000 is enough to exceed the default max_bin_len:
bob $ MSGPACK_PUREPYTHON=1 python -c '
import msgpack, sys
with open(sys.argv[1], "rb") as f:
    f.read(5)  # skip the 5-byte "cc=4," version prefix
    u = msgpack.Unpacker(f)
    u.unpack()
' ./cache/5/c/a/8/b/5ca8b7d8184924c60c5c454a874bf5ed7b4741d0660cb7d295185d63
Traceback (most recent call last):
  File "<string>", line 6, in <module>
  File "/path/to/python3.9/site-packages/msgpack/fallback.py", line 723, in unpack
    ret = self._unpack(EX_CONSTRUCT)
  File "/path/to/python3.9/site-packages/msgpack/fallback.py", line 671, in _unpack
    ret[key] = self._unpack(EX_CONSTRUCT)
  File "/path/to/python3.9/site-packages/msgpack/fallback.py", line 671, in _unpack
    ret[key] = self._unpack(EX_CONSTRUCT)
  File "/path/to/python3.9/site-packages/msgpack/fallback.py", line 625, in _unpack
    typ, n, obj = self._read_header(execute)
  File "/path/to/python3.9/site-packages/msgpack/fallback.py", line 467, in _read_header
    raise ValueError("%s exceeds max_bin_len(%s)" % (n, self._max_bin_len))
ValueError: 500000000 exceeds max_bin_len(104857600)
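For reference, the limit in that message, 104857600, is 100 MiB. The following is a standalone sketch of the kind of length check that produces the error. It uses only the standard library and is not msgpack's actual code; `read_bin32_length` is a hypothetical helper. It assumes the body is stored as a MessagePack bin 32 value (marker byte 0xc6 followed by a 4-byte big-endian length).

```python
import struct

# Limit reported in the traceback: 104857600 bytes, i.e. 100 MiB.
DEFAULT_MAX_BIN_LEN = 100 * 1024 * 1024

# MessagePack "bin 32": one marker byte, then a 4-byte big-endian length.
BIN32_MARKER = 0xC6


def read_bin32_length(header, max_bin_len=DEFAULT_MAX_BIN_LEN):
    """Return the payload length declared by a bin 32 header, enforcing a cap."""
    marker, n = struct.unpack(">BI", header[:5])
    if marker != BIN32_MARKER:
        raise ValueError("not a bin 32 header: 0x%02x" % marker)
    if n > max_bin_len:
        raise ValueError("%s exceeds max_bin_len(%s)" % (n, max_bin_len))
    return n


# Header for a 500000000-byte body, as in the reproduction above.
header = struct.pack(">BI", BIN32_MARKER, 500_000_000)

try:
    read_bin32_length(header)
except ValueError as exc:
    message = str(exc)
    print(message)

# With the cap raised to the bin 32 maximum, the same header is accepted.
print(read_bin32_length(header, max_bin_len=2**32 - 1))
```

The check only looks at the declared length in the header, so it fails before any of the body is read.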
I've tried reproducing a variant of this as part of #336, but failed to. I'm going to close this out and track any follow-ups there. Thanks all!
(If anybody has a reproducer for this, it would be greatly appreciated.)