requests-cache
Segfault when accessing sqlite cache from many threads
The problem
Our application started crashing with a segfault ( :scream: ) after we upgraded requests_cache from 0.9.8 to a newer version (>=1.0.0). Our application uses the "patching" approach (requests_cache.install_cache). The crash seems to be caused by accessing the cache from multiple threads (see reproducer below).
I've bisected the issue to this commit: https://github.com/requests-cache/requests-cache/commit/c83e4e523008337a6f8f638e65ca217177d8ed5c Its commit message is telling:
Share SQLite connection objects among threads and use lock for write operations instead of using thread-local connections
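To make the two strategies named in that commit message concrete, here is a minimal stdlib-only sketch. It is not requests-cache's actual code: the class names `ThreadLocalStore` and `SharedStore` and the helper `hammer` are made up for illustration, and the shared variant locks reads as well as writes, which is a simplification of the write-only lock the commit describes.

```python
import os
import sqlite3
import tempfile
import threading

SCHEMA = "CREATE TABLE IF NOT EXISTS kv (key TEXT PRIMARY KEY, value TEXT)"


class ThreadLocalStore:
    """Hypothetical store: one sqlite3 connection per thread."""

    def __init__(self, path):
        self.path = path
        self._local = threading.local()
        with self._connection() as con:
            con.execute(SCHEMA)

    def _connection(self):
        # Each thread lazily opens, then reuses, its own connection.
        if not hasattr(self._local, "con"):
            self._local.con = sqlite3.connect(self.path, timeout=30)
        return self._local.con

    def set(self, key, value):
        with self._connection() as con:  # commits on success
            con.execute("INSERT OR REPLACE INTO kv VALUES (?, ?)", (key, value))

    def get(self, key):
        cur = self._connection().execute("SELECT value FROM kv WHERE key=?", (key,))
        row = cur.fetchone()
        return row[0] if row else None


class SharedStore:
    """Hypothetical store: one shared connection, serialized with a lock."""

    def __init__(self, path):
        self._lock = threading.Lock()
        # check_same_thread=False only disables the ownership check;
        # it does not make concurrent use safe, hence the lock.
        self._con = sqlite3.connect(path, check_same_thread=False)
        with self._lock, self._con:
            self._con.execute(SCHEMA)

    def set(self, key, value):
        with self._lock, self._con:
            self._con.execute("INSERT OR REPLACE INTO kv VALUES (?, ?)", (key, value))

    def get(self, key):
        with self._lock:
            row = self._con.execute("SELECT value FROM kv WHERE key=?", (key,)).fetchone()
        return row[0] if row else None

    def close(self):
        with self._lock:
            self._con.close()


def hammer(store, n_threads=4, n_writes=10):
    """Write from several threads, then count how many rows can be read back."""
    def worker(tid):
        for i in range(n_writes):
            store.set(f"{tid}-{i}", "x")

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(store.get(f"{t}-{i}") == "x" for t in range(n_threads) for i in range(n_writes))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(hammer(ThreadLocalStore(os.path.join(tmp, "local.db"))))
        shared = SharedStore(os.path.join(tmp, "shared.db"))
        print(hammer(shared))
        shared.close()
```

With a shared connection, any code path that touches the connection without taking the lock (or closes it while another thread is mid-query) can race, which is the kind of failure the traces below point at.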
Running python with gdb gave us a clue.
File "/home/hollas/atmospec/aiidalab-home/segfault.py", line 3, in <module>
AiidaLabAppStore()
File "/home/hollas/atmospec/aiidalab-home/home/app_store.py", line 76, in __init__
AiidaLabApp(app_id, None, None) for app_id in self.index["apps"]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/hollas/atmospec/aiidalab-home/.venv/lib64/python3.12/site-packages/aiidalab/app.py", line 754, in __init__
[Detaching after vfork from child process 865150]
self._app = _AiidaLabApp.from_id(
^^^^^^^^^^^^^^^^^^^^^
File "/home/hollas/atmospec/aiidalab-home/.venv/lib64/python3.12/site-packages/aiidalab/app.py", line 130, in from_id
remote_registry_entry = load_app_registry_entry(app_id)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/hollas/atmospec/aiidalab-home/.venv/lib64/python3.12/site-packages/aiidalab/utils.py", line 70, in load_app_registry_entry
return requests.get(f"{AIIDALAB_REGISTRY}/apps/{app_id}.json").json()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/hollas/atmospec/aiidalab-home/.venv/lib64/python3.12/site-packages/requests/api.py", line 73, in get
return request("get", url, params=params, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/hollas/atmospec/aiidalab-home/.venv/lib64/python3.12/site-packages/requests/api.py", line 59, in request
return session.request(method=method, url=url, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/hollas/atmospec/aiidalab-home/.venv/lib64/python3.12/site-packages/requests_cache/session.py", line 183, in request
return super().request(method, url, *args, headers=headers, **kwargs) # type: ignore
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/hollas/atmospec/aiidalab-home/.venv/lib64/python3.12/site-packages/requests/sessions.py", line 589, in request
resp = self.send(prep, **send_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/hollas/atmospec/aiidalab-home/.venv/lib64/python3.12/site-packages/requests_cache/session.py", line 218, in send
cached_response = self.cache.get_response(actions.cache_key)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/hollas/atmospec/aiidalab-home/.venv/lib64/python3.12/site-packages/requests_cache/backends/base.py", line 77, in get_response
response = self.responses.get(key)
^^^^^^^^^^^^^^^^^^^^^^^
File "<frozen _collections_abc>", line 807, in get
File "/home/hollas/atmospec/aiidalab-home/.venv/lib64/python3.12/site-packages/requests_cache/backends/sqlite.py", line 313, in __getitem__
row = cur.fetchone()
^^^^^^^^^^^^^^
sqlite3.ProgrammingError: Cannot operate on a closed database.
Other similar stack traces are sometimes produced, as documented in https://github.com/aiidalab/aiidalab/issues/446 where I did the original investigation.
Expected behavior
The cache should be threadsafe and not lead to a crash.
Steps to reproduce the behavior
Minimal reproducer
from threading import Thread
import requests
import requests_cache
requests_cache.install_cache(
    cache_name="segfault_reproducer",
    use_cache_dir=True,  # store cache in ~/.cache/
    backend="sqlite",
    expire_after=60,  # seconds
)

def load_app_registry_index():
    try:
        return requests.get("https://aiidalab.github.io/aiidalab-registry/api/v1/apps_index.json").json()
    except (ValueError, requests.ConnectionError) as error:
        raise RuntimeError("Unable to load registry index") from error

num_threads = 50
threads = []
for _ in range(num_threads):
    t = Thread(target=load_app_registry_index)
    t.start()
    threads.append(t)
Workarounds
Using the CachedSession approach seems to fix the issue (as does using the memory backend instead of sqlite).
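For reference, a minimal sketch of that workaround: one explicitly created `CachedSession` shared by all worker threads, instead of patching `requests` globally. The helper names `make_session` and `fetch_all` are made up for illustration, the session arguments simply mirror the reproducer above, and this is not a guarantee of thread safety on every requests-cache version.

```python
from concurrent.futures import ThreadPoolExecutor

REGISTRY_INDEX = "https://aiidalab.github.io/aiidalab-registry/api/v1/apps_index.json"


def make_session():
    # Imported lazily so fetch_all() below can be exercised with any
    # requests-compatible session object.
    import requests_cache

    return requests_cache.CachedSession(
        cache_name="segfault_reproducer",
        use_cache_dir=True,  # store cache in ~/.cache/
        backend="sqlite",
        expire_after=60,  # seconds
    )


def fetch_all(session, urls, max_workers=8):
    """Fetch every URL through one shared session; return JSON bodies in input order."""
    def fetch(url):
        response = session.get(url)
        response.raise_for_status()
        return response.json()

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch, urls))
```

The reproducer's 50 parallel requests would then become `fetch_all(make_session(), [REGISTRY_INDEX] * 50)`, with the session created once, before any thread starts.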
Environment
- requests-cache version: reproduced since 1.0.0 and on latest main
- Python version: 3.9, 3.12
- Platform: Linux (Fedora 39 and Ubuntu 20.04)
Thanks for the detailed report and reproducible example! Unfortunately, install_cache() is just not compatible with multithreading, and there are warnings about it most places it's mentioned in the docs. Are you able to use CachedSession in your application instead?
The main problems are:
- Race condition with patching/unpatching requests.Session
- The static requests API (requests.get(), etc.) internally creates a new session object for every request. That doesn't behave well when attaching a cache to the session, since it's constantly opening and closing connections to the storage backend.
- The "Cannot operate on a closed database" error in your traceback is one of the errors you can expect to see in this case. "database is locked" is another one.
- Even if it doesn't raise an exception, patching + multithreading is likely to fail in other ways, like a very low cache hit rate. While the specific points of failure have changed, that's the case for all versions of requests-cache going back to the first release!
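The "closed database" failure mode can be illustrated with the standard library alone. This is a deliberately simplified sketch, not requests-cache's actual code path: it assumes a single `sqlite3` connection shared between threads (opened with `check_same_thread=False`), where one thread closes the connection, as a short-lived session might on exit, while another thread still holds a reference to it. The function names are made up.

```python
import sqlite3
import threading

# One connection shared across threads, for illustration only.
con = sqlite3.connect(":memory:", check_same_thread=False)
con.execute("CREATE TABLE responses (key TEXT PRIMARY KEY, value BLOB)")

errors = []


def short_lived_session():
    # Stands in for a per-request session that closes its backend when done.
    con.close()


def reader():
    # Stands in for another thread that still holds the same connection.
    try:
        con.execute("SELECT value FROM responses WHERE key=?", ("abc",)).fetchone()
    except sqlite3.ProgrammingError as exc:
        errors.append(str(exc))


closer = threading.Thread(target=short_lived_session)
closer.start()
closer.join()  # run sequentially so the outcome is deterministic

reading = threading.Thread(target=reader)
reading.start()
reading.join()

print(errors)
```

In a real race the close and the query overlap instead of running back to back, so the outcome can differ from run to run.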
Related: #870, #813, #514, #213, #135
Hi,
thanks for taking a look! I think if that's the case the docs could make this point stronger, specifically
and there are warnings
To quote:
There are some scenarios where patching requests with install_cache() is not ideal:
"Not ideal" to me does not equal segfault. :-)
places it's mentioned in the docs.
Here the docs only warn that "These functions are not thread-safe." But that to me implies that the reproducer is valid, since requests_cache.patcher.install_cache is not called from multiple threads; only the subsequent requests.get calls are.
Are you able to use CachedSession in your application instead?
Yep, no problem, we were aiming to do that anyway.
That's fair. I'll try to clarify that in the docs some more.
does not equal segfault
Your traceback looks like a regular python exception from the sqlite3 standard library module. A segmentation fault would be invalid memory access in the underlying SQLite C library, which is much less fun to debug!
Your traceback looks like a regular python exception from the sqlite3 standard library module. A segmentation fault would be invalid memory access in the underlying SQLite C library, which is much less fun to debug!
That's the thing though, most of the time I am getting an actual segfault; please try running the reproducer yourself several times.
(Not saying this is necessarily a problem with this library; discovering a segfault in the stdlib would certainly be more exciting!)
Hmm, I ran it a few more times in the debugger and managed to catch the segfault. It looks like a null pointer dereference.
Not sure if this is enough to report to CPython.
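One low-effort way to get more information out of such a crash without attaching gdb is the standard library's `faulthandler` module, which dumps the Python-level traceback of every thread when the process receives a fatal signal such as SIGSEGV. A minimal sketch, to be placed at the top of the reproducer (the log file name is an arbitrary choice):

```python
import faulthandler
import os
import tempfile

# Keep the file object alive for the lifetime of the process:
# faulthandler writes to its file descriptor at crash time.
log_path = os.path.join(tempfile.gettempdir(), "segfault_reproducer_faulthandler.log")
crash_log = open(log_path, "w")
faulthandler.enable(file=crash_log, all_threads=True)

# ... run the threaded reproducer here ...
```

Running the script with `python -X faulthandler` enables the same handler from startup, writing to stderr instead of a file.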
Ah! You're right, now I see it. Looks like that's happening in sqlite3_last_insert_rowid().
Full trace:
#0 0x00007ffff4a68824 in sqlite3_last_insert_rowid () from /lib/x86_64-linux-gnu/libsqlite3.so.0
#1 0x00007ffff4b44bcf in _pysqlite_query_execute (self=self@entry=0x7ffff49b2a40, multiple=multiple@entry=0, operation=operation@entry=0x7ffff4c3af10, second_argument=<optimized out>, second_argument@entry=0x7ffff4924e20)
at /tmp/python-build.20221123142401.1913/Python-3.11.0/Modules/_sqlite/cursor.c:968
#2 0x00007ffff4b3dde0 in pysqlite_connection_execute_impl (parameters=0x7ffff4924e20, sql=<optimized out>, self=0x7ffff4c17790) at /tmp/python-build.20221123142401.1913/Python-3.11.0/Modules/_sqlite/connection.c:1679
#3 pysqlite_connection_execute (self=0x7ffff4c17790, args=0x7ffff40477b0, nargs=<optimized out>) at /tmp/python-build.20221123142401.1913/Python-3.11.0/Modules/_sqlite/clinic/connection.c.h:709
#4 0x0000555555647fb6 in _PyEval_EvalFrameDefault (tstate=<optimized out>, frame=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:5326
#5 0x00005555557b55e1 in _PyEval_EvalFrame (throwflag=0, frame=0x7ffff4047728, tstate=0x5555561672f0) at ./Include/internal/pycore_ceval.h:73
#6 _PyEval_Vector (tstate=0x5555561672f0, func=<optimized out>, locals=<optimized out>, args=<optimized out>, argcount=<optimized out>, kwnames=<optimized out>) at Python/ceval.c:6428
#7 0x0000555555726f9c in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=<optimized out>, args=<optimized out>, callable=0x7ffff4c4a3e0, tstate=0x5555561672f0) at ./Include/internal/pycore_call.h:92
#8 vectorcall_unbound (nargs=2, args=<optimized out>, func=0x7ffff4c4a3e0, unbound=<optimized out>, tstate=0x5555561672f0) at Objects/typeobject.c:1650
#9 vectorcall_method (nargs=2, args=0x7fff4e7fb050, name=0x555555ac6b08 <_PyRuntime+26024>) at Objects/typeobject.c:1681
#10 slot_mp_subscript (self=<optimized out>, arg1=<optimized out>) at Objects/typeobject.c:7402
#11 0x000055555564a8e1 in _PyEval_EvalFrameDefault (tstate=<optimized out>, frame=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:2144
#12 0x00005555557b55e1 in _PyEval_EvalFrame (throwflag=0, frame=0x7ffff4047550, tstate=0x5555561672f0) at ./Include/internal/pycore_ceval.h:73
#13 _PyEval_Vector (tstate=0x5555561672f0, func=<optimized out>, locals=<optimized out>, args=<optimized out>, argcount=<optimized out>, kwnames=<optimized out>) at Python/ceval.c:6428
#14 0x00005555556af1e8 in _PyObject_VectorcallTstate (kwnames=0x7ffff4981900, nargsf=2, args=0x7fffec7e9c30, callable=0x7ffff48d84a0, tstate=0x5555561672f0) at ./Include/internal/pycore_call.h:92
#15 method_vectorcall (method=<optimized out>, args=0x7fffec7e9c38, nargsf=<optimized out>, kwnames=0x7ffff4981900) at Objects/classobject.c:59
#16 0x00005555556ac0f1 in PyObject_Call () at Objects/call.c:349
#17 0x0000555555648199 in do_call_core (use_tracing=<optimized out>, kwdict=0x7fffec7e9b80, callargs=0x7ffff49252d0, func=0x7fffec7e96c0, tstate=<optimized out>) at Python/ceval.c:7350
#18 _PyEval_EvalFrameDefault (tstate=<optimized out>, frame=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:5377
#19 0x00005555557b55e1 in _PyEval_EvalFrame (throwflag=0, frame=0x7ffff40473f8, tstate=0x5555561672f0) at ./Include/internal/pycore_ceval.h:73
#20 _PyEval_Vector (tstate=0x5555561672f0, func=<optimized out>, locals=<optimized out>, args=<optimized out>, argcount=<optimized out>, kwnames=<optimized out>) at Python/ceval.c:6428
#21 0x00005555556af1e8 in _PyObject_VectorcallTstate (kwnames=0x7fffec7e8180, nargsf=3, args=0x7fffec74fe10, callable=0x7ffff68c6d40, tstate=0x5555561672f0) at ./Include/internal/pycore_call.h:92
#22 method_vectorcall (method=<optimized out>, args=0x7fffec74fe18, nargsf=<optimized out>, kwnames=0x7fffec7e8180) at Objects/classobject.c:59
#23 0x00005555556ac0f1 in PyObject_Call () at Objects/call.c:349
#24 0x0000555555648199 in do_call_core (use_tracing=<optimized out>, kwdict=0x7ffff4940b00, callargs=0x7fffec7dfdc0, func=0x7fffec7e95c0, tstate=<optimized out>) at Python/ceval.c:7350
#25 _PyEval_EvalFrameDefault (tstate=<optimized out>, frame=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:5377
#26 0x00005555557b55e1 in _PyEval_EvalFrame (throwflag=0, frame=0x7ffff4047320, tstate=0x5555561672f0) at ./Include/internal/pycore_ceval.h:73
#27 _PyEval_Vector (tstate=0x5555561672f0, func=<optimized out>, locals=<optimized out>, args=<optimized out>, argcount=<optimized out>, kwnames=<optimized out>) at Python/ceval.c:6428
#28 0x00005555556af1e8 in _PyObject_VectorcallTstate (kwnames=0x7ffff49a6c80, nargsf=1, args=0x7fffec7c2450, callable=0x7ffff48d8400, tstate=0x5555561672f0) at ./Include/internal/pycore_call.h:92
#29 method_vectorcall (method=<optimized out>, args=0x7fffec7c2458, nargsf=<optimized out>, kwnames=0x7ffff49a6c80) at Objects/classobject.c:59
#30 0x00005555556ac0f1 in PyObject_Call () at Objects/call.c:349
#31 0x0000555555648199 in do_call_core (use_tracing=<optimized out>, kwdict=0x7fffec772fc0, callargs=0x555555aceb78 <_PyRuntime+58904>, func=0x7fffec7e89c0, tstate=<optimized out>) at Python/ceval.c:7350
#32 _PyEval_EvalFrameDefault (tstate=<optimized out>, frame=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:5377
#33 0x00005555557b55e1 in _PyEval_EvalFrame (throwflag=0, frame=0x7ffff4047280, tstate=0x5555561672f0) at ./Include/internal/pycore_ceval.h:73
#34 _PyEval_Vector (tstate=0x5555561672f0, func=<optimized out>, locals=<optimized out>, args=<optimized out>, argcount=<optimized out>, kwnames=<optimized out>) at Python/ceval.c:6428
#35 0x00005555556ac0f1 in PyObject_Call () at Objects/call.c:349
#36 0x0000555555648199 in do_call_core (use_tracing=<optimized out>, kwdict=0x7ffff49a7080, callargs=0x7fffec7e8ac0, func=0x7ffff6881940, tstate=<optimized out>) at Python/ceval.c:7350
#37 _PyEval_EvalFrameDefault (tstate=<optimized out>, frame=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:5377
#38 0x00005555557b55e1 in _PyEval_EvalFrame (throwflag=0, frame=0x7ffff4047188, tstate=0x5555561672f0) at ./Include/internal/pycore_ceval.h:73
#39 _PyEval_Vector (tstate=0x5555561672f0, func=<optimized out>, locals=<optimized out>, args=<optimized out>, argcount=<optimized out>, kwnames=<optimized out>) at Python/ceval.c:6428
#40 0x0000555555648199 in do_call_core (use_tracing=<optimized out>, kwdict=0x7ffff49cf2c0, callargs=0x555555aceb78 <_PyRuntime+58904>, func=0x7ffff48d9580, tstate=<optimized out>) at Python/ceval.c:7350
#41 _PyEval_EvalFrameDefault (tstate=<optimized out>, frame=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:5377
#42 0x00005555557b55e1 in _PyEval_EvalFrame (throwflag=0, frame=0x7ffff4047020, tstate=0x5555561672f0) at ./Include/internal/pycore_ceval.h:73
#43 _PyEval_Vector (tstate=0x5555561672f0, func=<optimized out>, locals=<optimized out>, args=<optimized out>, argcount=<optimized out>, kwnames=<optimized out>) at Python/ceval.c:6428
#44 0x00005555556af284 in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=1, args=0x7fff4e7fbda8, callable=0x7ffff7808860, tstate=0x5555561672f0) at ./Include/internal/pycore_call.h:92
#45 method_vectorcall (method=<optimized out>, args=0x555555aceb90 <_PyRuntime+58928>, nargsf=<optimized out>, kwnames=0x0) at Objects/classobject.c:67
#46 0x0000555555887a77 in thread_run (boot_raw=0x7fffec74fde0) at ./Modules/_threadmodule.c:1082
#47 0x0000555555816c9b in pythread_wrapper (arg=<optimized out>) at Python/thread_pthread.h:241
#48 0x00007ffff7d32ac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#49 0x00007ffff7dc4850 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
That's certainly a new and exciting way for things to break. There's likely not much I can do about that particular case, but if you run into similar issues with multithreading + CachedSession, let me know and I can dig into it some more.
Attempting to use CachedSession from ThreadPoolExecutor (~360 concurrent requests, but only 120 parallel ones due to the number of logical processors on this machine), I also hit multi-threading problems (where pinning to 0.9.8 made the failures go away).
The first symptom encountered was the following error printed to stdout without triggering an exception:
Unable to deserialize response: a bytes-like object is required, not 'NoneType'
The subsequent terminal failures were then of the form:
Traceback (most recent call last):
File "/home/acoghlan/devel/free-threaded-wheels/generate.py", line 28, in <module>
main(args.number)
File "/home/acoghlan/devel/free-threaded-wheels/generate.py", line 12, in main
packages = get_annotated_packages(get_top_packages(), to_chart)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/acoghlan/devel/free-threaded-wheels/utils.py", line 118, in get_annotated_packages
package = scan_future.result()
^^^^^^^^^^^^^^^^^^^^
File "/usr/lib64/python3.12/concurrent/futures/_base.py", line 449, in result
return self.__get_result()
^^^^^^^^^^^^^^^^^^^
File "/usr/lib64/python3.12/concurrent/futures/_base.py", line 401, in __get_result
raise self._exception
File "/usr/lib64/python3.12/concurrent/futures/thread.py", line 58, in run
result = self.fn(*self.args, **self.kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/acoghlan/devel/free-threaded-wheels/utils.py", line 97, in scan_package
annotate_package(package)
File "/home/acoghlan/devel/free-threaded-wheels/utils.py", line 36, in annotate_package
response = SESSION.get(url)
^^^^^^^^^^^^^^^^
File "/home/acoghlan/devel/free-threaded-wheels/.venv/lib64/python3.12/site-packages/requests_cache/session.py", line 127, in get
return self.request('GET', url, params=params, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/acoghlan/devel/free-threaded-wheels/.venv/lib64/python3.12/site-packages/requests_cache/session.py", line 183, in request
return super().request(method, url, *args, headers=headers, **kwargs) # type: ignore
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/acoghlan/devel/free-threaded-wheels/.venv/lib64/python3.12/site-packages/requests/sessions.py", line 589, in request
resp = self.send(prep, **send_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/acoghlan/devel/free-threaded-wheels/.venv/lib64/python3.12/site-packages/requests_cache/session.py", line 218, in send
cached_response = self.cache.get_response(actions.cache_key)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/acoghlan/devel/free-threaded-wheels/.venv/lib64/python3.12/site-packages/requests_cache/backends/base.py", line 77, in get_response
response = self.responses.get(key)
^^^^^^^^^^^^^^^^^^^^^^^
File "<frozen _collections_abc>", line 807, in get
File "/home/acoghlan/devel/free-threaded-wheels/.venv/lib64/python3.12/site-packages/requests_cache/backends/sqlite.py", line 312, in __getitem__
cur = con.execute(f'SELECT value FROM {self.table_name} WHERE key=?', (key,))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
sqlite3.InterfaceError: bad parameter or other API misuse
While we ended up not using it for other reasons, https://github.com/hugovk/free-threaded-wheels/pull/16/ has the code where I ran into this. The relevant snippet:
SESSION = requests_cache.CachedSession("requests-cache", expire_after=60 * 60)

def annotate_package(package):
    # Uses SESSION.get to query PyPI for more info about the package
    ...

def scan_package(package):
    annotate_package(package)
    return package

def get_annotated_packages(packages, limit):
    annotated_packages = []
    packages = list(omit_deprecated(packages))
    with ThreadPoolExecutor() as pool:
        submitted_scans = []
        # Need to scan at least "limit" packages
        for package in packages[:limit]:
            submitted_scans.append(pool.submit(scan_package, package))
        for index, scan_future in enumerate(submitted_scans):
            package = scan_future.result()
            # ... package processing code isn't relevant here ...
(Disabling the time based expiry didn't make any difference to the failures)
Thanks for the details.
Unable to deserialize response: a bytes-like object is required, not 'NoneType'
In most cases this is something you can ignore. This usually happens after upgrading (or downgrading) requests-cache. If the serialization format changes, or if a previously cached response otherwise can't be deserialized, it will log the error and just fetch a new response. However, if this happened with a new cache and only with concurrency, that could be a different issue.
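The fallback behaviour described here can be sketched roughly as follows. This is a hypothetical illustration using `pickle` and a plain dict, not requests-cache's implementation; `get_or_fetch` and its parameters are made-up names.

```python
import logging
import pickle

logger = logging.getLogger(__name__)


def get_or_fetch(cache, key, fetch):
    """Return the cached value for key, refetching if it is missing or unreadable."""
    raw = cache.get(key)
    if raw is not None:
        try:
            return pickle.loads(raw)
        except Exception as exc:  # stale format, truncated blob, etc.
            # Log and fall through to a fresh fetch instead of failing.
            logger.warning("Unable to deserialize response: %s", exc)
    value = fetch()
    cache[key] = pickle.dumps(value)
    return value
```

A corrupted entry is therefore replaced on the next request, and later lookups for the same key are served from the cache again without calling `fetch`.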
sqlite3.InterfaceError: bad parameter or other API misuse
This is a bug that popped up specifically with Python 3.12 and concurrent usage of the SQLite backend: #845. I've made a few attempts at debugging it, but so far this one has me stumped! I will take another look at this either this weekend or next week, but any help would be greatly appreciated.
Thank you for pointing out that version 0.9.8 works. I have downgraded to this version in order to have my flask-based gunicorn app working.
I can confirm that this problem still exists with requests-cache 1.2.1. I see it in an application that makes a ton of rapid fire API calls and goes through a simple thread pool and a requests-cache session (no patching):
uv run main.py
app does stuff, some requests succeed, then no output for a while and heavy disk activity
File "main.py", line 199, in <module>
main()
~~~~^^
File "main.py", line 118, in main
detail = fut.result()
File "/usr/lib/python3.13/concurrent/futures/_base.py", line 449, in result
return self.__get_result()
~~~~~~~~~~~~~~~~~^^
File "/usr/lib/python3.13/concurrent/futures/_base.py", line 401, in __get_result
raise self._exception
File "/usr/lib/python3.13/concurrent/futures/thread.py", line 59, in run
result = self.fn(*self.args, **self.kwargs)
File "main.py", line 77, in __call__
ret = self.session.get(API + endpoint)
File ".venv/lib/python3.13/site-packages/requests_cache/session.py", line 127, in get
return self.request('GET', url, params=params, **kwargs)
~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File ".venv/lib/python3.13/site-packages/requests_cache/session.py", line 183, in request
return super().request(method, url, *args, headers=headers, **kwargs) # type: ignore
~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File ".venv/lib/python3.13/site-packages/requests/sessions.py", line 589, in request
resp = self.send(prep, **send_kwargs)
File ".venv/lib/python3.13/site-packages/requests_cache/session.py", line 218, in send
cached_response = self.cache.get_response(actions.cache_key)
File ".venv/lib/python3.13/site-packages/requests_cache/backends/base.py", line 79, in get_response
response = self.responses[self.redirects[key]]
~~~~~~~~~~~~~~^^^^^
File ".venv/lib/python3.13/site-packages/requests_cache/backends/sqlite.py", line 312, in __getitem__
cur = con.execute(f'SELECT value FROM {self.table_name} WHERE key=?', (key,))
sqlite3.InterfaceError: bad parameter or other API misuse
This is with these versions:
Resolved 12 packages in 1ms
my project v0.1.0
└── requests-cache v1.2.1
├── attrs v25.3.0
├── cattrs v25.2.0
│ ├── attrs v25.3.0
│ └── typing-extensions v4.15.0
├── platformdirs v4.4.0
├── requests v2.32.5
│ ├── certifi v2025.8.3
│ ├── charset-normalizer v3.4.3
│ ├── idna v3.10
│ └── urllib3 v2.5.0
├── url-normalize v2.2.1
│ └── idna v3.10
└── urllib3 v2.5.0
and if I downgrade to 0.9.8 it works:
Resolved 12 packages in 600ms
Uninstalled 2 packages in 3ms
Installed 1 package in 4ms
- platformdirs==4.4.0
- requests-cache==1.2.1
+ requests-cache==0.9.8
uv run main.py
[... smooth scrolling status text...]
Output written to outfile.sqlite.
I structured the code mainly along your ThreadPool example because the local processing was fast enough and I only needed to speed up the somewhat slow api calls. So all the futures get consumed in a single thread sequentially, just like in your example. The relevant code:
class AssistantApi:
    def __init__(self):
        self.session = CachedSession(
            cache_name="AssistantNMS",
            expire_after=timedelta(days=1),
            allowable_codes=[200, 400],
            stale_if_error=True,
        )
        self.session.get_adapter(API).poolmanager.connection_pool_kw["maxsize"] = threadcount()

    def __call__(self, endpoint):
        ret = self.session.get(API + endpoint)
        ret.raise_for_status()
        return ret.json()

# [...]
print("====== Retrieving item information ======")
items = api("ItemInfo/GameId")  # this is the Api class above
with ThreadPoolExecutor(max_workers=threadcount()) as executor:
    fut_arg_map = {executor.submit(api, f"ItemInfo/GameId/{item}/en"): item for item in items}
    with db.con:
        db.con.execute("BEGIN")
        for i, fut in enumerate(as_completed(fut_arg_map)):
            item = fut_arg_map[fut]
threadcount() is 28 on my machine, so nothing too wild.
sqlite3.InterfaceError: bad parameter or other API misuse
I believe this issue has been fixed in main. While the root cause is still unknown, it was related to changes to the sqlite3 module in cpython 3.12. Try testing with the latest pre-release build (requests-cache==1.3.0a1) and let me know if that works for you.
Indeed, 1.3.0a1 works for me :)