
Segfault when accessing sqlite cache from many threads

Open danielhollas opened this issue 1 year ago • 9 comments

The problem

Our application started crashing with a segfault ( :scream: ) after we upgraded requests_cache from 0.9.8 to a newer version (>=1.0.0). Our application uses the "patching" approach (requests_cache.install_cache). The crash seems to be caused by accessing the cache from multiple threads (see reproducer below).

I've bisected the issue to this commit: https://github.com/requests-cache/requests-cache/commit/c83e4e523008337a6f8f638e65ca217177d8ed5c Its commit message is telling:

Share SQLite connection objects among threads and use lock for write operations instead of using thread-local connections
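To make the two strategies named in that commit message concrete, here is a minimal sketch using only the standard-library sqlite3 module. It is not the requests-cache implementation: the class names, the `responses` table, and the temporary database path are all hypothetical, and the shared-connection variant locks reads as well as writes, which is more conservative than what the commit message describes.

```python
import sqlite3
import tempfile
import threading
from pathlib import Path

# Hypothetical throwaway database; table and column names are illustrative only.
DB_PATH = Path(tempfile.mkdtemp()) / "cache_sketch.db"
SCHEMA = "CREATE TABLE IF NOT EXISTS responses (key TEXT PRIMARY KEY, value TEXT)"
INSERT = "INSERT OR REPLACE INTO responses VALUES (?, ?)"
SELECT = "SELECT value FROM responses WHERE key=?"


def init_db(path):
    """Create the table once, up front, from a single thread."""
    con = sqlite3.connect(str(path))
    with con:
        con.execute(SCHEMA)
    con.close()


class ThreadLocalStore:
    """Strategy 1: each thread lazily opens and keeps its own connection."""

    def __init__(self, path):
        self._path = str(path)
        self._local = threading.local()

    def _connection(self):
        con = getattr(self._local, "con", None)
        if con is None:
            con = sqlite3.connect(self._path, timeout=30)
            self._local.con = con
        return con

    def set(self, key, value):
        con = self._connection()
        with con:  # commits on success, rolls back on error
            con.execute(INSERT, (key, value))

    def get(self, key):
        row = self._connection().execute(SELECT, (key,)).fetchone()
        return row[0] if row else None


class SharedLockedStore:
    """Strategy 2: one connection for all threads, serialized by a lock."""

    def __init__(self, path):
        # check_same_thread=False allows other threads to use this connection.
        self._con = sqlite3.connect(str(path), check_same_thread=False)
        self._lock = threading.Lock()

    def set(self, key, value):
        with self._lock, self._con:
            self._con.execute(INSERT, (key, value))

    def get(self, key):
        # Assumption: reads are locked too, to keep the sketch simple and safe.
        with self._lock:
            row = self._con.execute(SELECT, (key,)).fetchone()
        return row[0] if row else None


def write_from_threads(store, prefix, num_threads=4, writes_per_thread=5):
    """Write a few keys from several threads and wait for all of them."""

    def worker(thread_id):
        for i in range(writes_per_thread):
            store.set(f"{prefix}-t{thread_id}-k{i}", f"value-{i}")

    threads = [threading.Thread(target=worker, args=(n,)) for n in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


init_db(DB_PATH)
```

With thread-local connections, no connection object is ever touched by two threads. With a shared connection, correctness depends on every code path honoring the lock and on nothing closing the connection while another thread is still using it.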

Running python with gdb gave us a clue.

  File "/home/hollas/atmospec/aiidalab-home/segfault.py", line 3, in <module>
    AiidaLabAppStore()
  File "/home/hollas/atmospec/aiidalab-home/home/app_store.py", line 76, in __init__
    AiidaLabApp(app_id, None, None) for app_id in self.index["apps"]
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/hollas/atmospec/aiidalab-home/.venv/lib64/python3.12/site-packages/aiidalab/app.py", line 754, in __init__
[Detaching after vfork from child process 865150]
    self._app = _AiidaLabApp.from_id(
                ^^^^^^^^^^^^^^^^^^^^^
  File "/home/hollas/atmospec/aiidalab-home/.venv/lib64/python3.12/site-packages/aiidalab/app.py", line 130, in from_id
    remote_registry_entry = load_app_registry_entry(app_id)
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/hollas/atmospec/aiidalab-home/.venv/lib64/python3.12/site-packages/aiidalab/utils.py", line 70, in load_app_registry_entry
    return requests.get(f"{AIIDALAB_REGISTRY}/apps/{app_id}.json").json()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/hollas/atmospec/aiidalab-home/.venv/lib64/python3.12/site-packages/requests/api.py", line 73, in get
    return request("get", url, params=params, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/hollas/atmospec/aiidalab-home/.venv/lib64/python3.12/site-packages/requests/api.py", line 59, in request
    return session.request(method=method, url=url, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/hollas/atmospec/aiidalab-home/.venv/lib64/python3.12/site-packages/requests_cache/session.py", line 183, in request
    return super().request(method, url, *args, headers=headers, **kwargs)  # type: ignore
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/hollas/atmospec/aiidalab-home/.venv/lib64/python3.12/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/hollas/atmospec/aiidalab-home/.venv/lib64/python3.12/site-packages/requests_cache/session.py", line 218, in send
    cached_response = self.cache.get_response(actions.cache_key)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/hollas/atmospec/aiidalab-home/.venv/lib64/python3.12/site-packages/requests_cache/backends/base.py", line 77, in get_response
    response = self.responses.get(key)
               ^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen _collections_abc>", line 807, in get
  File "/home/hollas/atmospec/aiidalab-home/.venv/lib64/python3.12/site-packages/requests_cache/backends/sqlite.py", line 313, in __getitem__
    row = cur.fetchone()
          ^^^^^^^^^^^^^^
sqlite3.ProgrammingError: Cannot operate on a closed database.

Other similar stack traces are sometimes produced, as documented in https://github.com/aiidalab/aiidalab/issues/446 where I did the original investigation.

Expected behavior

The cache should be threadsafe and not lead to a crash.

Steps to reproduce the behavior

Minimal reproducer

from threading import Thread

import requests
import requests_cache

requests_cache.install_cache(
    cache_name="segfault_reproducer",
    use_cache_dir=True,  # store cache in ~/.cache/
    backend="sqlite",
    expire_after=60,  # seconds
)

def load_app_registry_index():
    try:
        return requests.get("https://aiidalab.github.io/aiidalab-registry/api/v1/apps_index.json").json()
    except (ValueError, requests.ConnectionError) as error:
        raise RuntimeError("Unable to load registry index") from error

num_threads = 50
threads = []
for _ in range(num_threads):
    t = Thread(target=load_app_registry_index)
    t.start()
    threads.append(t)

Workarounds

Using the CachedSession approach seems to fix the issue (as does using the memory backend instead of sqlite).

Environment

  • requests-cache version: reproduced since 1.0.0 and on latest main
  • Python version: 3.9, 3.12
  • Platform: Linux (Fedora 39 and Ubuntu 20.04)

danielhollas avatar Aug 12 '24 14:08 danielhollas

Thanks for the detailed report and reproducible example! Unfortunately, install_cache() is just not compatible with multithreading, and there are warnings about it most places it's mentioned in the docs. Are you able to use CachedSession in your application instead?

The main problems are:

  • Race condition with patching/unpatching requests.Session
  • The static requests API (requests.get(), etc.) internally creates a new session object for every request. That doesn't behave well when attaching a cache to the session, since it's constantly opening and closing connections to the storage backend.
  • The Cannot operate on a closed database error in your traceback is one of the errors you can expect to see in this case. database is locked is another one.
  • Even if it doesn't raise an exception, patching + multithreading is likely to fail in other ways, like very low cache hit rate. While the specific points of failure have changed, that's the case for all versions of requests-cache going back to the first release!
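The "closed database" failure mode from the list above can be reproduced deterministically with the standard-library sqlite3 module alone. This is a single-threaded sketch, not requests-cache code: the explicit `close()` call stands in for another thread or short-lived session tearing down a shared connection between a query and its fetch, and the table name and key are hypothetical.

```python
import sqlite3


def fetch_after_close():
    """Return the error message raised when a cursor outlives its connection."""
    con = sqlite3.connect(":memory:", check_same_thread=False)
    con.execute("CREATE TABLE responses (key TEXT PRIMARY KEY, value TEXT)")
    con.execute("INSERT INTO responses VALUES (?, ?)", ("some-key", "some-value"))

    # The query is issued while the connection is still open...
    cur = con.execute("SELECT value FROM responses WHERE key=?", ("some-key",))

    # ...then something else closes the shared connection (simulated here).
    con.close()

    try:
        cur.fetchone()
    except sqlite3.ProgrammingError as error:
        return str(error)
    return None


message = fetch_after_close()
print(message)
```

In a multithreaded program the close happens at an unpredictable moment, which is why the same underlying problem can show up at different lines from one run to the next.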

Related: #870, #813, #514, #213, #135

JWCook avatar Aug 12 '24 16:08 JWCook

Hi,

thanks for taking a look! I think if that's the case the docs could make this point stronger, specifically

and there are warnings

To quote:

There are some scenarios where patching requests with install_cache() is not ideal:

"Not ideal" to me does not equal segfault. :-)

places it's mentioned in the docs.

Here the docs only warn that "These functions are not thread-safe." But to me that implies that the reproducer is valid, since requests_cache.patcher.install_cache is not called from multiple threads; only the subsequent requests.get calls are.

Are you able to use CachedSession in your application instead?

Yep, no problem, we were aiming to do that anyway.

danielhollas avatar Aug 12 '24 18:08 danielhollas

That's fair. I'll try to clarify that in the docs some more.

does not equal segfault

Your traceback looks like a regular python exception from the sqlite3 standard library module. A segmentation fault would be invalid memory access in the underlying SQLite C library, which is much less fun to debug!
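One way to tell the two cases apart without attaching gdb is the standard-library faulthandler module, which prints the Python-level stack of each thread when the process receives a fatal signal such as SIGSEGV. The sketch below assumes it is placed at the top of the reproducer script; the helper function name is made up for illustration.

```python
import faulthandler
import sys

# On a fatal signal (SIGSEGV, SIGFPE, SIGABRT, SIGBUS, SIGILL), dump the
# Python traceback of every thread to stderr before the process dies.
faulthandler.enable(file=sys.stderr, all_threads=True)


def crash_reporting_enabled():
    """Report whether the fault handler is currently installed."""
    return faulthandler.is_enabled()


print("faulthandler enabled:", crash_reporting_enabled())
```

The same handler can be switched on without editing the script by running `python -X faulthandler segfault.py`. An ordinary sqlite3.ProgrammingError still shows up as a normal traceback, while a real segfault produces a "Fatal Python error: Segmentation fault" dump instead.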

JWCook avatar Aug 12 '24 19:08 JWCook

Your traceback looks like a regular python exception from the sqlite3 standard library module. A segmentation fault would be invalid memory access in the underlying SQLite C library, which is much less fun to debug!

That's the thing though: most of the time I am getting an actual segfault. Please try to run the reproducer yourself several times.

image

(Not saying this is necessarily a problem with this library; discovering a segfault in the stdlib would certainly be more exciting!)

danielhollas avatar Aug 12 '24 20:08 danielhollas

Hmm, I ran it a few more times in the debugger and managed to catch the segfault. It looks like a null pointer dereference.

image

Not sure if this is enough to report to cpython

danielhollas avatar Aug 12 '24 20:08 danielhollas

Ah! You're right, now I see it. Looks like that's happening in sqlite3_last_insert_rowid().

Full trace:
#0  0x00007ffff4a68824 in sqlite3_last_insert_rowid () from /lib/x86_64-linux-gnu/libsqlite3.so.0
#1  0x00007ffff4b44bcf in _pysqlite_query_execute (self=self@entry=0x7ffff49b2a40, multiple=multiple@entry=0, operation=operation@entry=0x7ffff4c3af10, second_argument=<optimized out>, second_argument@entry=0x7ffff4924e20)
at /tmp/python-build.20221123142401.1913/Python-3.11.0/Modules/_sqlite/cursor.c:968
#2  0x00007ffff4b3dde0 in pysqlite_connection_execute_impl (parameters=0x7ffff4924e20, sql=<optimized out>, self=0x7ffff4c17790) at /tmp/python-build.20221123142401.1913/Python-3.11.0/Modules/_sqlite/connection.c:1679
#3  pysqlite_connection_execute (self=0x7ffff4c17790, args=0x7ffff40477b0, nargs=<optimized out>) at /tmp/python-build.20221123142401.1913/Python-3.11.0/Modules/_sqlite/clinic/connection.c.h:709
#4  0x0000555555647fb6 in _PyEval_EvalFrameDefault (tstate=<optimized out>, frame=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:5326
#5  0x00005555557b55e1 in _PyEval_EvalFrame (throwflag=0, frame=0x7ffff4047728, tstate=0x5555561672f0) at ./Include/internal/pycore_ceval.h:73
#6  _PyEval_Vector (tstate=0x5555561672f0, func=<optimized out>, locals=<optimized out>, args=<optimized out>, argcount=<optimized out>, kwnames=<optimized out>) at Python/ceval.c:6428
#7  0x0000555555726f9c in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=<optimized out>, args=<optimized out>, callable=0x7ffff4c4a3e0, tstate=0x5555561672f0) at ./Include/internal/pycore_call.h:92
#8  vectorcall_unbound (nargs=2, args=<optimized out>, func=0x7ffff4c4a3e0, unbound=<optimized out>, tstate=0x5555561672f0) at Objects/typeobject.c:1650
#9  vectorcall_method (nargs=2, args=0x7fff4e7fb050, name=0x555555ac6b08 <_PyRuntime+26024>) at Objects/typeobject.c:1681
#10 slot_mp_subscript (self=<optimized out>, arg1=<optimized out>) at Objects/typeobject.c:7402
#11 0x000055555564a8e1 in _PyEval_EvalFrameDefault (tstate=<optimized out>, frame=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:2144
#12 0x00005555557b55e1 in _PyEval_EvalFrame (throwflag=0, frame=0x7ffff4047550, tstate=0x5555561672f0) at ./Include/internal/pycore_ceval.h:73
#13 _PyEval_Vector (tstate=0x5555561672f0, func=<optimized out>, locals=<optimized out>, args=<optimized out>, argcount=<optimized out>, kwnames=<optimized out>) at Python/ceval.c:6428
#14 0x00005555556af1e8 in _PyObject_VectorcallTstate (kwnames=0x7ffff4981900, nargsf=2, args=0x7fffec7e9c30, callable=0x7ffff48d84a0, tstate=0x5555561672f0) at ./Include/internal/pycore_call.h:92
#15 method_vectorcall (method=<optimized out>, args=0x7fffec7e9c38, nargsf=<optimized out>, kwnames=0x7ffff4981900) at Objects/classobject.c:59
#16 0x00005555556ac0f1 in PyObject_Call () at Objects/call.c:349
#17 0x0000555555648199 in do_call_core (use_tracing=<optimized out>, kwdict=0x7fffec7e9b80, callargs=0x7ffff49252d0, func=0x7fffec7e96c0, tstate=<optimized out>) at Python/ceval.c:7350
#18 _PyEval_EvalFrameDefault (tstate=<optimized out>, frame=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:5377
#19 0x00005555557b55e1 in _PyEval_EvalFrame (throwflag=0, frame=0x7ffff40473f8, tstate=0x5555561672f0) at ./Include/internal/pycore_ceval.h:73
#20 _PyEval_Vector (tstate=0x5555561672f0, func=<optimized out>, locals=<optimized out>, args=<optimized out>, argcount=<optimized out>, kwnames=<optimized out>) at Python/ceval.c:6428
#21 0x00005555556af1e8 in _PyObject_VectorcallTstate (kwnames=0x7fffec7e8180, nargsf=3, args=0x7fffec74fe10, callable=0x7ffff68c6d40, tstate=0x5555561672f0) at ./Include/internal/pycore_call.h:92
#22 method_vectorcall (method=<optimized out>, args=0x7fffec74fe18, nargsf=<optimized out>, kwnames=0x7fffec7e8180) at Objects/classobject.c:59
#23 0x00005555556ac0f1 in PyObject_Call () at Objects/call.c:349
#24 0x0000555555648199 in do_call_core (use_tracing=<optimized out>, kwdict=0x7ffff4940b00, callargs=0x7fffec7dfdc0, func=0x7fffec7e95c0, tstate=<optimized out>) at Python/ceval.c:7350
#25 _PyEval_EvalFrameDefault (tstate=<optimized out>, frame=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:5377
#26 0x00005555557b55e1 in _PyEval_EvalFrame (throwflag=0, frame=0x7ffff4047320, tstate=0x5555561672f0) at ./Include/internal/pycore_ceval.h:73
#27 _PyEval_Vector (tstate=0x5555561672f0, func=<optimized out>, locals=<optimized out>, args=<optimized out>, argcount=<optimized out>, kwnames=<optimized out>) at Python/ceval.c:6428
#28 0x00005555556af1e8 in _PyObject_VectorcallTstate (kwnames=0x7ffff49a6c80, nargsf=1, args=0x7fffec7c2450, callable=0x7ffff48d8400, tstate=0x5555561672f0) at ./Include/internal/pycore_call.h:92
#29 method_vectorcall (method=<optimized out>, args=0x7fffec7c2458, nargsf=<optimized out>, kwnames=0x7ffff49a6c80) at Objects/classobject.c:59
#30 0x00005555556ac0f1 in PyObject_Call () at Objects/call.c:349
#31 0x0000555555648199 in do_call_core (use_tracing=<optimized out>, kwdict=0x7fffec772fc0, callargs=0x555555aceb78 <_PyRuntime+58904>, func=0x7fffec7e89c0, tstate=<optimized out>) at Python/ceval.c:7350
#32 _PyEval_EvalFrameDefault (tstate=<optimized out>, frame=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:5377
#33 0x00005555557b55e1 in _PyEval_EvalFrame (throwflag=0, frame=0x7ffff4047280, tstate=0x5555561672f0) at ./Include/internal/pycore_ceval.h:73
#34 _PyEval_Vector (tstate=0x5555561672f0, func=<optimized out>, locals=<optimized out>, args=<optimized out>, argcount=<optimized out>, kwnames=<optimized out>) at Python/ceval.c:6428
#35 0x00005555556ac0f1 in PyObject_Call () at Objects/call.c:349
#36 0x0000555555648199 in do_call_core (use_tracing=<optimized out>, kwdict=0x7ffff49a7080, callargs=0x7fffec7e8ac0, func=0x7ffff6881940, tstate=<optimized out>) at Python/ceval.c:7350
#37 _PyEval_EvalFrameDefault (tstate=<optimized out>, frame=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:5377
#38 0x00005555557b55e1 in _PyEval_EvalFrame (throwflag=0, frame=0x7ffff4047188, tstate=0x5555561672f0) at ./Include/internal/pycore_ceval.h:73
#39 _PyEval_Vector (tstate=0x5555561672f0, func=<optimized out>, locals=<optimized out>, args=<optimized out>, argcount=<optimized out>, kwnames=<optimized out>) at Python/ceval.c:6428
#40 0x0000555555648199 in do_call_core (use_tracing=<optimized out>, kwdict=0x7ffff49cf2c0, callargs=0x555555aceb78 <_PyRuntime+58904>, func=0x7ffff48d9580, tstate=<optimized out>) at Python/ceval.c:7350
#41 _PyEval_EvalFrameDefault (tstate=<optimized out>, frame=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:5377
#42 0x00005555557b55e1 in _PyEval_EvalFrame (throwflag=0, frame=0x7ffff4047020, tstate=0x5555561672f0) at ./Include/internal/pycore_ceval.h:73
#43 _PyEval_Vector (tstate=0x5555561672f0, func=<optimized out>, locals=<optimized out>, args=<optimized out>, argcount=<optimized out>, kwnames=<optimized out>) at Python/ceval.c:6428
#44 0x00005555556af284 in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=1, args=0x7fff4e7fbda8, callable=0x7ffff7808860, tstate=0x5555561672f0) at ./Include/internal/pycore_call.h:92
#45 method_vectorcall (method=<optimized out>, args=0x555555aceb90 <_PyRuntime+58928>, nargsf=<optimized out>, kwnames=0x0) at Objects/classobject.c:67
#46 0x0000555555887a77 in thread_run (boot_raw=0x7fffec74fde0) at ./Modules/_threadmodule.c:1082
#47 0x0000555555816c9b in pythread_wrapper (arg=<optimized out>) at Python/thread_pthread.h:241
#48 0x00007ffff7d32ac3 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#49 0x00007ffff7dc4850 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

That's certainly a new and exciting way for things to break. There's likely not much I can do about that particular case, but if you run into similar issues with multithreading + CachedSession, let me know and I can dig into it some more.

JWCook avatar Aug 12 '24 21:08 JWCook

Attempting to use CachedSession from ThreadPoolExecutor (~360 concurrent requests, but only 120 parallel ones due to the number of logical processors on this machine), I also hit multi-threading problems (where pinning to 0.9.8 made the failures go away).

The first symptom encountered was the following error printed to stdout without triggering an exception:

Unable to deserialize response: a bytes-like object is required, not 'NoneType'

The subsequent terminal failures were then of the form:

Traceback (most recent call last):
  File "/home/acoghlan/devel/free-threaded-wheels/generate.py", line 28, in <module>
    main(args.number)
  File "/home/acoghlan/devel/free-threaded-wheels/generate.py", line 12, in main
    packages = get_annotated_packages(get_top_packages(), to_chart)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/acoghlan/devel/free-threaded-wheels/utils.py", line 118, in get_annotated_packages
    package = scan_future.result()
              ^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.12/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/usr/lib64/python3.12/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/usr/lib64/python3.12/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/acoghlan/devel/free-threaded-wheels/utils.py", line 97, in scan_package
    annotate_package(package)
  File "/home/acoghlan/devel/free-threaded-wheels/utils.py", line 36, in annotate_package
    response = SESSION.get(url)
               ^^^^^^^^^^^^^^^^
  File "/home/acoghlan/devel/free-threaded-wheels/.venv/lib64/python3.12/site-packages/requests_cache/session.py", line 127, in get
    return self.request('GET', url, params=params, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/acoghlan/devel/free-threaded-wheels/.venv/lib64/python3.12/site-packages/requests_cache/session.py", line 183, in request
    return super().request(method, url, *args, headers=headers, **kwargs)  # type: ignore
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/acoghlan/devel/free-threaded-wheels/.venv/lib64/python3.12/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/acoghlan/devel/free-threaded-wheels/.venv/lib64/python3.12/site-packages/requests_cache/session.py", line 218, in send
    cached_response = self.cache.get_response(actions.cache_key)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/acoghlan/devel/free-threaded-wheels/.venv/lib64/python3.12/site-packages/requests_cache/backends/base.py", line 77, in get_response
    response = self.responses.get(key)
               ^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen _collections_abc>", line 807, in get
  File "/home/acoghlan/devel/free-threaded-wheels/.venv/lib64/python3.12/site-packages/requests_cache/backends/sqlite.py", line 312, in __getitem__
    cur = con.execute(f'SELECT value FROM {self.table_name} WHERE key=?', (key,))
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
sqlite3.InterfaceError: bad parameter or other API misuse

While we ended up not using it for other reasons, https://github.com/hugovk/free-threaded-wheels/pull/16/ has the code where I ran into this. The relevant snippet:

SESSION = requests_cache.CachedSession("requests-cache", expire_after=60 * 60)

def annotate_package(package):
    # Uses SESSION.get to query PyPI for more info about the package
    ...

def scan_package(package):
    annotate_package(package)
    return package

def get_annotated_packages(packages, limit):
    annotated_packages = []
    packages = list(omit_deprecated(packages))
    with ThreadPoolExecutor() as pool:
        submitted_scans = []
        # Need to scan at least "limit" packages
        for package in packages[:limit]:
            submitted_scans.append(pool.submit(scan_package, package))

        for index, scan_future in enumerate(submitted_scans):
            package = scan_future.result()
            # ... package processing code isn't relevant here ...

(Disabling the time based expiry didn't make any difference to the failures)

ncoghlan avatar Mar 19 '25 11:03 ncoghlan

Thanks for the details.

Unable to deserialize response: a bytes-like object is required, not 'NoneType'

In most cases this is something you can ignore. This usually happens after upgrading (or downgrading) requests-cache. If the serialization format changes, or if a previously cached response otherwise can't be deserialized, it will log the error and just fetch a new response. However, if this happened with a new cache and only with concurrency, that could be a different issue.

sqlite3.InterfaceError: bad parameter or other API misuse

This is a bug that popped up specifically with Python 3.12 and concurrent usage of the SQLite backend: #845. I've made a few attempts at debugging it, but so far this one has me stumped! I will take another look at this either this weekend or next week, but any help would be greatly appreciated.

JWCook avatar Mar 20 '25 17:03 JWCook

Thank you for pointing out that version 0.9.8 works. I have downgraded to this version in order to have my flask-based gunicorn app working.

fredericoschardong avatar May 01 '25 20:05 fredericoschardong

I can confirm that this problem still exists with requests-cache 1.2.1. I see it in an application that makes a ton of rapid fire API calls and goes through a simple thread pool and a requests-cache session (no patching):

uv run main.py
[app does stuff, some requests succeed, then no output for a while and heavy disk activity]

  File "main.py", line 199, in <module>
    main()
    ~~~~^^
  File "main.py", line 118, in main
    detail = fut.result()
  File "/usr/lib/python3.13/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
           ~~~~~~~~~~~~~~~~~^^
  File "/usr/lib/python3.13/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/usr/lib/python3.13/concurrent/futures/thread.py", line 59, in run
    result = self.fn(*self.args, **self.kwargs)
  File "main.py", line 77, in __call__
    ret = self.session.get(API + endpoint)
  File ".venv/lib/python3.13/site-packages/requests_cache/session.py", line 127, in get
    return self.request('GET', url, params=params, **kwargs)
           ~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".venv/lib/python3.13/site-packages/requests_cache/session.py", line 183, in request
    return super().request(method, url, *args, headers=headers, **kwargs)  # type: ignore
           ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".venv/lib/python3.13/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
  File ".venv/lib/python3.13/site-packages/requests_cache/session.py", line 218, in send
    cached_response = self.cache.get_response(actions.cache_key)
  File ".venv/lib/python3.13/site-packages/requests_cache/backends/base.py", line 79, in get_response
    response = self.responses[self.redirects[key]]
                              ~~~~~~~~~~~~~~^^^^^
  File ".venv/lib/python3.13/site-packages/requests_cache/backends/sqlite.py", line 312, in __getitem__
    cur = con.execute(f'SELECT value FROM {self.table_name} WHERE key=?', (key,))
sqlite3.InterfaceError: bad parameter or other API misuse

This is with these versions:

Resolved 12 packages in 1ms
my project v0.1.0
└── requests-cache v1.2.1
    ├── attrs v25.3.0
    ├── cattrs v25.2.0
    │   ├── attrs v25.3.0
    │   └── typing-extensions v4.15.0
    ├── platformdirs v4.4.0
    ├── requests v2.32.5
    │   ├── certifi v2025.8.3
    │   ├── charset-normalizer v3.4.3
    │   ├── idna v3.10
    │   └── urllib3 v2.5.0
    ├── url-normalize v2.2.1
    │   └── idna v3.10
    └── urllib3 v2.5.0

and if I downgrade to 0.9.8 it works:

Resolved 12 packages in 600ms
Uninstalled 2 packages in 3ms
Installed 1 package in 4ms
 - platformdirs==4.4.0
 - requests-cache==1.2.1
 + requests-cache==0.9.8

uv run main.py
[... smooth scrolling status text...]
Output written to outfile.sqlite.

I structured the code mainly along your ThreadPool example because the local processing was fast enough and I only needed to speed up the somewhat slow api calls. So all the futures get consumed in a single thread sequentially, just like in your example. The relevant code:

class AssistantApi:
    def __init__(self):
        self.session = CachedSession(
            cache_name="AssistantNMS",
            expire_after=timedelta(days=1),
            allowable_codes=[200, 400],
            stale_if_error=True,
        )
        self.session.get_adapter(API).poolmanager.connection_pool_kw["maxsize"]=threadcount()

    def __call__(self, endpoint):
        ret = self.session.get(API + endpoint)
        ret.raise_for_status()
        return ret.json()

# [...]

    print("====== Retrieving item information ======")
    items = api("ItemInfo/GameId") # this is the Api class above
    with ThreadPoolExecutor(max_workers=threadcount()) as executor:
        fut_arg_map = {executor.submit(api, f"ItemInfo/GameId/{item}/en"): item for item in items}

        with db.con:
            db.con.execute("BEGIN")
            for i, fut in enumerate(as_completed(fut_arg_map)):
                item = fut_arg_map[fut]

threadcount() is 28 on my machine, so nothing too wild.

dtrauma avatar Sep 26 '25 00:09 dtrauma

sqlite3.InterfaceError: bad parameter or other API misuse

I believe this issue has been fixed in main. While the root cause is still unknown, it was related to changes to the sqlite3 module in cpython 3.12. Try testing with the latest pre-release build (requests-cache==1.3.0a1) and let me know if that works for you.

JWCook avatar Sep 27 '25 16:09 JWCook

Indeed, 1.3.0a1 works for me :)

dtrauma avatar Sep 27 '25 17:09 dtrauma