unable to read duckdb file
Describe the bug
With cache_httpfs loaded, DuckDB is unable to read a DuckDB file over HTTP.
To Reproduce
FORCE INSTALL cache_httpfs FROM community;
LOAD cache_httpfs;
ATTACH 's3://duckdb-blobs/databases/stations.duckdb' AS stations_db;
Expected behavior
We should get an attached database called stations_db.
Screenshots
n/a
Desktop (please complete the following information):
- OS: osx arm64 / linux amd64
- duckdb: 1.2.2
Thank you @obarisk for the report! Do you mind also pasting the error message to the issue description if it isn't too hard? I will take a look later.
On Linux:
INTERNAL Error:
Attempted to dereference unique_ptr that is NULL!
Stack Trace:
/home/obarisk/.duckdb/extensions/v1.2.2/linux_amd64_gcc4/cache_httpfs.duckdb_extension(+0x58e0e3) [0x7fb0b478e0e3]
/home/obarisk/.duckdb/extensions/v1.2.2/linux_amd64_gcc4/cache_httpfs.duckdb_extension(+0x58e116) [0x7fb0b478e116]
/home/obarisk/.duckdb/extensions/v1.2.2/linux_amd64_gcc4/cache_httpfs.duckdb_extension(+0x58fdb1) [0x7fb0b478fdb1]
/home/obarisk/.duckdb/extensions/v1.2.2/linux_amd64_gcc4/cache_httpfs.duckdb_extension(+0x35033e) [0x7fb0b455033e]
/home/obarisk/.duckdb/extensions/v1.2.2/linux_amd64_gcc4/cache_httpfs.duckdb_extension(+0x35656f) [0x7fb0b455656f]
/home/obarisk/.duckdb/extensions/v1.2.2/linux_amd64_gcc4/cache_httpfs.duckdb_extension(+0x359b41) [0x7fb0b4559b41]
/home/obarisk/.duckdb/extensions/v1.2.2/linux_amd64_gcc4/cache_httpfs.duckdb_extension(+0x359d58) [0x7fb0b4559d58]
duckdb() [0x92fc74]
duckdb() [0xb82146]
duckdb() [0xd31f63]
duckdb() [0xd326a4]
duckdb() [0x14acb85]
duckdb() [0xbb8914]
duckdb() [0xbc3dbb]
duckdb() [0xbc40c8]
duckdb() [0xbba6be]
duckdb() [0xbbde84]
duckdb() [0xb78463]
duckdb() [0xb786b8]
duckdb() [0xb78845]
duckdb() [0x73dfbf]
duckdb() [0x72305d]
duckdb() [0x723606]
duckdb() [0x72411c]
duckdb() [0x724797]
duckdb() [0x716a13]
/lib/x86_64-linux-gnu/libc.so.6(+0x29ca8) [0x7fb0f4743ca8]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x85) [0x7fb0f4743d65]
duckdb() [0x71a2d7]
This error signals an assertion failure within DuckDB. This usually occurs due to unexpected conditions or errors in the program's logic.
For more information, see https://duckdb.org/docs/dev/internal_errors
On macOS it is stranger. We need to run
load httpfs;
load cache_httpfs;
ATTACH 's3://duckdb-blobs/databases/stations.duckdb' AS stations_db;
to get the following error
INTERNAL Error:
Attempted to dereference unique_ptr that is NULL!
Stack Trace:
0 _ZN6duckdb9ExceptionC2ENS_13ExceptionTypeERKNSt3__112basic_stringIcNS2_11char_traitsIcEENS2_9allocatorIcEEEE + 64
1 _ZN6duckdb17InternalExceptionC1ERKNSt3__112basic_stringIcNS1_11char_traitsIcEENS1_9allocatorIcEEEE + 20
2 _ZNK6duckdb10unique_ptrINS_10FileHandleENSt3__114default_deleteIS1_EELb1EEptEv + 132
3 _ZN6duckdb21CacheFileSystemHandleC2ENS_10unique_ptrINS_10FileHandleENSt3__114default_deleteIS2_EELb1EEERNS_15CacheFileSystemE + 44
4 _ZN6duckdb15CacheFileSystem28GetOrCreateFileHandleForReadERKNSt3__112basic_stringIcNS1_11char_traitsIcEENS1_9allocatorIcEEEENS_13FileOpenFlagsENS_12optional_ptrINS_10FileOpenerELb1EEE + 668
5 duckdb::VirtualFileSystem::OpenFile(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&, duckdb::FileOpenFlags, duckdb::optional_ptr<duckdb::FileOpener, true>) + 452
6 duckdb::WriteAheadLog::Replay(duckdb::FileSystem&, duckdb::AttachedDatabase&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char>> const&) + 72
7 duckdb::SingleFileStorageManager::LoadDatabase(duckdb::StorageOptions) + 636
8 duckdb::StorageManager::Initialize(duckdb::StorageOptions) + 80
9 duckdb::AttachedDatabase::Initialize(duckdb::StorageOptions) + 104
10 duckdb::PhysicalAttach::GetData(duckdb::ExecutionContext&, duckdb::DataChunk&, duckdb::OperatorSourceInput&) const + 600
11 duckdb::PipelineExecutor::FetchFromSource(duckdb::DataChunk&) + 124
12 duckdb::PipelineExecutor::Execute(unsigned long long) + 236
13 duckdb::PipelineTask::ExecuteTask(duckdb::TaskExecutionMode) + 236
14 duckdb::ExecutorTask::Execute(duckdb::TaskExecutionMode) + 192
15 duckdb::Executor::ExecuteTask(bool) + 252
16 duckdb::ClientContext::ExecuteTaskInternal(duckdb::ClientContextLock&, duckdb::BaseQueryResult&, bool) + 64
17 duckdb::PendingQueryResult::ExecuteInternal(duckdb::ClientContextLock&) + 60
18 duckdb::PendingQueryResult::Execute() + 56
19 duckdb_shell_sqlite3_print_duckbox + 368
20 duckdb_shell::ShellState::ExecutePreparedStatement(sqlite3_stmt*) + 932
21 duckdb_shell::ShellState::ExecuteSQL(char const*, char**) + 452
22 duckdb_shell::ShellState::RunOneSqlLine(char*) + 104
23 duckdb_shell::ShellState::ProcessInput() + 916
24 main + 3140
25 start + 6000
This error signals an assertion failure within DuckDB. This usually occurs due to unexpected conditions or errors in the program's logic.
For more information, see https://duckdb.org/docs/dev/internal_errors
Hi @obarisk, unfortunately I cannot reproduce the nullptr dereference issue; let's solve the other attachment issue first :)
If I attach my database (a remote DuckDB file on S3) before loading cache_httpfs, I can make some progress, but I soon get a segfault:
Fatal Python error: Segmentation fault
Thread 0x00000001f4c04800 (most recent call first):
File "/Users/erik/DropboxMaestral/home/git/duckdb-test/ducklake.py", line 635 in test_s3_ducklake_metadata
File "/Users/erik/DropboxMaestral/home/git/duckdb-test/.venv/lib/python3.11/site-packages/_pytest/python.py", line 157 in pytest_pyfunc_call
File "/Users/erik/DropboxMaestral/home/git/duckdb-test/.venv/lib/python3.11/site-packages/pluggy/_callers.py", line 121 in _multicall
File "/Users/erik/DropboxMaestral/home/git/duckdb-test/.venv/lib/python3.11/site-packages/pluggy/_manager.py", line 120 in _hookexec
File "/Users/erik/DropboxMaestral/home/git/duckdb-test/.venv/lib/python3.11/site-packages/pluggy/_hooks.py", line 512 in __call__
File "/Users/erik/DropboxMaestral/home/git/duckdb-test/.venv/lib/python3.11/site-packages/_pytest/python.py", line 1671 in runtest
File "/Users/erik/DropboxMaestral/home/git/duckdb-test/.venv/lib/python3.11/site-packages/_pytest/runner.py", line 178 in pytest_runtest_call
File "/Users/erik/DropboxMaestral/home/git/duckdb-test/.venv/lib/python3.11/site-packages/pluggy/_callers.py", line 121 in _multicall
File "/Users/erik/DropboxMaestral/home/git/duckdb-test/.venv/lib/python3.11/site-packages/pluggy/_manager.py", line 120 in _hookexec
File "/Users/erik/DropboxMaestral/home/git/duckdb-test/.venv/lib/python3.11/site-packages/pluggy/_hooks.py", line 512 in __call__
File "/Users/erik/DropboxMaestral/home/git/duckdb-test/.venv/lib/python3.11/site-packages/_pytest/runner.py", line 246 in <lambda>
File "/Users/erik/DropboxMaestral/home/git/duckdb-test/.venv/lib/python3.11/site-packages/_pytest/runner.py", line 344 in from_call
File "/Users/erik/DropboxMaestral/home/git/duckdb-test/.venv/lib/python3.11/site-packages/_pytest/runner.py", line 245 in call_and_report
File "/Users/erik/DropboxMaestral/home/git/duckdb-test/.venv/lib/python3.11/site-packages/_pytest/runner.py", line 136 in runtestprotocol
File "/Users/erik/DropboxMaestral/home/git/duckdb-test/.venv/lib/python3.11/site-packages/_pytest/runner.py", line 117 in pytest_runtest_protocol
File "/Users/erik/DropboxMaestral/home/git/duckdb-test/.venv/lib/python3.11/site-packages/pluggy/_callers.py", line 121 in _multicall
File "/Users/erik/DropboxMaestral/home/git/duckdb-test/.venv/lib/python3.11/site-packages/pluggy/_manager.py", line 120 in _hookexec
File "/Users/erik/DropboxMaestral/home/git/duckdb-test/.venv/lib/python3.11/site-packages/pluggy/_hooks.py", line 512 in __call__
File "/Users/erik/DropboxMaestral/home/git/duckdb-test/.venv/lib/python3.11/site-packages/_pytest/main.py", line 367 in pytest_runtestloop
File "/Users/erik/DropboxMaestral/home/git/duckdb-test/.venv/lib/python3.11/site-packages/pluggy/_callers.py", line 121 in _multicall
File "/Users/erik/DropboxMaestral/home/git/duckdb-test/.venv/lib/python3.11/site-packages/pluggy/_manager.py", line 120 in _hookexec
File "/Users/erik/DropboxMaestral/home/git/duckdb-test/.venv/lib/python3.11/site-packages/pluggy/_hooks.py", line 512 in __call__
File "/Users/erik/DropboxMaestral/home/git/duckdb-test/.venv/lib/python3.11/site-packages/_pytest/main.py", line 343 in _main
File "/Users/erik/DropboxMaestral/home/git/duckdb-test/.venv/lib/python3.11/site-packages/_pytest/main.py", line 289 in wrap_session
File "/Users/erik/DropboxMaestral/home/git/duckdb-test/.venv/lib/python3.11/site-packages/_pytest/main.py", line 336 in pytest_cmdline_main
File "/Users/erik/DropboxMaestral/home/git/duckdb-test/.venv/lib/python3.11/site-packages/pluggy/_callers.py", line 121 in _multicall
File "/Users/erik/DropboxMaestral/home/git/duckdb-test/.venv/lib/python3.11/site-packages/pluggy/_manager.py", line 120 in _hookexec
File "/Users/erik/DropboxMaestral/home/git/duckdb-test/.venv/lib/python3.11/site-packages/pluggy/_hooks.py", line 512 in __call__
File "/Users/erik/DropboxMaestral/home/git/duckdb-test/.venv/lib/python3.11/site-packages/_pytest/config/__init__.py", line 175 in main
File "/Users/erik/DropboxMaestral/home/git/duckdb-test/.venv/lib/python3.11/site-packages/_pytest/config/__init__.py", line 201 in console_main
File "/Users/erik/DropboxMaestral/home/git/duckdb-test/.venv/bin/pytest", line 10 in <module>
Extension modules: psutil._psutil_osx, psutil._psutil_posix, charset_normalizer.md, google._upb._message (total: 4)
macOS 26.0.1, DuckDB 1.4.1
Hi @erikcw thanks for the report!
- Could you file a separate issue? This looks different from the initial one.
- Could you please also provide a (minimal) script to reproduce it? That would help me debug. Thank you!
@dentiny
Hello, I encountered the same problem: cache_httpfs crashes (unique_ptr NULL) when ATTACH-ing a remote .duckdb file over S3 after LOAD cache_httpfs. The file is approximately 3.8G and has multiple tables.
Summary
When cache_httpfs is loaded (on-disk mode), ATTACH-ing a remote single-file DuckDB database over S3 crashes with:
INTERNAL Error: Attempted to dereference unique_ptr that is NULL!
This happens at ATTACH-time (before any SELECT), only if we ATTACH s3://…/*.duckdb after LOAD cache_httpfs.
If we (a) ATTACH a local path, or (b) DO NOT load cache_httpfs, the same code works.
Minimal Repro (Python)
import duckdb
con = duckdb.connect(database=":memory:")
con.execute("INSTALL httpfs; LOAD httpfs;")
# crash only when cache_httpfs is loaded before ATTACH
con.execute("INSTALL cache_httpfs FROM community; LOAD cache_httpfs;")
# on-disk cache config
con.execute("SET cache_httpfs_type='on_disk';")
con.execute("SET cache_httpfs_cache_directory='/tmp/duck_cache';")
# S3 auth (works, HEAD returns 200)
con.execute("SET s3_region='ap-northeast-1';")
con.execute("SET s3_url_style='path';") # also tried 'vhost'
con.execute("SET s3_use_ssl=true;")
con.execute("SET enable_http_logging=true;")
# Repro: ATTACH a .duckdb file on S3 (HEAD 200, GET range returns 206)
con.execute("ATTACH 's3://<bucket>/<prefix>/2025/10/xxxx_2025-10-15.duckdb' AS d20251015 (READ_ONLY)")
Expected
ATTACH succeeds; later queries fetch needed blocks via HTTP range requests and (optionally) cache on disk.
Actual
Crash with internal error from cache_httpfs extension:
INTERNAL Error: Attempted to dereference unique_ptr that is NULL!
Stack Trace: .../cache_httpfs.duckdb_extension FileSystem::OpenFile -> WriteAheadLog::Replay -> SingleFileStorageManager::LoadDatabase -> PhysicalAttach::GetData ...
HTTP log (abridged)
We see ATTACH triggers:
HEAD https://s3.ap-northeast-1.amazonaws.com/
GET Range: bytes=0-524287 → 206 Partial Content
Then the crash.
Notes & Workarounds
If we ATTACH a local path, there is no crash.
If we do not LOAD cache_httpfs, ATTACH s3://...duckdb also works.
If we ATTACH s3://...duckdb before loading cache_httpfs, it avoids the crash but then .duckdb file handles won’t benefit from cache_httpfs.
Version:
DuckDB core: 1.4.1 (linux_amd64) via Python 3.11
cache_httpfs: matching 1.4.1 (community extension)
S3: Amazon S3, region ap-northeast-1, server-side encryption AES256 enabled.
Environment
Docker linux/amd64, Python 3.11
DuckDB 1.4.1
cache_httpfs from community extensions 1.4.1
OS: Debian-based container
Reproducible consistently
Ask
Confirm whether cache_httpfs currently supports ATTACH-ing remote .duckdb files.
If yes, this crash looks like a null pointer deref in the extension; pointers on a fix or a known-good version would be appreciated.
If not supported yet, can we get a guard/clear error message instead of a crash?
Thanks!
@RyuuSetsuhi / @erikcw I put a fix here: https://github.com/dentiny/duck-read-cache-fs/pull/291 Thanks for reporting, and for your patience!
@erikcw FYI, the segfault comes from the following:
- When you load a database by attaching a remote database file, DuckDB reads not only the database file but also attempts to read other files, such as the WAL, which might not exist.
- An implementation detail: the cache filesystem logs and caches the file handle returned by the HTTP filesystem, which is NULL in this case, hence the segfault.
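To illustrate the failure mode described in the two points above, here is a minimal Python sketch. It is not the extension's code: `FakeRemoteFileSystem`, `CachingFileSystem`, `RemoteHandle`, the `allow_missing` flag, and the `example-bucket` paths are all hypothetical stand-ins for the HTTP filesystem, the cache layer, and an open mode that yields a null handle for a missing file. The only point is that the wrapper has to check for a missing handle before it logs or caches it.

```python
from typing import Dict, Optional, Set


class RemoteHandle:
    """Hypothetical stand-in for a handle returned by the HTTP filesystem."""

    def __init__(self, path: str) -> None:
        self.path = path


class FakeRemoteFileSystem:
    """Hypothetical inner filesystem: returns None for a missing file when allowed."""

    def __init__(self, existing: Set[str]) -> None:
        self.existing = existing

    def open(self, path: str, allow_missing: bool = False) -> Optional[RemoteHandle]:
        if path in self.existing:
            return RemoteHandle(path)
        if allow_missing:
            return None  # the "NULL handle" case, e.g. a WAL file that does not exist
        raise FileNotFoundError(path)


class CachingFileSystem:
    """Hypothetical cache layer that remembers handles opened by the inner filesystem."""

    def __init__(self, inner: FakeRemoteFileSystem) -> None:
        self.inner = inner
        self.handles: Dict[str, RemoteHandle] = {}

    def open(self, path: str, allow_missing: bool = False) -> Optional[RemoteHandle]:
        cached = self.handles.get(path)
        if cached is not None:
            return cached
        handle = self.inner.open(path, allow_missing)
        if handle is None:
            # Guard: there is nothing to log or cache, so pass the "no file" result through.
            return None
        self.handles[path] = handle
        return handle


fs = CachingFileSystem(FakeRemoteFileSystem({"s3://example-bucket/stations.duckdb"}))
db = fs.open("s3://example-bucket/stations.duckdb")
wal = fs.open("s3://example-bucket/stations.duckdb.wal", allow_missing=True)
print(db.path, wal)
```

Without the `if handle is None` check, the wrapper would touch a handle that does not exist, which is the Python analogue of the NULL `unique_ptr` dereference in the stack traces.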
Hi @RyuuSetsuhi, also a tip on using the extension.
The extension supports an exclusion list, which lets users and applications skip caching for certain files. I think a DuckDB database file might be a good candidate: (1) it's likely loaded only once; (2) it's large and takes up a lot of memory or disk space.
For example usage, see https://github.com/dentiny/duck-read-cache-fs/blob/615f4f597a2285b87b5d7c1420a750c89d9a4c42/test/sql/cache_exclusion.test#L15-L16
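As a rough illustration of the exclusion-list idea (not the extension's actual interface, for which the linked test is the reference), the sketch below assumes that exclusion entries are regular expressions matched against the full file path. The function name `should_cache`, the pattern, and the `example-bucket` paths are hypothetical.

```python
import re
from typing import Sequence


def should_cache(path: str, exclusion_patterns: Sequence[str]) -> bool:
    """Return False when any exclusion pattern matches the whole path."""
    return not any(re.fullmatch(pattern, path) for pattern in exclusion_patterns)


# Hypothetical exclusion list: skip caching for DuckDB database files.
exclusions = [r".*\.duckdb"]

print(should_cache("s3://example-bucket/db/stations.duckdb", exclusions))   # False
print(should_cache("s3://example-bucket/data/part-0.parquet", exclusions))  # True
```

With a rule like this, a large database file that is attached once bypasses the cache, while other remote files are still cached.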
I will close the issue for now. Feel free to re-open it or create a new issue if the problem persists. Thank you all!