duckdb-r icon indicating copy to clipboard operation
duckdb-r copied to clipboard

Undefined behavior with list columns

Open krlmlr opened this issue 10 months ago • 12 comments

To reproduce:

  • Clone https://github.com/krlmlr/r-4-duckdb
  • Run docker run --rm -ti --platform linux/amd64 -v $(pwd):/test ghcr.io/krlmlr/r-4-duckdb:main R -q -f test/test.R

Symptoms:

> duckdb::dbAppendTable(con, "test_list", test_list_tbl)
Error: C stack usage  285199482356 is too close to the limit
Error: rapi_execute: Failed to run query
Error: Invalid Error: std::exception
Execution halted

The number varies, it is high on gitpod and on another Linux host (bare metal), hundreds of MB on macOS + colima, the problem can't be reproduced with Docker Desktop on macOS.

Need to understand if this is specific to R 4.0.5: krlmlr/r-4-duckdb#1.

krlmlr avatar Apr 11 '24 14:04 krlmlr

Also an issue with R 4.3.3.

krlmlr avatar Apr 11 '24 16:04 krlmlr

Running with valgrind:

> dbWriteTable(con, "test_list_2_alternative", test_list_2_tbl)
Error: C stack usage  134889705540 is too close to the limit
Error: rapi_execute: Failed to run query
Error: Invalid Error: std::exception
Execution halted
==692== Syscall param pwrite64(buf) points to uninitialised byte(s)
==692==    at 0x4DD883F: __libc_pwrite64 (pwrite64.c:25)
==692==    by 0x4DD883F: pwrite (pwrite64.c:23)
==692==    by 0x925A96D0: duckdb::LocalFileSystem::Write(duckdb::FileHandle&, void*, long, unsigned long) (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==692==    by 0x9334A773: duckdb::SingleFileBlockManager::ChecksumAndWrite(duckdb::FileBuffer&, unsigned long) const (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==692==    by 0x93373080: duckdb::BlockManager::ConvertToPersistent(long, std::shared_ptr<duckdb::BlockHandle>) (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==692==    by 0x934269F2: duckdb::ColumnSegment::ConvertToPersistent(duckdb::optional_ptr<duckdb::BlockManager>, long) (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==692==    by 0x93430F04: duckdb::PartialBlockForCheckpoint::Flush(unsigned long) (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==692==    by 0x9334F776: duckdb::PartialBlockManager::FlushPartialBlocks() (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==692==    by 0x9336536A: duckdb::SingleFileCheckpointWriter::CreateCheckpoint() (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==692==    by 0x93365661: duckdb::SingleFileStorageManager::CreateCheckpoint(bool, bool) (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==692==    by 0x930E73C5: duckdb::AttachedDatabase::Close() (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==692==    by 0x930E7577: duckdb::DatabaseManager::ResetDatabases(duckdb::unique_ptr<duckdb::TaskScheduler, std::default_delete<duckdb::TaskScheduler>, true>&) (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==692==    by 0x931021B8: duckdb::DatabaseInstance::~DatabaseInstance() (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==692==  Address 0xa865e731 is 1 bytes inside a block of size 262,144 alloc'd
==692==    at 0x4848899: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==692==    by 0x9259DB4B: duckdb::Allocator::AllocateData(unsigned long) (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==692==    by 0x9259DC5F: duckdb::FileBuffer::ReallocBuffer(unsigned long) (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==692==    by 0x9259DC90: duckdb::FileBuffer::Resize(unsigned long) (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==692==    by 0x93345A44: duckdb::StandardBufferManager::ConstructManagedBuffer(unsigned long, duckdb::unique_ptr<duckdb::FileBuffer, std::default_delete<duckdb::FileBuffer>, true>&&, duckdb::FileBufferType) (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==692==    by 0x93358E6E: duckdb::StandardBufferManager::RegisterMemory(duckdb::MemoryTag, unsigned long, bool) (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==692==    by 0x933593C5: duckdb::StandardBufferManager::Allocate(duckdb::MemoryTag, unsigned long, bool, std::shared_ptr<duckdb::BlockHandle>*) (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==692==    by 0x9342C269: duckdb::ColumnSegment::CreateTransientSegment(duckdb::DatabaseInstance&, duckdb::LogicalType const&, unsigned long, unsigned long) (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==692==    by 0x93379B43: duckdb::unique_ptr<duckdb::CompressionState, std::default_delete<duckdb::CompressionState>, true> duckdb::AlpInitCompression<double>(duckdb::ColumnDataCheckpointer&, duckdb::unique_ptr<duckdb::AnalyzeState, std::default_delete<duckdb::AnalyzeState>, true>) (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==692==    by 0x9343AAAB: duckdb::ColumnDataCheckpointer::WriteToDisk() (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==692==    by 0x9343C8F4: duckdb::ColumnData::Checkpoint(duckdb::RowGroup&, duckdb::PartialBlockManager&, duckdb::ColumnCheckpointInfo&) (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==692==    by 0x9343CEB3: duckdb::StandardColumnData::Checkpoint(duckdb::RowGroup&, duckdb::PartialBlockManager&, duckdb::ColumnCheckpointInfo&) (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==692== 

krlmlr avatar Apr 11 '24 17:04 krlmlr

Another one, very similar:

> duckdb::dbAppendTable(con, "test_list", test_list_tbl)
Error: C stack usage  134847741652 is too close to the limit
Error: rapi_execute: Failed to run query
Error: Invalid Error: std::exception
Execution halted
==824== Syscall param pwrite64(buf) points to uninitialised byte(s)
==824==    at 0x4DD883F: __libc_pwrite64 (pwrite64.c:25)
==824==    by 0x4DD883F: pwrite (pwrite64.c:23)
==824==    by 0x925A96D0: duckdb::LocalFileSystem::Write(duckdb::FileHandle&, void*, long, unsigned long) (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==824==    by 0x9334A773: duckdb::SingleFileBlockManager::ChecksumAndWrite(duckdb::FileBuffer&, unsigned long) const (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==824==    by 0x93373080: duckdb::BlockManager::ConvertToPersistent(long, std::shared_ptr<duckdb::BlockHandle>) (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==824==    by 0x934269F2: duckdb::ColumnSegment::ConvertToPersistent(duckdb::optional_ptr<duckdb::BlockManager>, long) (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==824==    by 0x93430F04: duckdb::PartialBlockForCheckpoint::Flush(unsigned long) (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==824==    by 0x9334F776: duckdb::PartialBlockManager::FlushPartialBlocks() (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==824==    by 0x9336536A: duckdb::SingleFileCheckpointWriter::CreateCheckpoint() (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==824==    by 0x93365661: duckdb::SingleFileStorageManager::CreateCheckpoint(bool, bool) (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==824==    by 0x930E73C5: duckdb::AttachedDatabase::Close() (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==824==    by 0x930E7577: duckdb::DatabaseManager::ResetDatabases(duckdb::unique_ptr<duckdb::TaskScheduler, std::default_delete<duckdb::TaskScheduler>, true>&) (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==824==    by 0x931021B8: duckdb::DatabaseInstance::~DatabaseInstance() (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==824==  Address 0x9fab7f31 is 1 bytes inside a block of size 262,144 alloc'd
==824==    at 0x4848899: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==824==    by 0x9259DB4B: duckdb::Allocator::AllocateData(unsigned long) (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==824==    by 0x9259DC5F: duckdb::FileBuffer::ReallocBuffer(unsigned long) (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==824==    by 0x9259DC90: duckdb::FileBuffer::Resize(unsigned long) (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==824==    by 0x93345A44: duckdb::StandardBufferManager::ConstructManagedBuffer(unsigned long, duckdb::unique_ptr<duckdb::FileBuffer, std::default_delete<duckdb::FileBuffer>, true>&&, duckdb::FileBufferType) (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==824==    by 0x93358E6E: duckdb::StandardBufferManager::RegisterMemory(duckdb::MemoryTag, unsigned long, bool) (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==824==    by 0x933593C5: duckdb::StandardBufferManager::Allocate(duckdb::MemoryTag, unsigned long, bool, std::shared_ptr<duckdb::BlockHandle>*) (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==824==    by 0x9342C269: duckdb::ColumnSegment::CreateTransientSegment(duckdb::DatabaseInstance&, duckdb::LogicalType const&, unsigned long, unsigned long) (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==824==    by 0x93379B43: duckdb::unique_ptr<duckdb::CompressionState, std::default_delete<duckdb::CompressionState>, true> duckdb::AlpInitCompression<double>(duckdb::ColumnDataCheckpointer&, duckdb::unique_ptr<duckdb::AnalyzeState, std::default_delete<duckdb::AnalyzeState>, true>) (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==824==    by 0x9343AAAB: duckdb::ColumnDataCheckpointer::WriteToDisk() (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==824==    by 0x9343C8F4: duckdb::ColumnData::Checkpoint(duckdb::RowGroup&, duckdb::PartialBlockManager&, duckdb::ColumnCheckpointInfo&) (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==824==    by 0x9343CEB3: duckdb::StandardColumnData::Checkpoint(duckdb::RowGroup&, duckdb::PartialBlockManager&, duckdb::ColumnCheckpointInfo&) (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==824== 

@hannes: Have you recently seen undefined behavior with persisting nested data? I'll check the latest and greatest too.

krlmlr avatar Apr 11 '24 17:04 krlmlr

Still an issue with https://github.com/duckdb/duckdb-r/commit/953e7d320c6678109cedb50c721f19c264a839c1 (aka https://github.com/duckdb/duckdb/commit/e5f45238740e8c8fef3d22ee61f6b7faf950c05d), installed from r-universe:

> duckdb::dbAppendTable(con, "test_list_2", test_list_2_tbl)
Error: C stack usage  134807170756 is too close to the limit
Error: rapi_execute: Failed to run query
Error: Invalid Error: std::exception
Execution halted
==948== Syscall param pwrite64(buf) points to uninitialised byte(s)
==948==    at 0x4DD883F: __libc_pwrite64 (pwrite64.c:25)
==948==    by 0x4DD883F: pwrite (pwrite64.c:23)
==948==    by 0x9248CB00: duckdb::LocalFileSystem::Write(duckdb::FileHandle&, void*, long, unsigned long) (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==948==    by 0x931EACA4: duckdb::BlockManager::ConvertToPersistent(long, std::shared_ptr<duckdb::BlockHandle>) (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==948==    by 0x9329D311: duckdb::ColumnSegment::ConvertToPersistent(duckdb::optional_ptr<duckdb::BlockManager>, long) (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==948==    by 0x932A6726: duckdb::PartialBlockForCheckpoint::Flush(unsigned long) (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==948==    by 0x931C5F06: duckdb::PartialBlockManager::FlushPartialBlocks() (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==948==    by 0x931DE01D: duckdb::SingleFileCheckpointWriter::CreateCheckpoint() (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==948==    by 0x931DE44F: duckdb::SingleFileStorageManager::CreateCheckpoint(bool, bool) (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==948==    by 0x92F6FD85: duckdb::AttachedDatabase::Close() (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==948==    by 0x92F702F7: duckdb::DatabaseManager::ResetDatabases(duckdb::unique_ptr<duckdb::TaskScheduler, std::default_delete<duckdb::TaskScheduler>, true>&) (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==948==    by 0x92F703C4: duckdb::DatabaseInstance::~DatabaseInstance() (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==948==    by 0x923CF739: std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==948==  Address 0xa8524f31 is 1 bytes inside a block of size 262,144 alloc'd
==948==    at 0x4848899: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==948==    by 0x924848F5: duckdb::Allocator::AllocateData(unsigned long) (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==948==    by 0x924849FF: duckdb::FileBuffer::ReallocBuffer(unsigned long) (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==948==    by 0x92484A30: duckdb::FileBuffer::Resize(unsigned long) (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==948==    by 0x931C646C: duckdb::StandardBufferManager::ConstructManagedBuffer(unsigned long, duckdb::unique_ptr<duckdb::FileBuffer, std::default_delete<duckdb::FileBuffer>, true>&&, duckdb::FileBufferType) (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==948==    by 0x931D7E11: duckdb::StandardBufferManager::RegisterMemory(duckdb::MemoryTag, unsigned long, bool) (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==948==    by 0x931D7FF5: duckdb::StandardBufferManager::Allocate(duckdb::MemoryTag, unsigned long, bool, std::shared_ptr<duckdb::BlockHandle>*) (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==948==    by 0x932A1F71: duckdb::ColumnSegment::CreateTransientSegment(duckdb::DatabaseInstance&, duckdb::LogicalType const&, unsigned long, unsigned long) (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==948==    by 0x931F2371: duckdb::AlpCompressionState<double>::CreateEmptySegment(unsigned long) (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==948==    by 0x931F26A4: duckdb::unique_ptr<duckdb::CompressionState, std::default_delete<duckdb::CompressionState>, true> duckdb::AlpInitCompression<double>(duckdb::ColumnDataCheckpointer&, duckdb::unique_ptr<duckdb::AnalyzeState, std::default_delete<duckdb::AnalyzeState>, true>) (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==948==    by 0x932AB186: duckdb::ColumnDataCheckpointer::WriteToDisk() (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==948==    by 0x932AC21C: duckdb::ColumnData::Checkpoint(duckdb::RowGroup&, duckdb::PartialBlockManager&, duckdb::ColumnCheckpointInfo&) (in /usr/local/lib/R/site-library/duckdb/libs/duckdb.so)
==948== 

krlmlr avatar Apr 11 '24 17:04 krlmlr

Can't repro with the CLI:

CREATE TABLE test_list_2 (a integer, b STRUCT(c VARCHAR[], d VARCHAR[], e INTEGER[]));
INSERT INTO test_list_2 VALUES (1, row(['a', 'b', 'c', 'd', 'e', 'f'], ['A', 'B'], [1, 5, 9]));
SELECT * FROM test_list_2;

krlmlr avatar Apr 11 '24 17:04 krlmlr

But with valgrind :

==25532== Syscall param pwrite64(buf) points to uninitialised byte(s)
==25532==    at 0x4CB083F: __libc_pwrite64 (pwrite64.c:25)
==25532==    by 0x4CB083F: pwrite (pwrite64.c:23)
==25532==    by 0x8ABF00: duckdb::LocalFileSystem::Write(duckdb::FileHandle&, void*, long, unsigned long) (in /duckdb/duckdb)
==25532==    by 0xB578C8: duckdb::BlockManager::ConvertToPersistent(long, std::shared_ptr<duckdb::BlockHandle>) (in /duckdb/duckdb)
==25532==    by 0xC1840F: duckdb::ColumnSegment::ConvertToPersistent(duckdb::optional_ptr<duckdb::BlockManager>, long) (in /duckdb/duckdb)
==25532==    by 0xC2583C: duckdb::PartialBlockForCheckpoint::Flush(unsigned long) (in /duckdb/duckdb)
==25532==    by 0xC57316: duckdb::PartialBlockManager::FlushPartialBlocks() (in /duckdb/duckdb)
==25532==    by 0xC6F9B2: duckdb::SingleFileCheckpointWriter::CreateCheckpoint() (in /duckdb/duckdb)
==25532==    by 0xC6FC58: duckdb::SingleFileStorageManager::CreateCheckpoint(bool, bool) (in /duckdb/duckdb)
==25532==    by 0xAF755E: duckdb::AttachedDatabase::Close() (in /duckdb/duckdb)
==25532==    by 0xAF7AC7: duckdb::DatabaseManager::ResetDatabases(duckdb::unique_ptr<duckdb::TaskScheduler, std::default_delete<duckdb::TaskScheduler>, true>&) (in /duckdb/duckdb)
==25532==    by 0xAF7B64: duckdb::DatabaseInstance::~DatabaseInstance() (in /duckdb/duckdb)
==25532==    by 0xAEB399: duckdb::DuckDB::~DuckDB() (in /duckdb/duckdb)
==25532==  Address 0x5c008c0 is in a rw- anonymous segment

I'll open an issue upstream.

krlmlr avatar Apr 11 '24 17:04 krlmlr

@TSchiefer: can you confirm the current development version (installable as a binary from r-universe) fixes the issue?

krlmlr avatar Apr 16 '24 16:04 krlmlr

I installed package version 0.10.1.9001 from r-universe today (I see that version bump is from 3 weeks ago), but it was installed from source, not entirely sure that it's the correct version & error is unfortunately still occurring. I then proceeded to install from source from GitHub directly, but the test script is still failing, albeit it seems that it consistently works until a later step in the script than before.

TSchiefer avatar Apr 17 '24 11:04 TSchiefer

it might be intentional, AFAIK we don't zero blocks for performance reasons and if there's garbage at the end it just gets written to disk. But need to confirm.

hannes avatar Apr 18 '24 10:04 hannes

@TSchiefer: can you please share a reproducible example for the issues you are seeing now?

krlmlr avatar Apr 24 '24 05:04 krlmlr

Thanks, I received an example.

The original test still fails with v0.10.3, testing against bleeding-edge duckdb now.

krlmlr avatar Jun 02 '24 20:06 krlmlr

I can confirm that even the original issue still persists after vendoring the most recent duckdb (duckdb/duckdb@c3903e44e0d70f4fae6d9c068c0792ebc62b882f). I'll bring this upstream again.

krlmlr avatar Jun 04 '24 05:06 krlmlr