Stringread example throws std::bad_alloc

Open yuqi-ali opened this issue 2 years ago • 19 comments

stringread]# fletchgen -r names.rb -s memory.srec -l vhdl --sim
[INFO ]: Loading RecordBatch(es) from names.rb
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc

yuqi-ali avatar Mar 14 '22 08:03 yuqi-ali

I'm not able to reproduce this.

Could you please run this in gdb and post a backtrace?

johanpel avatar Mar 14 '22 09:03 johanpel

The example seems to run in CI (on #284): https://github.com/abs-tudelft/fletcher/runs/5535140660?check_suite_focus=true#step:7:12

mbrobbel avatar Mar 14 '22 09:03 mbrobbel

I'm not familiar with C++. I tested it on CentOS 7 with Arrow 7.0.

yuqi-ali avatar Mar 16 '22 03:03 yuqi-ali

Starting program: /usr/local/bin/fletchgen -r names.rb -s memory.srec -l vhdl --axi
warning: File "/usr/lib64/libstdc++.so.6.0.27-gdb.py" auto-loading has been declined by your `auto-load safe-path' set to "$debugdir:$datadir/auto-load:/usr/bin/mono-gdb.py".
To enable execution of this file add
add-auto-load-safe-path /usr/lib64/libstdc++.so.6.0.27-gdb.py
line to your configuration file "/root/.gdbinit".
To completely disable this security protection add
set auto-load safe-path /
line to your configuration file "/root/.gdbinit".
For more information about this security protection see the "Auto-loading safe path" section in the GDB manual. E.g., run from the shell:
info "(gdb)Auto-loading safe path"
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
[New Thread 0x7ffff19ff700 (LWP 520)]
[INFO ]: Loading RecordBatch(es) from names.rb
[New Thread 0x7ffff11fe700 (LWP 521)]
[New Thread 0x7ffff09fd700 (LWP 522)]
[New Thread 0x7fffebbff700 (LWP 524)]
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc

Program received signal SIGABRT, Aborted.
0x00007ffff5102387 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install arrow-libs-7.0.0-1.el7.x86_64 brotli-1.0.7-5.el7.x86_64 bzip2-libs-1.0.6-13.el7.x86_64 glibc-2.17-325.el7_9.x86_64 keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.15.1-51.el7_9.x86_64 libcom_err-1.42.9-19.el7.x86_64 libselinux-2.5-15.el7.x86_64 libzstd-1.5.2-1.el7.x86_64 lz4-1.8.3-1.el7.x86_64 openssl-libs-1.0.2k-24.el7_9.x86_64 pcre-8.32-17.el7.x86_64 snappy-1.1.0-3.el7.x86_64 zlib-1.2.7-19.el7_9.x86_64
(gdb) bt
#0 0x00007ffff5102387 in raise () from /lib64/libc.so.6
#1 0x00007ffff5103a78 in abort () from /lib64/libc.so.6
#2 0x00007ffff5c69823 in __gnu_cxx::__verbose_terminate_handler () at ../../.././libstdc++-v3/libsupc++/vterminate.cc:95
#3 0x00007ffff5c75446 in __cxxabiv1::__terminate(void ()()) () at ../../.././libstdc++-v3/libsupc++/eh_terminate.cc:47
#4 0x00007ffff5c75491 in std::terminate () at ../../.././libstdc++-v3/libsupc++/eh_terminate.cc:57
#5 0x00007ffff5c756c4 in __cxxabiv1::__cxa_throw (obj=, tinfo=0x7ffff5f9afa0 , dest=0x7ffff5c73bd0 std::bad_alloc::~bad_alloc()) at ../../.././libstdc++-v3/libsupc++/eh_throw.cc:95
#6 0x00007ffff5c694be in operator new (sz=18446744073709551608) at ../../.././libstdc++-v3/libsupc++/new_op.cc:54
#7 0x00007ffff66f86b7 in void std::vector<std::string, std::allocatorstd::string >::_M_emplace_back_auxstd::string(std::string&&) () from /lib64/libarrow.so.700
#8 0x00007ffff683d242 in arrow::KeyValueMetadata::Append(std::string, std::string) () from /lib64/libarrow.so.700
#9 0x00007ffff733f6e5 in arrow::ipc::internal::GetKeyValueMetadata(flatbuffers::Vector<flatbuffers::Offsetorg::apache::arrow::flatbuf::KeyValue > const, std::shared_ptrarrow::KeyValueMetadata) () from /lib64/libarrow.so.700
#10 0x00007ffff7346e09 in arrow::ipc::internal::(anonymous namespace)::FieldFromFlatbuffer(org::apache::arrow::flatbuf::Field const, arrow::ipc::internal::FieldPosition, arrow::ipc::DictionaryMemo*, std::shared_ptrarrow::Field) () from /lib64/libarrow.so.700
#11 0x00007ffff7347c70 in arrow::ipc::internal::GetSchema(void const, arrow::ipc::DictionaryMemo*, std::shared_ptrarrow::Schema) () from /lib64/libarrow.so.700
#12 0x00007ffff735615c in arrow::ipc::UnpackSchemaMessage(void const, arrow::ipc::IpcReadOptions const&, arrow::ipc::DictionaryMemo*, std::shared_ptrarrow::Schema, std::shared_ptrarrow::Schema, std::vector<bool, std::allocator >, bool) () from /lib64/libarrow.so.700
#13 0x00007ffff736bf39 in arrow::ipc::RecordBatchFileReaderImpl::Open(arrow::io::RandomAccessFile*, long, arrow::ipc::IpcReadOptions const&) () from /lib64/libarrow.so.700
#14 0x00007ffff7357230 in arrow::ipc::RecordBatchFileReader::Open(std::shared_ptrarrow::io::RandomAccessFile const&, long, arrow::ipc::IpcReadOptions const&) () from /lib64/libarrow.so.700
#15 0x00007ffff7357454 in arrow::ipc::RecordBatchFileReader::Open(std::shared_ptrarrow::io::RandomAccessFile const&, arrow::ipc::IpcReadOptions const&) () from /lib64/libarrow.so.700
#16 0x000000000057ed6c in fletcher::ReadRecordBatchesFromFile(std::string const&, std::vector<std::shared_ptrarrow::RecordBatch, std::allocator<std::shared_ptrarrow::RecordBatch > >*) ()
#17 0x0000000000452b1b in fletchgen::Options::LoadRecordBatches (this=0xa059c0) at /root/fletcher_Gen/fletcher/codegen/cpp/fletchgen/src/fletchgen/options.cc:162
#18 0x00000000004c85b3 in fletchgen::fletchgen (argc=8, argv=0x7fffffffe0f8) at /root/fletcher_Gen/fletcher/codegen/cpp/fletchgen/src/fletchgen/fletchgen.cc:63
#19 0x0000000000413792 in main (argc=8, argv=0x7fffffffe0f8) at /root/fletcher_Gen/fletcher/codegen/cpp/fletchgen/src/fletchgen/main.cc:18
(gdb)

yuqi-ali avatar Mar 16 '22 03:03 yuqi-ali

Could you try to use the names.rb file from this branch and see if that fixes the issue? https://github.com/abs-tudelft/fletcher/tree/bad_alloc Thanks!

johanpel avatar Mar 16 '22 10:03 johanpel

It does not work:
[INFO ]: Loading RecordBatch(es) from names.rb
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc

yuqi-ali avatar Mar 17 '22 01:03 yuqi-ali

My GCC version is 8.3.1.

yuqi-ali avatar Mar 17 '22 01:03 yuqi-ali

From the backtrace I see that somewhere deep down in the Arrow code it's trying to allocate a very large array:

0x00007ffff5c694be in operator new (sz=18446744073709551608) at ../../.././libstdc++-v3/libsupc++/new_op.cc:54

This leads me to believe that the recordbatch file is somehow corrupt.

Does this problem persist when supplying Fletchgen with other recordbatches as well?
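A side note on that size, offered as one possible reading rather than a confirmed diagnosis: 18446744073709551608 is exactly 2^64 - 8, which is the value -8 takes when converted to a 64-bit unsigned size. That might hint at a negative or garbage length reaching the allocation rather than a genuinely huge request. The sketch below only demonstrates the arithmetic; `AsSigned`, `AsUnsigned`, `kReportedSize` and `CheckReportedSize` are hypothetical names and are not part of Arrow or Fletcher.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper (not part of Arrow or Fletcher): interpret a 64-bit
// allocation size as a signed value without relying on an
// implementation-defined unsigned-to-signed conversion.
constexpr std::int64_t AsSigned(std::uint64_t size) {
  return size <= static_cast<std::uint64_t>(
                     std::numeric_limits<std::int64_t>::max())
             ? static_cast<std::int64_t>(size)
             // Distance below 2^64, negated: 2^64 - 8 becomes -8.
             : -static_cast<std::int64_t>(
                   std::numeric_limits<std::uint64_t>::max() - size) - 1;
}

// Converting a negative signed value to an unsigned type wraps modulo 2^64.
constexpr std::uint64_t AsUnsigned(std::int64_t value) {
  return static_cast<std::uint64_t>(value);
}

// Size reported in frame #6 of the backtrace.
constexpr std::uint64_t kReportedSize = 18446744073709551608ULL;

// Checks that the reported size and -8 are the same 64-bit pattern.
inline void CheckReportedSize() {
  assert(AsSigned(kReportedSize) == -8);
  assert(AsUnsigned(-8) == kReportedSize);
}
```

Calling `CheckReportedSize()` from a small test program passes both assertions. It says nothing about where such a value would come from inside Arrow.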

johanpel avatar Mar 17 '22 16:03 johanpel

I tested the Hobbits.rb file and got the same error:
fletchgen -r Hobbits.rb -s memory.srec -l vhdl --axi
[INFO ]: Loading RecordBatch(es) from Hobbits.rb
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc

yuqi-ali avatar Mar 18 '22 01:03 yuqi-ali

If you add this line to common/cpp/src/fletcher/arrow-utils.cc:244

  std::cout << file->Read(file->GetSize().ValueOrDie()).ValueOrDie()->ToHexString() << std::endl;

What is being printed?

johanpel avatar Mar 18 '22 15:03 johanpel

It prints nothing

yuqi-ali avatar Mar 21 '22 05:03 yuqi-ali

That is weird. If the file were empty, I would expect the following error:

[ERROR]: Could not open RecordBatchFileReader. ARROW:[Invalid: File is too small: 0]

Just to be sure, did you recompile after adding the line?

johanpel avatar Mar 21 '22 08:03 johanpel

It seems the error arises inside arrow::ipc::RecordBatchFileReader::Open(file), before it returns file_result.

yuqi-ali avatar Mar 21 '22 11:03 yuqi-ali

Sorry, I had the wrong line number there. Can you plug it in on line 236, just after:

  std::shared_ptr<arrow::io::ReadableFile> file = result.ValueOrDie();

johanpel avatar Mar 21 '22 11:03 johanpel

The Print Result is: 4152524F57310000240100001000000000000A000E000600050008000A000000000103001000000000000A000C000000040008000A000000700000000400000002000000340000000400000068FFFFFF08000000180000000D000000666C6574636865725F6D6F646500000004000000726561640000000094FFFFFF08000000180000000D000000666C6574636865725F6E616D650000000A000000537472696E6752656164000001000000180000000000120018000800000007000C000000100014001200000000000005540000004C0000004000000004000000010000000C00000008000C00040008000800000008000000180000000C000000666C6574636865725F657063000000000100000034000000000000000400040004000000040000004E616D6500000000000000009C00000014000000000000000C0016000600050008000C000C0000000003030018000000F80000000000000000000A0018000C00040008000A0000004C000000100000001A00000000000000000000000300000000000000000000000000000000000000000000000000000070000000000000007000000000000000880000000000000000000000010000001A000000000000000000000000000000000000000000000005000000080000000D00000012000000150000001A0000001F000000240000002A0000002E000000330000003A0000003E00000042000000480000004D00000052000000580000005D00000063000000660000006C00000071000000770000007E0000008500000000000000416C696365426F624361726F6C44617669644576654672616E6B4772616365486172727949736F6C64654A61636B4B6172656E4C656F6E6172644D6172794E69636B4F6C6976696150657465725175696E6E526F626572745361726168547261766973556D61566963746F7257656E64795861766965725961736D696E655A616368617279000000100000000C001400060008000C0010000C00000000000300400000002800000004000000010000003001000000000000A000000000000000F80000000000000000000000000000000000000000000A000C000000040008000A000000700000000400000002000000340000000400000068FFFFFF08000000180000000D000000666C6574636865725F6D6F646500000004000000726561640000000094FFFFFF08000000180000000D000000666C6574636865725F6E616D650000000A000000537472696E6752656164000001000000180000000000120018000800000007000C000000100014001200000000000005540000004C0000004000000004000000010000000C00000008000C000400080008000000080
00000180000000C000000666C6574636865725F657063000000000100000034000000000000000400040004000000040000004E616D6500000000500100004152524F5731

yuqi-ali avatar Mar 21 '22 12:03 yuqi-ali

Alright, thanks. The file looks to be loaded properly there...

Could you please describe how you've built and/or installed Arrow?

Thanks.
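On the hex dump above: an Arrow IPC file starts and ends with the magic string "ARROW1" (hex 4152524F5731), and the four bytes just before the trailing magic hold the footer length as a little-endian 32-bit integer. A rough sketch of checking a `ToHexString()` dump against that layout follows. The helper names (`HexDigit`, `HasArrowFileMagic`, `FooterLength`) are made up for illustration, and the code assumes an upper-case hex string with no separators, as in the dump.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// "ARROW1" encoded as upper-case hex, as it appears in the dump.
const std::string kMagicHex = "4152524F5731";

// Hypothetical helper: value of one hex digit, or -1 if it is not a hex digit.
int HexDigit(char c) {
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'A' && c <= 'F') return c - 'A' + 10;
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  return -1;
}

// Hypothetical helper: true if the hex dump starts and ends with the magic.
bool HasArrowFileMagic(const std::string& hex) {
  const std::size_t n = kMagicHex.size();
  if (hex.size() < 2 * n) return false;
  return hex.compare(0, n, kMagicHex) == 0 &&
         hex.compare(hex.size() - n, n, kMagicHex) == 0;
}

// Hypothetical helper: decode the little-endian 32-bit footer length stored
// just before the trailing magic. Returns -1 if the dump is too short or
// contains a non-hex character in that position.
std::int64_t FooterLength(const std::string& hex) {
  const std::size_t tail = kMagicHex.size() + 8;  // 8 hex digits = 4 bytes
  if (hex.size() < tail) return -1;
  const std::size_t start = hex.size() - tail;
  std::uint32_t value = 0;
  // Walk from the most significant byte (last in the file) to the least.
  for (int byte = 3; byte >= 0; --byte) {
    const int hi = HexDigit(hex[start + 2 * byte]);
    const int lo = HexDigit(hex[start + 2 * byte + 1]);
    if (hi < 0 || lo < 0) return -1;
    value = (value << 8) | static_cast<std::uint32_t>(hi * 16 + lo);
  }
  return static_cast<std::int64_t>(value);
}
```

The dump begins with 4152524F5731 and ends with 50010000 followed by 4152524F5731, so `HasArrowFileMagic` would return true and `FooterLength` would return 336 (0x150) for it. That is consistent with the file being read intact, but it does not validate the schema or metadata inside the footer.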

johanpel avatar Mar 21 '22 14:03 johanpel

I tested it on CentOS 7. I installed Arrow this way:

sudo yum install -y epel-release || sudo yum install -y https://dl.fedoraproject.org/pub/epel/epel-release-latest-$(cut -d: -f5 /etc/system-release-cpe | cut -d. -f1).noarch.rpm
sudo yum install -y https://apache.jfrog.io/artifactory/arrow/centos/$(cut -d: -f5 /etc/system-release-cpe | cut -d. -f1)/apache-arrow-release-latest.rpm
sudo yum install -y --enablerepo=epel arrow-devel # For C++
sudo yum install -y --enablerepo=epel arrow-glib-devel # For GLib (C)
sudo yum install -y --enablerepo=epel arrow-dataset-devel # For Apache Arrow Dataset C++
sudo yum install -y --enablerepo=epel parquet-devel # For Apache Parquet C++
sudo yum install -y --enablerepo=epel parquet-glib-devel # For Apache Parquet GLib (C)

This follows the instructions at https://arrow.apache.org/install/

yuqi-ali avatar Mar 22 '22 01:03 yuqi-ali

I have run into the same problem.

yaoye-ali avatar Mar 22 '22 09:03 yaoye-ali

I'm afraid that I can't do much more than this without being able to reproduce the issue myself.

If I were to be able to reproduce this, I would go down the backtrace in a debugger and try to verify that all variables involved in loading the file have the right values.

If that's not something you can do, perhaps it would be possible to set up a Docker image mimicking your environment, and see if you can reproduce it there? If that's the case, you can pass me the image and I can take a look.

johanpel avatar Mar 25 '22 11:03 johanpel