fletcher
Stringread example throws std::bad_alloc
stringread]# fletchgen -r names.rb -s memory.srec -l vhdl --sim
[INFO ]: Loading RecordBatch(es) from names.rb
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc
I'm not able to reproduce this.
Could you please run this in gdb and post a backtrace?
The example seems to run in CI (on #284): https://github.com/abs-tudelft/fletcher/runs/5535140660?check_suite_focus=true#step:7:12
I'm not familiar with C++. I tested it on CentOS 7 with Arrow 7.0.
Starting program: /usr/local/bin/fletchgen -r names.rb -s memory.srec -l vhdl --axi
warning: File "/usr/lib64/libstdc++.so.6.0.27-gdb.py" auto-loading has been declined by your `auto-load safe-path' set to "$debugdir:$datadir/auto-load:/usr/bin/mono-gdb.py".
To enable execution of this file add
add-auto-load-safe-path /usr/lib64/libstdc++.so.6.0.27-gdb.py
line to your configuration file "/root/.gdbinit".
To completely disable this security protection add
set auto-load safe-path /
line to your configuration file "/root/.gdbinit".
For more information about this security protection see the
"Auto-loading safe path" section in the GDB manual. E.g., run from the shell:
info "(gdb)Auto-loading safe path"
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
[New Thread 0x7ffff19ff700 (LWP 520)]
[INFO ]: Loading RecordBatch(es) from names.rb
[New Thread 0x7ffff11fe700 (LWP 521)]
[New Thread 0x7ffff09fd700 (LWP 522)]
[New Thread 0x7fffebbff700 (LWP 524)]
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc
Program received signal SIGABRT, Aborted.
0x00007ffff5102387 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install arrow-libs-7.0.0-1.el7.x86_64 brotli-1.0.7-5.el7.x86_64 bzip2-libs-1.0.6-13.el7.x86_64 glibc-2.17-325.el7_9.x86_64 keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.15.1-51.el7_9.x86_64 libcom_err-1.42.9-19.el7.x86_64 libselinux-2.5-15.el7.x86_64 libzstd-1.5.2-1.el7.x86_64 lz4-1.8.3-1.el7.x86_64 openssl-libs-1.0.2k-24.el7_9.x86_64 pcre-8.32-17.el7.x86_64 snappy-1.1.0-3.el7.x86_64 zlib-1.2.7-19.el7_9.x86_64
(gdb) bt
#0 0x00007ffff5102387 in raise () from /lib64/libc.so.6
#1 0x00007ffff5103a78 in abort () from /lib64/libc.so.6
#2 0x00007ffff5c69823 in __gnu_cxx::__verbose_terminate_handler () at ../../.././libstdc++-v3/libsupc++/vterminate.cc:95
#3 0x00007ffff5c75446 in __cxxabiv1::__terminate(void (*)()) () at ../../.././libstdc++-v3/libsupc++/eh_terminate.cc:47
#4 0x00007ffff5c75491 in std::terminate () at ../../.././libstdc++-v3/libsupc++/eh_terminate.cc:57
#5 0x00007ffff5c756c4 in __cxxabiv1::__cxa_throw (obj=
Could you try to use the names.rb file from this branch and see if that fixes the issue?
https://github.com/abs-tudelft/fletcher/tree/bad_alloc
Thanks!
It does not work:
[INFO ]: Loading RecordBatch(es) from names.rb
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc
My GCC version is 8.3.1.
From the backtrace I see that somewhere deep down in the Arrow code it's trying to allocate a very large array:
0x00007ffff5c694be in operator new (sz=18446744073709551608) at ../../.././libstdc++-v3/libsupc++/new_op.cc:54
This leads me to believe that the recordbatch file is somehow corrupt.
Does this problem persist when supplying Fletchgen with other recordbatches as well?
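For what it's worth, the requested size 18446744073709551608 is exactly 2^64 - 8, which is the value a signed 64-bit -8 takes when it is converted to an unsigned size. Below is a minimal sketch of that arithmetic. The helper name as_size is hypothetical and not part of Arrow or Fletcher, and whether a negative length is really what ends up in operator new here is an assumption, not something the backtrace confirms.

```cpp
#include <cstdint>

// Hypothetical helper (not from Arrow or Fletcher): the unsigned 64-bit
// value that a signed length becomes when it is converted to a size.
// Signed-to-unsigned conversion is well-defined and wraps modulo 2^64.
constexpr std::uint64_t as_size(std::int64_t length) {
  return static_cast<std::uint64_t>(length);
}

// The size reported in the backtrace is what -8 converts to.
static_assert(as_size(-8) == 18446744073709551608ULL,
              "-8 wraps to 2^64 - 8");
```

If that assumption holds, it would point at a length or offset being misread rather than at a genuinely huge batch, which would fit a corrupt file but also a mismatch in how the file is parsed.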
I tested the Hobbits.rb file and get the same error:
fletchgen -r Hobbits.rb -s memory.srec -l vhdl --axi
[INFO ]: Loading RecordBatch(es) from Hobbits.rb
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc
If you add this line to common/cpp/src/fletcher/arrow-utils.cc:244
std::cout << file->Read(file->GetSize().ValueOrDie()).ValueOrDie()->ToHexString() << std::endl;
What is being printed?
It prints nothing
That is weird. If the file were empty, I would expect the following error:
[ERROR]: Could not open RecordBatchFileReader. ARROW:[Invalid: File is too small: 0]
Just to be sure, did you recompile after adding the line?
It seems the error arises inside arrow::ipc::RecordBatchFileReader::Open(file), before it returns file_result.
Sorry, I had the wrong line number there. Can you plug it in on line 236, just after:
std::shared_ptr<arrow::io::ReadableFile> file = result.ValueOrDie();
The Print Result is: 4152524F57310000240100001000000000000A000E000600050008000A000000000103001000000000000A000C000000040008000A000000700000000400000002000000340000000400000068FFFFFF08000000180000000D000000666C6574636865725F6D6F646500000004000000726561640000000094FFFFFF08000000180000000D000000666C6574636865725F6E616D650000000A000000537472696E6752656164000001000000180000000000120018000800000007000C000000100014001200000000000005540000004C0000004000000004000000010000000C00000008000C00040008000800000008000000180000000C000000666C6574636865725F657063000000000100000034000000000000000400040004000000040000004E616D6500000000000000009C00000014000000000000000C0016000600050008000C000C0000000003030018000000F80000000000000000000A0018000C00040008000A0000004C000000100000001A00000000000000000000000300000000000000000000000000000000000000000000000000000070000000000000007000000000000000880000000000000000000000010000001A000000000000000000000000000000000000000000000005000000080000000D00000012000000150000001A0000001F000000240000002A0000002E000000330000003A0000003E00000042000000480000004D00000052000000580000005D00000063000000660000006C00000071000000770000007E0000008500000000000000416C696365426F624361726F6C44617669644576654672616E6B4772616365486172727949736F6C64654A61636B4B6172656E4C656F6E6172644D6172794E69636B4F6C6976696150657465725175696E6E526F626572745361726168547261766973556D61566963746F7257656E64795861766965725961736D696E655A616368617279000000100000000C001400060008000C0010000C00000000000300400000002800000004000000010000003001000000000000A000000000000000F80000000000000000000000000000000000000000000A000C000000040008000A000000700000000400000002000000340000000400000068FFFFFF08000000180000000D000000666C6574636865725F6D6F646500000004000000726561640000000094FFFFFF08000000180000000D000000666C6574636865725F6E616D650000000A000000537472696E6752656164000001000000180000000000120018000800000007000C000000100014001200000000000005540000004C0000004000000004000000010000000C00000008000C000400080008000000080
00000180000000C000000666C6574636865725F657063000000000100000034000000000000000400040004000000040000004E616D6500000000500100004152524F5731
Alright, thanks. The file looks to be loaded properly there...
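One thing that can be read straight from the dump: it begins and ends with 4152524F5731, which is ASCII for ARROW1, the magic string the Arrow IPC file format places at the start and end of a file. A rough sketch of that check is below. The function has_arrow_magic is a hypothetical helper, not an Arrow or Fletcher API, and it only checks the framing, not that the schema or footer in between are valid.

```cpp
#include <string>

// Hypothetical helper (not part of Arrow or Fletcher).
// Returns true if the buffer starts and ends with the 6-byte "ARROW1" magic.
inline bool has_arrow_magic(const std::string& bytes) {
  static const std::string kMagic = "ARROW1";
  // Require room for a leading and a trailing magic that do not overlap.
  if (bytes.size() < 2 * kMagic.size()) return false;
  const bool head = bytes.compare(0, kMagic.size(), kMagic) == 0;
  const bool tail =
      bytes.compare(bytes.size() - kMagic.size(), kMagic.size(), kMagic) == 0;
  return head && tail;
}
```

Passing the raw file contents (for example read into a std::string in binary mode) should return true for the dump above, which matches the file being read intact at this point.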
Could you please describe how you've built and/or installed Arrow?
Thanks.
I tested it on CentOS 7. I installed Arrow this way, as described at https://arrow.apache.org/install/:
sudo yum install -y epel-release || sudo yum install -y https://dl.fedoraproject.org/pub/epel/epel-release-latest-$(cut -d: -f5 /etc/system-release-cpe | cut -d. -f1).noarch.rpm
sudo yum install -y https://apache.jfrog.io/artifactory/arrow/centos/$(cut -d: -f5 /etc/system-release-cpe | cut -d. -f1)/apache-arrow-release-latest.rpm
sudo yum install -y --enablerepo=epel arrow-devel # For C++
sudo yum install -y --enablerepo=epel arrow-glib-devel # For GLib (C)
sudo yum install -y --enablerepo=epel arrow-dataset-devel # For Apache Arrow Dataset C++
sudo yum install -y --enablerepo=epel parquet-devel # For Apache Parquet C++
sudo yum install -y --enablerepo=epel parquet-glib-devel # For Apache Parquet GLib (C)
I am running into the same problem.
I'm afraid that I can't do much more than this without being able to reproduce the issue myself.
If I were able to reproduce this, I would go down the backtrace in a debugger and try to verify that all variables involved in loading the file have the right values.
If that's not something you can do, perhaps it would be possible to set up a Docker image mimicking your environment, and see if you can reproduce it there? If that's the case, you can pass me the image and I can take a look.