running error
[ERR] MemoryManager<Serial>::Malloc1D error.
[mars01:187617:0:187617] Caught signal 11 (Segmentation fault: address not mapped to object at address 0x37)
BFD: Dwarf Error: found dwarf version '5', this reader only handles version 2, 3 and 4 information.
@JieyangChen7 This looks like an MGARD-X error, I think?
@wangwu1991 Thanks very much for the bug report. Can you tell us how you compiled MGARD and what command you were running when you got this segfault?
The segfault appeared when running './mgard-x'. MGARD was compiled on an HPC system running CentOS 7, with gcc/11.2.0 and cmake/3.23.1; zstd and protoc were installed using MGARD/build_scripts/build_mgard_serial.sh. (BTW, I am a normal user without sudo, and MGARD fails in the same way when built against my personally installed zstd and protoc.)
@wangwu1991 Thank you for creating this issue. Would you be able to provide us with the input data and parameters you used for compression?
The input data is "testfloat_8_8_128.dat" in SZ2. The full command I used is "./mgard-x -z -i ./testfloat_8_8_128.dat -c testfloat_8_8_128.dat_z -r 1 -t s -n 3 8 8 128 -m rel -e 1e-3 -s 0 -l 3 -d serial -v". Thanks.
@wangwu1991 Sorry about the late reply. We have fixed the problem in #203 and here is an example output of compressing "testfloat_8_8_128.dat". Also, please note that the dimensions should be represented in slowest-to-fastest order.
$mgard-x -z -i testfloat_8_8_128.dat -c testfloat_8_8_128.dat.mgard -r 1 -t s -n 3 128 8 8 -m rel -e 1e-3 -s 0 -l 0 -d serial -v
[info] mode: compression
[info] original data: /home/jieyang/dev/data/testfloat_8_8_128.dat
[info] compressed data: /home/jieyang/dev/data/testfloat_8_8_128.dat.mgard
[info] data type: Single precision
[info] error bound mode: Relative
[info] error bound: 1.000000e-03
[info] s: 0
[info] lossless: Huffman
[info] device type: SERIAL
[info] Verbose: enabled
[info] Loading file: /home/jieyang/dev/data/testfloat_8_8_128.dat
[info] Select device: CPU
[time] Calculating norm time: 1.0503e-05s
[info] L_2 norm: 1.58708
[time] Decomposition time: 0.00586623s
[time] Quantization time: 0.000754992s
[info] Outlier ratio: 39/8192 (0.476074%)
[time] Level Linearizer type: 1 time: 0.000391853s
[time] Huffman Compress time: 4.13656s
[info] Huffman block size: 20480
[info] Huffman dictionary size: 8192
[info] Huffman compress ratio: 32768/40356 (0.811973)
[time] Overall Compress time: 4.14367s
[time] Compression Throughput: 7.90797e-06 GB/s
[info] Compression ratio: 0.810046
[info] Select device: CPU
[time] Level Linearizer type: 1 time: 0.000374763s
[time] Huffman Decompress time: 0.000380089s
[time] Dequantization time: 0.000674831s
[time] Recomposition time: 0.00604179s
[time] Overall Decompression time: 0.00719353s
[time] Decompression Throughput: 0.00455521 GB/s
[info] Relative L_2 error: 5.346481e-04 (Satisified)
[info] MSE: 7.200034e-07
[info] PSNR: 75.1234
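To illustrate the slowest-to-fastest note, here is a minimal Python sketch of how the dimension list for -n might be derived. It assumes that the file name testfloat_8_8_128.dat lists the fastest-varying dimension first, which is my reading of why "8 8 128" became "128 8 8" in the command above. The helper names to_slowest_first and flat_index are made up for this example and are not part of MGARD.

```python
from math import prod


def to_slowest_first(fastest_first_dims):
    """Reverse a fastest-to-slowest dimension list into slowest-to-fastest order."""
    return list(reversed(fastest_first_dims))


def flat_index(index, dims):
    """Offset of a multi-index in a flat array whose dims are listed slowest-to-fastest."""
    offset = 0
    for i, n in zip(index, dims):
        offset = offset * n + i
    return offset


# Assumption: the file name lists the fastest-varying dimension first.
name_dims = [8, 8, 128]
mgard_dims = to_slowest_first(name_dims)  # order to pass after "-n 3"

# Sanity check against the log: 8192 values of 4 bytes each (single precision).
n_values = prod(mgard_dims)
n_bytes = n_values * 4

print("dims for -n:", *mgard_dims)
print("values:", n_values, "bytes:", n_bytes)
# A step along the slowest dimension skips 8 * 8 = 64 values;
# a step along the fastest dimension skips 1.
print(flat_index((1, 0, 0), mgard_dims), flat_index((0, 0, 1), mgard_dims))
```

The element count and byte size computed this way match the 8192 and 32768 that appear in the "Outlier ratio" and "Huffman compress ratio" lines of the log, which is a quick check that the dimensions describe the whole file.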
After compiling the updated code and running "mgard-x -z -i testfloat_8_8_128.dat -c testfloat_8_8_128.dat.mgard -r 1 -t s -n 3 128 8 8 -m rel -e 1e-3 -s 0 -l 0 -d serial -v", the output is:
./mgard-x -z -i testfloat_8_8_128.dat -c testfloat_8_8_128.dat.mgard -r 1 -t s -n 3 128 8 8 -m rel -e 1e-3 -s 0 -l 0 -d serial -v
[info] mode: compression
[info] original data: testfloat_8_8_128.dat
[info] compressed data: testfloat_8_8_128.dat.mgard
[info] data type: Single precision
[info] error bound mode: Relative
[info] error bound: 1.000000e-03
[info] s: 0
[info] lossless: Huffman
[info] device type: Serial
[info] Verbose: enabled
[info] Loading file: testfloat_8_8_128.dat
[time] Calculating norm time: 4.661e-05s
[info] L_2 norm: 1.58708
[time] Decomposition time: 0.00731331s
[time] Quantization time: 0.00113645s
[info] Outlier ratio: 39/8192 (0.476074%)
[ERR] MemoryManager<Serial>::Malloc1D error.
Segmentation fault
When I compiled the same code on another computer running Ubuntu 18.04 with gcc 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04), the program ran without problems and the results were correct (exactly the same as yours). Apart from the operating system and the compiler, everything else is basically the same. I'm very eager to know why.
Since I would like to incorporate this algorithm into my own programs, it would be helpful if a simplified version (for example, one without the protobuf dependency) could be provided. That would make the algorithm easier to adopt.