HIP execution fails on large data

Open lindstro opened this issue 4 years ago • 1 comments

@lindstro Thanks for the response.
I suspect that for the float data type the function should be "frexpf(float, int*)" and for double it should be "frexp(double, int*)".
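To illustrate the distinction between the two standard C library calls, here is a minimal sketch. It is not taken from zfp; the helper names `exponent_f` and `exponent_d` are made up for this example, and both assume a finite input.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical helper: binary exponent of a finite float via frexpf. */
static int exponent_f(float x)
{
  int e;
  float m = frexpf(x, &e);    /* x == m * 2^e, with |m| in [0.5, 1) or m == 0 */
  assert(ldexpf(m, e) == x);  /* the decomposition is exact for finite x */
  return e;
}

/* Hypothetical helper: binary exponent of a finite double via frexp. */
static int exponent_d(double x)
{
  int e;
  double m = frexp(x, &e);    /* same contract, in double precision */
  assert(ldexp(m, e) == x);
  return e;
}
```

For example, `exponent_f(8.0f)` yields 4 because 8 = 0.5 * 2^4, and `exponent_d(0.75)` yields 0.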

  1. feature/hip-support branch test case: decompressor fails.

     Compressor command:
       ./zfp -d -1 630000000 -r 4 -x hip -i /Data/SDRBENCH-NWChem-dataset/acd-tst.bin.d64 -z test.zfp

     Compressor output:
       type=double nx=630000000 ny=1 nz=1 nw=1 raw=5040000000 zfp=315000000 ratio=16 rate=4

     Decompressor command (run from ./ZFP/build/bin):
       ./zfp -d -1 630000000 -r 4 -x hip -z test.zfp -o test.d64

     The decompressor fails with a segmentation fault or runtime error. The suspected root cause is in the function void decode_ints() (zfp/src/hip_zfp/decode.cuh).

     File info:

       Array dimension: 1D
       Number of elements:
         name      size
         acd-tst   801098891
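The sizes reported by the compressor can be checked with a few lines of 64-bit arithmetic. The sketch below is only an illustration, not zfp code: the function names are hypothetical, and the compressed size is taken to be simply the number of values times the rate, ignoring any header or padding. Whether any 32-bit limit played a role in the failure is not established in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: raw size in bytes of n values of value_bytes each. */
static uint64_t raw_bytes(uint64_t n, uint64_t value_bytes)
{
  /* guard against 64-bit overflow before multiplying */
  assert(value_bytes == 0 || n <= UINT64_MAX / value_bytes);
  return n * value_bytes;
}

/* Hypothetical helper: fixed-rate compressed size in bits (rate bits per value). */
static uint64_t compressed_bits(uint64_t n, uint64_t rate)
{
  assert(rate == 0 || n <= UINT64_MAX / rate);
  return n * rate;
}

/* Nonzero if v can be stored in an unsigned 32-bit integer. */
static int fits_in_uint32(uint64_t v) { return v <= UINT32_MAX; }

/* Nonzero if v can be stored in a signed 32-bit integer. */
static int fits_in_int32(uint64_t v) { return v <= (uint64_t)INT32_MAX; }
```

With n = 630000000 doubles at rate 4, `raw_bytes(n, 8)` gives 5040000000 and `compressed_bits(n, 4) / 8` gives 315000000, matching the compressor output. The raw byte count does not fit in an unsigned 32-bit integer, and the compressed bit count (2520000000) does not fit in a signed one.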

If the parameters given to the compressor/decompressor commands are wrong, the expected behavior is an error response; here, however, compression succeeds and decompression fails.
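As a rough sketch of the kind of up-front check being asked for, a tool could compare the number of values available in the input with the number declared on the command line. This is an assumption about what such a check might look like, not a description of what zfp does; `input_check` and `check_declared_size` are hypothetical names.

```c
#include <stdint.h>

/* Hypothetical result of comparing available input against declared dimensions. */
typedef enum {
  INPUT_TOO_SHORT = -1, /* fewer values available than declared */
  INPUT_EXACT     = 0,  /* counts agree */
  INPUT_HAS_EXTRA = 1   /* more values available than declared */
} input_check;

/* Compare the number of values available with the number declared. */
static input_check check_declared_size(uint64_t available, uint64_t declared)
{
  if (available < declared)
    return INPUT_TOO_SHORT;
  if (available > declared)
    return INPUT_HAS_EXTRA;
  return INPUT_EXACT;
}
```

With the numbers quoted above, `check_declared_size(801098891, 630000000)` reports `INPUT_HAS_EXTRA`: the file info lists more elements than the commands declare, which a stricter tool could surface as a warning or an error.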

Originally posted by @anilbommareddy in https://github.com/LLNL/zfp/issues/85#issuecomment-769292954

lindstro, Feb 03 '21 17:02

I wonder if this issue is related to CUDA bug #121, which was recently fixed, although I would not expect that to result in a segmentation fault. I first need to download the data and see if we can reproduce the issue on our end.

lindstro, Feb 03 '21 17:02

This appears to be fixed in the staging branch.

GarrettDMorrison, Feb 09 '23 19:02

I confirm that the above commands work (on staging) with the same SDRBench data. Moreover, the serial and HIP backends produce the exact same output, both for compression and decompression.

I'm closing this issue. Please re-open if you're still experiencing problems.

lindstro, Jun 14 '23 23:06