
Problem compressing a "64-bit offset" nc file

Open shuyingchen1010 opened this issue 3 years ago • 6 comments

Hi all,

I am not sure whether this is the right place to ask about "nccopy". I am running a climate model with large output data; the model's default output format is "64-bit offset" netCDF. The output is far too large, so I want to compress it with "nccopy", but I ran into a problem when compressing a model output file. Here is part of the header, from "ncdump -h model_output.nc":

    dimensions:
        ncells = 649392 ;
        vertices = 3 ;
        height = 26 ;
        bnds = 2 ;
        time = UNLIMITED ; // (96 currently)
    variables:
        (different variables here)

I attempted the compression in two steps:

    nccopy -7 model_output.nc tmp.nc
    nccopy -d1 -s tmp.nc compress.nc

The first command already fails with: NetCDF: Bad chunk sizes. Location: file ; line 1027

The input is the original model output (64-bit offset), which has never been compressed before. I don't know where this "Bad chunk sizes" error comes from. model_output.nc is about 20 GB, so it is hard to upload anywhere. I am using NCO/4.9.5 and netCDF/4.7.4 on a supercomputer running Linux: VERSION="8.5 (Green Obsidian)" VERSION_ID="8.5"

Any idea? If further information is needed please let me know.

Best regards, Shuying

shuyingchen1010 avatar Jun 08 '22 09:06 shuyingchen1010

64-bit offset format files cannot use compression; only netCDF-4/HDF5 files can. So first convert the file to netCDF-4, and then you will be able to compress it.
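For example (a sketch, assuming the `nccopy` utility from the netCDF distribution and the filenames used in the question; note that requesting deflation also switches the output to netCDF-4, so the conversion and compression can be done in one pass):

```shell
# Two-step: convert 64-bit offset to netCDF-4 classic model, then compress
nccopy -7 model_output.nc tmp.nc
nccopy -d1 -s tmp.nc compress.nc

# One-step alternative: deflate level 1 (-d1) with shuffle (-s)
# implies a netCDF-4 output format
nccopy -d1 -s model_output.nc compress.nc
```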

edwardhartnett avatar Jun 08 '22 09:06 edwardhartnett

Hello edwardhartnett, thanks for your reply. The problem actually occurs at the conversion step, "nccopy -7 model_output.nc tmp.nc", where I was trying to convert "64-bit offset" to "netCDF-4 classic model". I have also tried "nccopy -4 model_output.nc tmp.nc" to convert to "netCDF-4", but that fails with the same error: NetCDF: Bad chunk sizes. Location: file ; line 1027. What do you think?

shuyingchen1010 avatar Jun 08 '22 11:06 shuyingchen1010

I believe nccopy allows you to specify chunk sizes from the command line. Have you tried that?

Select chunk sizes the same as dimension sizes, as a first attempt.

edwardhartnett avatar Jun 08 '22 13:06 edwardhartnett

No. Even when I specify explicit chunk sizes, the conversion fails with the same error, whether converting to "netCDF-4" or "netCDF-4 classic model":

    [c9@login06 real_tmp]$ nccopy -4 -c ncells/649392,vertices/3,height/26,bnds/2,time/96 model_output.nc test.nc
    NetCDF: Bad chunk sizes. Location: file ; line 1027

    [c9@login06 real_tmp]$ nccopy -7 -c ncells/649392,vertices/3,height/26,bnds/2,time/96 model_output.nc test.nc
    NetCDF: Bad chunk sizes. Location: file ; line 1027

    [c9@login06 real_tmp]$ ncdump -h model_output.nc
    netcdf model_output {
    dimensions:
        ncells = 649392 ;
        vertices = 3 ;
        height = 26 ;
        bnds = 2 ;
        time = UNLIMITED ; // (96 currently)
    variables:
        float clon(ncells) ...

shuyingchen1010 avatar Jun 08 '22 14:06 shuyingchen1010

OK, try an ncells chunk size of 1024.
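Concretely (an assumed command line based on the chunk spec already tried above, with only the ncells chunk length reduced to 1024):

```shell
nccopy -4 -c ncells/1024,vertices/3,height/26,bnds/2,time/96 \
    model_output.nc test.nc
```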

edwardhartnett avatar Jun 08 '22 14:06 edwardhartnett

This idea works in my case, thank you! But the conversion takes almost 10 minutes. I guess that if I increase the chunk size, the time cost will go down, right? Are there any rules for choosing chunk sizes?
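One rule of thumb: a chunk's byte size is the product of its per-dimension chunk lengths times the element size, and HDF5 rejects chunks of 4 GiB or more, which is likely why the full-dimension chunks above failed. A back-of-the-envelope check (a sketch; it assumes a 4-byte float variable shaped (time, height, ncells), as suggested by the ncdump output in this thread):

```python
# Sketch of chunk-size arithmetic; variable shape and 4-byte float
# element size are assumptions taken from the ncdump output above.
ELEM_BYTES = 4                   # sizeof(float)
HDF5_CHUNK_LIMIT = 4 * 1024**3   # HDF5 rejects chunks of 4 GiB or more

def chunk_bytes(*chunk_lens, elem_bytes=ELEM_BYTES):
    """Bytes occupied by one chunk with the given per-dimension lengths."""
    size = elem_bytes
    for n in chunk_lens:
        size *= n
    return size

# Full-dimension chunks for a float variable (time, height, ncells):
full = chunk_bytes(96, 26, 649392)   # ~6.0 GiB -> over the 4 GiB limit
# Same variable with ncells chunked at 1024:
small = chunk_bytes(96, 26, 1024)    # ~9.8 MiB -> acceptable

print(f"full-dimension chunk: {full / 1024**3:.1f} GiB")
print(f"ncells/1024 chunk:    {small / 1024**2:.1f} MiB")
```

In practice, chunks of a few hundred KiB to a few MiB are a common target; matching the chunk layout to how the data will later be read usually matters more for speed than the raw chunk size.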

shuyingchen1010 avatar Jun 08 '22 15:06 shuyingchen1010