Slowness in applying scale_factor/add_offset in ncj5

Open ghansham opened this issue 3 years ago • 3 comments

Dear All

It seems that applying scale_factor and add_offset in the method linked below is very slow. Reading a scaled variable (size: 11k x 11k, datatype: short, with add_offset/scale_factor of float data type) is at least 3.5 times faster in ncj4 than in ncj5.

https://github.com/Unidata/netcdf-java/blob/f702951916ad24ae09d329454c85eba07371246f/cdm/core/src/main/java/ucar/nc2/dataset/EnhanceScaleMissingUnsignedImpl.java#L582

For detailed observations, please refer to the support ticket "[netCDFJava #MXE-391037]: reading performance b/w ncj4 and ncj5", sent to [email protected].

Regards Ghansham

ghansham avatar Jun 09 '22 02:06 ghansham

I don't think this should be labelled as help needed. It is more a matter of scope for improvement.

ghansham avatar Jun 10 '22 02:06 ghansham

Gentle reminder that netCDF-Java is primarily a community-driven project. Unidata does its best to develop and maintain the project, but with limited resources, we depend on contributions from the community as well. Since the Unidata development team will not be able to dedicate time to performance investigations in the immediate future, I have labeled this issue as "help wanted".

haileyajohnson avatar Jun 10 '22 17:06 haileyajohnson

Okay

ghansham avatar Jun 11 '22 01:06 ghansham

Number boxing introduced in c52e88c6028bf4aab267803556d7d052f5f0b245 is likely the cause of this.
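To illustrate why boxing could matter here, the following is a minimal sketch and not the actual code from that commit. It compares a per-element scale/offset conversion that goes through `Number` with one that stays on primitives. The class name, method names, and the `SCALE`/`OFFSET` constants are all hypothetical stand-ins for the variable's `scale_factor` and `add_offset` attributes.

```java
public class ScaleOffsetBoxingSketch {
  // Hypothetical packing parameters (stand-ins for scale_factor / add_offset).
  static final double SCALE = 0.01;
  static final double OFFSET = 273.15;

  // Boxed path: the argument and the result are both wrapper objects.
  static Number applyBoxed(Number packed) {
    return packed.doubleValue() * SCALE + OFFSET;
  }

  // Primitive path: same arithmetic, no wrapper objects involved.
  static double applyPrimitive(short packed) {
    return packed * SCALE + OFFSET;
  }

  static double[] unpackBoxed(short[] packed) {
    double[] out = new double[packed.length];
    for (int i = 0; i < packed.length; i++) {
      // packed[i] is autoboxed to Short, the result is unboxed from Double.
      out[i] = applyBoxed(packed[i]).doubleValue();
    }
    return out;
  }

  static double[] unpackPrimitive(short[] packed) {
    double[] out = new double[packed.length];
    for (int i = 0; i < packed.length; i++) {
      out[i] = applyPrimitive(packed[i]);
    }
    return out;
  }

  public static void main(String[] args) {
    // Arbitrary test size, much smaller than the 11k x 11k variable in the report.
    short[] data = new short[5_000_000];
    for (int i = 0; i < data.length; i++) {
      data[i] = (short) i;
    }
    // Crude timing loop; a real measurement should use a harness such as JMH.
    for (int rep = 0; rep < 5; rep++) {
      long t0 = System.nanoTime();
      unpackBoxed(data);
      long t1 = System.nanoTime();
      unpackPrimitive(data);
      long t2 = System.nanoTime();
      System.out.printf("boxed: %d ms, primitive: %d ms%n",
          (t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000);
    }
  }
}
```

Both paths produce the same values. Whether the boxed path is measurably slower depends on the JVM, since the JIT can sometimes eliminate the allocations, so this only shows the pattern and is not a reproduction of the reported 3.5x difference.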

JMdoubleU avatar Mar 11 '23 04:03 JMdoubleU

That may be the reason. Another basic improvement would be to test the unscaled (raw) value against the missing value before applying the scale factor and offset. Applying scale and offset is very costly in terms of time, so values that turn out to be missing should not go through it at all.
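As a rough sketch of that ordering (not the library's implementation), the loop below compares each raw `short` against the missing value first and only applies scale and offset to valid values. It assumes the missing value is stored in the same packed units as the raw data. `RAW_MISSING`, `SCALE`, `OFFSET`, and the class name are hypothetical.

```java
public class MissingBeforeScaleSketch {
  // Hypothetical values standing in for the variable's attributes.
  static final short RAW_MISSING = -32768; // missing value in packed (raw) units
  static final double SCALE = 0.5;         // stand-in for scale_factor
  static final double OFFSET = 10.0;       // stand-in for add_offset

  // Check the raw value first; only non-missing values are scaled.
  static double[] unpack(short[] packed) {
    double[] out = new double[packed.length];
    for (int i = 0; i < packed.length; i++) {
      short raw = packed[i];
      if (raw == RAW_MISSING) {
        out[i] = Double.NaN; // missing values skip the scale/offset arithmetic
      } else {
        out[i] = raw * SCALE + OFFSET;
      }
    }
    return out;
  }

  public static void main(String[] args) {
    double[] result = unpack(new short[] {RAW_MISSING, 0, 4});
    System.out.println(java.util.Arrays.toString(result));
  }
}
```

The comparison on the raw value is a cheap integer test, so the benefit grows with the fraction of missing values in the array.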

Regards Ghansham

ghansham avatar Mar 11 '23 12:03 ghansham