Many array temporaries created

Open thbtppl opened this issue 5 years ago • 4 comments

In bspline_sub_module.f90, many array temporaries are created. When compiling with GCC 7.3.0 and the warning flag -Warray-temporaries, most of them occur in the calls to dintrv, dbvalu, dbknot and dbtpcf. With runtime checks only (-fcheck=all), they are reported for the "fcn" argument of dbtpcf.

I have no idea how harmless those runtime array temporaries are, but since I am incorporating bspline-fortran into a large 3D code, I would like to avoid memory fragmentation as much as possible and not exhaust my stack.

I managed to eliminate most of the array temporaries by changing explicit-shape dummy arguments to assumed-shape dummy arguments in most of the routines and getting rid of some assumed-size dummy arrays. But the problem remains for "fcn" and "bcoef" in dbtpcf.
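For illustration only (this is not the actual patch), here is a minimal sketch of the kind of change described above. The routines `sum_explicit` and `sum_assumed` are hypothetical and not part of bspline-fortran. Passing a strided array section to an explicit-shape dummy is the sort of call that -Warray-temporaries and -fcheck=all tend to flag, whereas an assumed-shape dummy receives the stride through the array descriptor.

```fortran
! Illustrative sketch only: sum_explicit and sum_assumed are hypothetical
! routines, not part of bspline-fortran.
module temp_demo
    use iso_fortran_env, only: wp => real64
    implicit none
    private
    public :: wp, sum_explicit, sum_assumed
contains

    ! Explicit-shape dummy: the callee expects n contiguous elements, so a
    ! strided actual argument may have to be copied into a temporary first.
    pure function sum_explicit(n, x) result(s)
        integer, intent(in)  :: n
        real(wp), intent(in) :: x(n)
        real(wp) :: s
        s = sum(x)
    end function sum_explicit

    ! Assumed-shape dummy: the array descriptor carries the stride, so the
    ! section can be passed without a copy.
    pure function sum_assumed(x) result(s)
        real(wp), intent(in) :: x(:)
        real(wp) :: s
        s = sum(x)
    end function sum_assumed

end module temp_demo

program demo
    use temp_demo
    implicit none
    real(wp) :: a(3, 4)
    real(wp) :: s1, s2
    integer :: i, j

    do j = 1, 4
        do i = 1, 3
            a(i, j) = real(10*i + j, wp)
        end do
    end do

    ! a(2, :) is a strided (non-contiguous) row of a column-major array.
    s1 = sum_explicit(4, a(2, :))   ! candidate for an array temporary
    s2 = sum_assumed(a(2, :))       ! no copy required

    if (abs(s1 - s2) > 0.0_wp) error stop 'results differ'
    print '(a, f6.1)', 'row sum = ', s1
end program demo
```

One trade-off worth keeping in mind: with an assumed-shape dummy the callee can no longer assume unit stride unless the dummy is also declared `contiguous`, so the change is not guaranteed to be faster.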

I assume it would be tricky to change everything, and it would require some copying to adapt the routines for 1D and 3D, as the library was originally written in 2D. Still, I thought assumed-size arrays should not be used in modern Fortran.

thbtppl, Jul 31 '18 10:07

Are you willing to share the mods you already made? Was there any indication that they improved performance? I could merge them into the main branch if so. I can also take a look at dbtpcf. You're probably right that it might require different versions of this (and maybe related routines) for the different dimensions.

jacobwilliams, Aug 01 '18 13:08

Hi @jacobwilliams, apologies, I just realised I never actually answered your comments! When I made these initial modifications, I observed a slight degradation in performance. Let me compare my modifications with your new release and I'll get back to you.

thbtppl, Nov 19 '19 11:11

Note: a quick test on my laptop shows a speedup over the previous version in all but the 2D case, likely due to the other changes I made:

speed_test_oo (Cases/sec)

    v5.4.2     v6.0.0
 1D 15889442   16373765
 2D 3785746    3772725
 3D 906639     996938
 4D 246817     250571
 5D 62335      64395
 6D 14861      16157

jacobwilliams, Nov 21 '19 02:11

Very good to know! The performance degradation was most likely due to me allocating/deallocating work arrays in db*val each time it's called.
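To make the point concrete, here is a minimal sketch contrasting per-call allocation with a caller-owned workspace. The routines `eval_alloc`, `eval_prealloc` and `kernel` are hypothetical stand-ins, not the db*val routines themselves, and the computation is a placeholder.

```fortran
! Illustrative sketch only: these routines are hypothetical stand-ins,
! not the db*val routines from bspline-fortran.
module work_demo
    use iso_fortran_env, only: wp => real64
    implicit none
    private
    public :: wp, eval_alloc, eval_prealloc
contains

    ! Allocates and frees its scratch space on every call.
    function eval_alloc(k, x) result(f)
        integer, intent(in)  :: k
        real(wp), intent(in) :: x
        real(wp) :: f
        real(wp), allocatable :: work(:)
        allocate(work(k))
        f = kernel(x, work)
        deallocate(work)
    end function eval_alloc

    ! Caller owns the scratch space and reuses it across calls.
    function eval_prealloc(x, work) result(f)
        real(wp), intent(in)    :: x
        real(wp), intent(inout) :: work(:)
        real(wp) :: f
        f = kernel(x, work)
    end function eval_prealloc

    ! Placeholder computation that needs size(work) scratch elements.
    function kernel(x, work) result(f)
        real(wp), intent(in)    :: x
        real(wp), intent(inout) :: work(:)
        real(wp) :: f
        integer :: i
        do i = 1, size(work)
            work(i) = x**(i - 1)
        end do
        f = sum(work)
    end function kernel

end module work_demo

program demo
    use work_demo
    implicit none
    integer, parameter :: k = 4, ncalls = 1000
    real(wp) :: work(k), x, f1, f2
    integer :: i

    f1 = 0.0_wp
    f2 = 0.0_wp
    do i = 1, ncalls
        x = real(i, wp) / real(ncalls, wp)
        f1 = f1 + eval_alloc(k, x)        ! one allocate/deallocate per call
        f2 = f2 + eval_prealloc(x, work)  ! no allocation inside the loop
    end do

    if (abs(f1 - f2) > 0.0_wp) error stop 'results differ'
    print *, 'sum of evaluations =', f1
end program demo
```

Both variants compute the same result. The second simply moves the allocation out of the hot loop, which matters most when the routine is called millions of times per second, as in the speed test above.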

thbtppl, Nov 21 '19 09:11