Peter Ukkonen
Not at all, it was quite hidden in the pull request topics. Would be great if more people could test the fix! It sounds like Robert will put it in...
I think it's a good idea, at least as a temporary solution, so people can run the code in single precision. I also tested the fix with ecRAD and it...
Robin Hogan and I spent some time looking into this issue of inaccuracies in shortwave reflectance-transmittance computations in single precision. I don't remember all the ins and outs of it...
Hello! I would like to use the mixed-precision procedure `bli_gemm` in my Fortran code. Is this not possible yet?
`gfortran -ffree-line-length-none -std=f2008 -pg -march=native -O3 -ffast-math`
Elapsed time on output: 0.939000010
Elapsed time on output_opt_sig: 0.680999994
Elapsed time on output_flatmodel_opt_sig: 0.669000030
`gfortran -ffree-line-length-none -m64 -std=f2008 -march=native -O3 -ffast-math`...
On ifort the differences are much bigger:
`ifort -O3 -mkl`
Elapsed time on output: 0.8096000
Elapsed time on output_opt_sig: 0.1968000
Elapsed time on output_flatmodel_opt_sig: 0.1787000
Elapsed time on output_sgemm_sig:...
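For anyone reproducing these numbers: the benchmark harness isn't shown here, but a minimal wall-clock timing sketch looks something like the following, assuming `system_clock` is used for timing (the commented-out `output` call is a placeholder for whichever inference routine is being measured):

```fortran
program bench
  implicit none
  integer :: t_start, t_end, count_rate
  real    :: elapsed
  ! ... set up the network weights and the input batch here ...
  call system_clock(count_rate=count_rate)
  call system_clock(t_start)
  ! call output(...)   ! placeholder: the inference routine being timed
  call system_clock(t_end)
  elapsed = real(t_end - t_start) / real(count_rate)
  print *, 'Elapsed time on output:', elapsed
end program bench
```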
`output_opt_sig` and `output_sgemm_sig` do not assume equal-sized hidden layers (I tested that they work). As you can see, the former is almost as fast as the flatmodel one. Processing inputs...
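As a rough sketch of the sgemm-based pattern (this is not the actual `output_sgemm_sig` code; the names and shapes below are illustrative, and it links against any BLAS, e.g. MKL via `-mkl`):

```fortran
! One dense layer of a batched forward pass using single-precision BLAS.
! w(nneur, nin): weights, b(nneur): biases,
! x(nin, nsamp): input batch, y(nneur, nsamp): layer output.
subroutine dense_sgemm(w, b, x, y, nin, nneur, nsamp)
  implicit none
  integer, intent(in) :: nin, nneur, nsamp
  real, intent(in)    :: w(nneur, nin), b(nneur), x(nin, nsamp)
  real, intent(out)   :: y(nneur, nsamp)
  integer :: j
  ! y = w * x for the whole batch in one BLAS call
  call sgemm('N', 'N', nneur, nsamp, nin, 1.0, w, nneur, x, nin, 0.0, y, nneur)
  ! add bias and apply sigmoid, column by column
  do j = 1, nsamp
    y(:, j) = 1.0 / (1.0 + exp(-(y(:, j) + b)))
  end do
end subroutine dense_sgemm
```

The point of this pattern is that processing the whole batch with one `sgemm` call replaces many small `matmul` calls, which is where the ifort+MKL speedup above comes from.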
> The only opportunity I see is to allocate the activation arrays once rather than re-allocate on assignment in every iteration, in which case the subroutine to calculate activations in-place...
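A minimal sketch of that idea, with hypothetical names (`sigmoid_inplace`, `w1`, `b1` are mine, not from the actual code): the activation array is allocated once, outside the loop, and the activation is computed in place rather than through a function result that must be copied on assignment:

```fortran
module nn_act
  implicit none
contains
  ! Sigmoid applied in place: no function-result temporary,
  ! no reallocation on assignment.
  subroutine sigmoid_inplace(a)
    real, intent(inout) :: a(:)
    a = 1.0 / (1.0 + exp(-a))
  end subroutine sigmoid_inplace
end module nn_act

program demo
  use nn_act
  implicit none
  real, allocatable :: a1(:)
  real    :: w1(4,3), b1(4), x(3)
  integer :: iter
  w1 = 0.1; b1 = 0.0; x = 1.0
  allocate(a1(4))            ! allocated once, outside the loop
  do iter = 1, 1000
    a1 = matmul(w1, x) + b1  ! same shape every time, so no reallocation
    call sigmoid_inplace(a1)
  end do
  print *, a1
end program demo
```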
You're welcome! I have now implemented these changes in a more general manner here: https://github.com/peterukk/rte-rrtmgp/blob/master/neural/
Most notably, activation functions have been replaced with subroutines, and these procedures have a 2D...
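Roughly, the subroutine-based design looks like this (a sketch with names of my own choosing, not the actual code in the linked repo): each activation is a subroutine operating on a 2D array in place, and a procedure pointer with an abstract interface selects which one to apply at run time:

```fortran
module activations
  implicit none
  abstract interface
    subroutine activation_2d(a)
      real, intent(inout) :: a(:,:)
    end subroutine activation_2d
  end interface
contains
  subroutine sigmoid2d(a)
    real, intent(inout) :: a(:,:)
    a = 1.0 / (1.0 + exp(-a))
  end subroutine sigmoid2d

  subroutine relu2d(a)
    real, intent(inout) :: a(:,:)
    a = max(0.0, a)
  end subroutine relu2d
end module activations

program demo
  use activations
  implicit none
  procedure(activation_2d), pointer :: activate => null()
  real :: a(2,2)
  a = reshape([-1.0, 0.5, 2.0, -0.3], [2,2])
  activate => sigmoid2d   ! chosen at run time, e.g. from a model config
  call activate(a)
  print *, a
end program demo
```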
FYI, I have now tested using elemental subroutines as activation functions. This is considerably faster (and works with both 1D and 2D arrays), but unfortunately, pointers do not work with elemental...
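For illustration, an elemental activation subroutine might look like this (a minimal sketch; the limitation, as I understand it, is that the Fortran standard does not allow a nonintrinsic elemental procedure as the target of a procedure pointer, so the run-time selection shown above no longer works):

```fortran
module act_elem
  implicit none
contains
  ! Elemental: applies element-wise to scalars and to 1D, 2D, ... arrays alike.
  elemental subroutine sigmoid(a)
    real, intent(inout) :: a
    a = 1.0 / (1.0 + exp(-a))
  end subroutine sigmoid
end module act_elem

program demo
  use act_elem
  implicit none
  real :: v(3), m(2,2)
  v = [-1.0, 0.0, 1.0]
  m = 0.5
  call sigmoid(v)   ! works on a 1D array
  call sigmoid(m)   ! and on a 2D array, with the same code
  print *, v
  print *, m
  ! Note: "procedure(...), pointer :: p => sigmoid" would be rejected here,
  ! since an elemental procedure cannot be a procedure-pointer target.
end program demo
```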