Sriharsha Kandala
> @sriharshakandala I left a few additional comments on the other (merged) PR. Could you incorporate them here and try out how it affects results?

Sure!
Replaced by https://github.com/CliMA/RRTMGP.jl/pull/421
Does this improve performance?
Current `main`:
```
julia --project=gpuenv test/all_sky_tuning.jl
device = ClimaComms.CUDADevice(); FT = Float64, ncols = 131658; size per field = 0.04119899868965149 GB
"timing longwave solver" = "timing longwave solver"
  1.159210 seconds...
```
Targets for all-sky problem:
- RRTMGP shortwave solve time on (1) V100 for 131,658 cols at 0.98 sec
- RRTMGP longwave solve time on (1) V100 for 131,658 cols at 0.85 sec...
Current status: We are currently at 1.05 sec for the shortwave solver with 131,658 cols on an A100 GPU
Update `OneScalar` and `TwoStream` structs https://github.com/CliMA/RRTMGP.jl/pull/434
Restructure `Source` structs https://github.com/CliMA/RRTMGP.jl/pull/436
Simplify type parameters for `AtmosphericState` https://github.com/CliMA/RRTMGP.jl/pull/437
Always inline and remove use of `StepRange` https://github.com/CliMA/RRTMGP.jl/pull/438