CONQUEST-release
CONQUEST-release copied to clipboard
Refactor duplicated code in m_kern_exx_ subroutines
Based on our meeting with @lionelalexandre on April 12, we need to investigate how much we can reduce code duplication in the subroutines
https://github.com/OrderN/CONQUEST-release/blob/c00a071c18a257d4e708abaede7bb4769b80ccbd/src/exx_kernel_default.f90#L921
https://github.com/OrderN/CONQUEST-release/blob/c00a071c18a257d4e708abaede7bb4769b80ccbd/src/exx_kernel_default.f90#L1268
https://github.com/OrderN/CONQUEST-release/blob/c00a071c18a257d4e708abaede7bb4769b80ccbd/src/exx_kernel_default.f90#L1649
https://github.com/OrderN/CONQUEST-release/blob/c00a071c18a257d4e708abaede7bb4769b80ccbd/src/exx_kernel_default.f90#L2019
and do some refactoring before carrying on with the parallelisation work
After comparing m_kern_exx_cri and m_kern_exx_eri, it does not look simple to reduce their duplication through refactoring. Many similar things are happening in both but the loop structures are different. For example in m_kern_exx_cri there is the loop do l = 1, nbnab(k_in_part) and inside this loop K_val and Phy_k are set before the loop is closed. However, in m_kern_exx_eri this same loop contains the setting of K_val as well as the rest of the kernel.
Further down the loop nest, the loop structure differs again. In m_kern_exx_cri there are two nested loops over kg%nsup and jb%nsup but in m_kern_exx_eri the loop is not nested and is only over jb%nsup.
After comparing m_kern_exx_eri and m_kern_exx_eri_gto, the only differences seem to be that arrays are allocated in m_kern_exx_eri but not m_kern_exx_eri_gto and that m_kern_exx_eri_gto calls compute_eri_hoh at it's deepest point whereas m_kern_exx_eri calls exx_ewald_charge and exx_v_on_grid and performs calculations in a triple nested loop.
Therefore, it should be possible to reduce code duplication between these two kernels by first moving array allocations in m_kern_exx_eri outside of the duplicated region.
yes, this is correct ; I should have done this before...
m_kern_exx_eri and m_kern_exx_dummy are also very similar with the main difference being that arrays are allocated at the beginning of m_kern_exx_dummy. Therefore, this can also likely be refactored to deduplicate code.
m_kern_exx_eriandm_kern_exx_dummyare also very similar with the main difference being that arrays are allocated at the beginning ofm_kern_exx_dummy. Therefore, this can also likely be refactored to deduplicate code.
I am not sure but you will need to had a if statement in the nested loops. Can it be a problem ?
After comparing
m_kern_exx_eriandm_kern_exx_eri_gto, the only differences seem to be that arrays are allocated inm_kern_exx_eribut notm_kern_exx_eri_gtoand thatm_kern_exx_eri_gtocallscompute_eri_hohat it's deepest point whereasm_kern_exx_ericallsexx_ewald_chargeandexx_v_on_gridand performs calculations in a triple nested loop.Therefore, it should be possible to reduce code duplication between these two kernels by first moving array allocations in
m_kern_exx_erioutside of the duplicated region.
yes, this is correct ; I should have done this before...
After comparing
m_kern_exx_criandm_kern_exx_eri, it does not look simple to reduce their duplication through refactoring. Many similar things are happening in both but the loop structures are different. For example inm_kern_exx_crithere is the loopdo l = 1, nbnab(k_in_part)and inside this loopK_valandPhy_kare set before the loop is closed. However, inm_kern_exx_erithis same loop contains the setting ofK_valas well as the rest of the kernel.Further down the loop nest, the loop structure differs again. In
m_kern_exx_crithere are two nested loops overkg%nsupandjb%nsupbut inm_kern_exx_erithe loop is not nested and is only overjb%nsup.
I believe the code inside the do nsf2 = 1, jb%nsup loop is the same for m_kern_exx_cri and m_kern_exx_eri. Therefore, this should be able to be pulled out into a single subroutine.
Closed by #352