meshmode
[Direct Connection] Group Contributions (probably) should not be summed
I was looking at the code generated for the direct connection, and it is of the form:
_pt_temp_1[idof_ensm0 + 4 * iel_ensm0 + 1024 * iface_ensm0] =
(from_el_present[iel_ensm0 + 256 * iface_ensm0] ?
normal_1_b_all[from_el_indices[iel_ensm0 + 256 * iface_ensm0]] * 0.5 * (_pt_part_ph_id_0[4 * from_el_indices[iel_ensm0 + 256 * iface_ensm0] + _pt_data_3[idof_ensm0 + 4 * iel_ensm0 + 1024 * iface_ensm0]] + -1.0 * _pt_part_ph_id_0[4 * from_el_indices[iel_ensm0 + 256 * iface_ensm0] + _pt_data_3[idof_ensm0 + 4 * iel_ensm0 + 1024 * iface_ensm0]])
: 0.0)
+ (from_el_present_0[iel_ensm0 + 256 * iface_ensm0] ?
_pt_part_ph_id_1[4 * from_el_indices_0[iel_ensm0 + 256 * iface_ensm0] + _pt_data_4[idof_ensm0 + 4 * iel_ensm0 + 1024 * iface_ensm0]] * normal_1_b_face_restr_interior[from_el_indices_0[iel_ensm0 + 256 * iface_ensm0]]
: 0.0)
+ (from_el_present_1[iel_ensm0 + 256 * iface_ensm0] ?
cse[4 * from_el_indices_2[iel_ensm0 + 256 * iface_ensm0] + _pt_data_6[idof_ensm0 + 4 * iel_ensm0 + 1024 * iface_ensm0]] * normal_1_b_BTAG_PARTITION[from_el_indices_2[iel_ensm0 + 256 * iface_ensm0]]
: 0.0);
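That is, it is of the form (A if B else 0) + (C if D else 0) + (E if F else 0), but I think the optimized way of writing this would be A if B else (C if D else (E if F else 0)). Provided at most one of B, D, F holds for a given entry, the two forms agree, and the nested one skips the remaining conditions and terms once a match is found, which could save us some conditional computation, i.e. global memory reads.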
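To make the comparison of the two forms concrete, here is a minimal pure-Python sketch. It is not meshmode or loopy code: the function names and the flat per-entry lists are hypothetical stand-ins for the from_el_present flags and the per-group terms. It only illustrates that the summed and nested forms agree as long as at most one group is present for each entry.

```python
def summed(present, values, i):
    """(A if B else 0) + (C if D else 0) + ...: every group's flag is inspected."""
    total = 0.0
    for group_present, group_values in zip(present, values):
        total += group_values[i] if group_present[i] else 0.0
    return total


def nested(present, values, i):
    """A if B else (C if D else ...): stops at the first group that is present."""
    for group_present, group_values in zip(present, values):
        if group_present[i]:
            return group_values[i]
    return 0.0


# Hypothetical data: three groups, four entries, at most one group present
# per entry (entry 3 is not covered by any group).
present = [
    [True, False, False, False],
    [False, True, False, False],
    [False, False, True, False],
]
values = [
    [1.0, 9.0, 9.0, 9.0],
    [9.0, 2.0, 9.0, 9.0],
    [9.0, 9.0, 3.0, 9.0],
]

for i in range(4):
    assert summed(present, values, i) == nested(present, values, i)
```

If two groups could be present for the same entry, the two functions would return different results, so the rewrite relies on the groups being disjoint.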
On further thought, I think the current way of summing the contributions is too heavy on global memory. Storing the mapping in a single array should be more efficient:
A[.., ...] if which_term[iel,idof]==0 else (B[..., ...] if which_term[iel,idof]==1 else 0)
This should significantly decrease the global memory footprint of the expression (I think).
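A rough sketch of the single-array idea, again in plain Python rather than actual meshmode code. The names merge_groups, evaluate, from_index and the NO_TERM sentinel are all made up for illustration, and the entries are flattened to one index instead of [iel, idof]. The per-group presence flags and index arrays are folded into one which_term array and one index array, assuming the groups do not overlap.

```python
NO_TERM = -1  # hypothetical sentinel: entry is not covered by any group


def merge_groups(present_per_group, indices_per_group):
    """Fold per-group flags and indices into one which_term and one index array."""
    n = len(present_per_group[0])
    which_term = [NO_TERM] * n
    from_index = [0] * n
    groups = zip(present_per_group, indices_per_group)
    for term, (present, indices) in enumerate(groups):
        for i in range(n):
            if present[i]:
                if which_term[i] != NO_TERM:
                    # The merged layout only works if groups are disjoint.
                    raise ValueError(f"entry {i} is covered by more than one group")
                which_term[i] = term
                from_index[i] = indices[i]
    return which_term, from_index


def evaluate(which_term, from_index, sources, i):
    """One flag read and one index read per entry, then one source read."""
    term = which_term[i]
    if term == NO_TERM:
        return 0.0
    return sources[term][from_index[i]]


# Hypothetical data: two groups, three entries.
present_per_group = [[True, False, False], [False, True, False]]
indices_per_group = [[2, 0, 0], [0, 1, 0]]
sources = [[10.0, 11.0, 12.0], [20.0, 21.0, 22.0]]

which_term, from_index = merge_groups(present_per_group, indices_per_group)
result = [evaluate(which_term, from_index, sources, i) for i in range(3)]
```

With this layout each entry reads one which_term value and one index, instead of one flag and one index per group.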
I agree the sum is not lovely.
As long as none of the intermediates are materialized, the two things I can see wrong with it are
- The from_el_present are likely avoidable
- The from_el_indices_2 are bigger than they need to be
Is that your sense as well?