Elimination of redundant memory access and calculations in cell capture
Renamed variables and added comments to make the cell capture subroutine more descriptive, and removed or optimized the calculation of certain intermediate quantities. This gives up to a 5% performance improvement by reducing the amount of memory access (particularly for diagonal_length and cell_type).
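For illustration only, here is a minimal host-side sketch of the kind of change described above: reading diagonal_length and cell_type once per cell instead of once per neighbor. It is not the actual cell capture code. The function names, the `CellType` enum, the `neighbor_distance` array, and the capture condition are all hypothetical and stand in for the real logic.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical cell states; the real code may use different names and values.
enum CellType : int { Liquid = 0, Active = 1 };

// Baseline: cell_type[i] and diagonal_length[i] are re-read for every neighbor.
int count_captures_baseline(const std::vector<int>& cell_type,
                            const std::vector<float>& diagonal_length,
                            const std::vector<float>& neighbor_distance,
                            std::size_t neighbors_per_cell) {
    int captures = 0;
    for (std::size_t i = 0; i < cell_type.size(); ++i) {
        for (std::size_t n = 0; n < neighbors_per_cell; ++n) {
            if (cell_type[i] == Active &&
                diagonal_length[i] >= neighbor_distance[i * neighbors_per_cell + n]) {
                ++captures;
            }
        }
    }
    return captures;
}

// Reduced memory access: each per-cell value is loaded once into a local,
// and non-active cells are skipped before any neighbor data is touched.
int count_captures_cached(const std::vector<int>& cell_type,
                          const std::vector<float>& diagonal_length,
                          const std::vector<float>& neighbor_distance,
                          std::size_t neighbors_per_cell) {
    int captures = 0;
    for (std::size_t i = 0; i < cell_type.size(); ++i) {
        if (cell_type[i] != Active) continue;    // single read of cell_type
        const float length = diagonal_length[i]; // single read of diagonal_length
        const float* distances = neighbor_distance.data() + i * neighbors_per_cell;
        for (std::size_t n = 0; n < neighbors_per_cell; ++n) {
            if (length >= distances[n]) ++captures;
        }
    }
    return captures;
}
```

Both functions return the same count for the same inputs. A compiler may already hoist these loads in a simple host loop like this one, so any gain is more likely to show up in a GPU kernel, where the repeated reads go to global memory.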
Luckily we do have a test for this: CAupdate would have failed for the single grain or the directional solidification problem if we had messed up (which I did at one point). Cell capture seems to be working as intended, based on the number of time steps it takes to complete the small example problems. I'll definitely want to test a couple of large problems on GPU to make sure no weird floating point edge case got introduced, though.