
Laplace-Beltrami occasionally failed

Open · jordandekraker opened this issue 5 months ago · 18 comments

One of my 49 subjects failed for one hemisphere, but I can't work out why (yet). The earliest issue seems to be that the Laplace-Beltrami solutions for both AP and PD are completely wrong, even though the boundary conditions look good. I did note that there are some small holes in the surface, but I don't think that would produce this result.

AP coords:

Image

AP boundary conditions:

Image

Note that the PD coords also seem to be similarly stratified along the sagittal direction:

Image

jordandekraker · Jul 04 '25 16:07

I've also seen this now in a handful of cases, e.g. HCP subject 129028, where it happens for the left hipp.

Looking at the Laplace-Beltrami logs, it seems the solver is having a hard time; the rank warning below doesn't come up in the successful cases.

I also couldn't find anything noticeably wrong in the surface mesh or the src/sink inputs, so I might need to dig in and see what the matrix it is trying to solve looks like:

/tmp/hippunfold/.snakemake/scripts/tmpfvpi0v5r.laplace_beltrami.py:102: MatrixRankWarning: Matrix is exactly singular
  solution[free_indices] = sp.linalg.spsolve(free_laplacian, free_b)
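
For reference, here's a minimal, self-contained sketch (not hippunfold code) of scipy's behaviour on a singular system -- spsolve emits MatrixRankWarning and returns NaNs, which is presumably what the script then has to catch:

```python
import warnings
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import spsolve, MatrixRankWarning

# Toy rank-deficient system: row 2 duplicates row 1, so the matrix is exactly singular.
A = sp.csr_matrix([[1.0, 2.0, 0.0],
                   [1.0, 2.0, 0.0],
                   [0.0, 0.0, 3.0]])
b = np.array([1.0, 1.0, 3.0])

with warnings.catch_warnings():
    # escalate the warning so the failure is explicit instead of silent NaNs
    warnings.simplefilter("error", MatrixRankWarning)
    try:
        x = spsolve(A, b)
    except MatrixRankWarning:
        # the failing hemispheres end up here: no unique solution exists
        x = np.zeros_like(b)
```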

akhanf · Jul 18 '25 02:07

Ok, looking at this case:

  1. The matrix being singular is caught as an exception by the script, which sets everything to zeros; otherwise we get all NaNs (neither approach makes the data usable, btw).
  2. The underlying issue in this case (and my bet for all these cases) is holes in the surface, though it's not clear to me why this causes a singularity.

Note: in wb_view you want to turn on backface culling to see these holes.

Image

akhanf · Jul 18 '25 18:07

Yup, I'd noticed these kinds of holes as well. Obviously fixing the source is preferable to modifying the L-B solver, since that will also fix downstream issues with things like thickness calculation.

We could tweak the surface postprocessing (e.g. https://github.com/khanlab/hippunfold/blob/d0dfb6ef73619ae5e34d402d1c0c297d6da872e8/hippunfold/workflow/scripts/gen_isosurface.py#L83).

Or we could reconsider the cleanJD and cleanAK methods described in https://github.com/khanlab/hippunfold/pull/364. I believe cleanJD guarantees there will be no hole even when a src directly contacts a sink (i.e. 0 voxels thick), but masking out the inner/outer from what we intend as the midthickness surface is still inelegant. I'll look into whether it would be a worthwhile fix or not.

jordandekraker · Jul 18 '25 18:07

I'm hoping a mesh-based approach will work, as it would be the least radical change (the other methods you mention were drastically different mesh construction approaches, as I recall). Adjusting the hole-fill radius might be a good starting point, or maybe doing the hole filling after decimation instead.

This approach looks promising too: https://pymeshfix.pyvista.org/examples/repair_planar.html (see related issue: github.com/pyvista/pymeshfix/issues/13)
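
For reference, the basic pymeshfix usage would look roughly like this (a sketch only -- the function name and how the vertex/face arrays are obtained are placeholders, not hippunfold code):

```python
import numpy as np
import pymeshfix

def repair_mesh(vertices: np.ndarray, faces: np.ndarray):
    """Fill holes and clean up a triangle mesh with pymeshfix.

    vertices: (n, 3) float array; faces: (m, 3) int array of triangle indices,
    e.g. as produced by the isosurface generation step.
    """
    fixer = pymeshfix.MeshFix(vertices, faces)
    fixer.repair(verbose=False)   # fills holes in the surface
    return fixer.v, fixer.f       # repaired vertex and face arrays
```

If it pans out, something like this could slot in alongside (or after) the existing hole filling in gen_isosurface.py.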

akhanf · Jul 18 '25 19:07

@jordandekraker re: your new PR #503, in my test case with holes (HCP subject 129028, left hipp), this wouldn't solve the problem because the holes are already present in the laminar coords:

image

I.e. it is not the dilated NaN mask that is introducing any holes. Also, the NaN mask dilation is only performed where src and sink meet, so holes on the dorsal surface (which is what I've been seeing) aren't affected by that dilated mask.

The problem seems to really stem from a discretization issue, i.e. the 0.3mm hipp resolution just isn't sufficient for some cases (8 out of 200 hippocampi in my HCPUR100 test). There is already a built-in fix (made for the dentate) that we can try: simply use a higher resolution for laynii (and subsequent surface generation) by changing laminar_coords_res in the config file. Dentate is already 0.1x0.1x0.1 -- my sense is that changing hipp from 0.3x0.3x0.3 to 0.2x0.2x0.2 will likely solve the issue. I'll give this a whirl and see how it goes.

akhanf · Jul 19 '25 00:07

Maybe there was a miscommunication in the PR -- that's exactly what it's for (where src-SRLM directly contacts sink-bg).

Either way, it's way easier to slap on this patch: https://github.com/khanlab/hippunfold/pull/504. It may take some tweaking to get the right hole-filling parameters, but I'll give it a go.

jordandekraker · Jul 20 '25 15:07

Looks like these modifications might need more widespread robustness testing first.

When I tried the laminar_coords_res modification above (changing hipp to 0.1x0.1x0.1), it fixes the holes for my test case, but then a slightly larger number of subjects overall end up failing.

The #504 patch also fixes the holes in the hippocampus, but then the dentate surface ends up having the Laplace solving issue for that same subject. Running with mesh fixing enabled only for the hippocampus allowed the subject to run to completion. (Note: I have another patch to fix a separate error that needs to be cherry-picked -- if it's failing in the ANTs unfolded registration, it's because there are extreme values that need to be clipped.)

Anyway, I'll try re-running all 100 with pymeshfix enabled only for hipp, and see how that goes.

akhanf · Jul 21 '25 03:07

Btw, I've just added the mentioned commits (the unrelated cherry-pick from another branch, and the flag to make pymeshfix run only on hipp, not dentate). This still needs testing on a large number of subjects.

akhanf · Jul 21 '25 03:07

Just made some updates, and I prefer https://github.com/khanlab/hippunfold/pull/503 as a solution. Let's see if it works in all test cases though.

And yeah, setting laminar_coords_res won't fix it, since the hole comes from nnunet. The only way shape_inject would solve this is by adding more regularization (increased smoothing). I think #503 is nice because it generally allows for more imperfections in the nnunet result.

For the cherry-picked metric clipping, we can also discuss. In hippunfold_v1 I had at one point run tanh() on just the curvature metric (this is nice since it's easily reversible). If it's also needed, we could do gentler clipping of gyrification and thickness at +/-5 with the following:

gyrification_softclip = np.tanh(gyrification / 5) * 5
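
(Just to spell out the reversibility -- this is a sketch, not the eventual rule: values near zero pass through almost unchanged, extremes saturate toward +/-5, and arctanh recovers the originals exactly.)

```python
import numpy as np

def softclip(x, scale=5.0):
    """Smoothly compress values into (-scale, scale); ~identity near 0."""
    return np.tanh(x / scale) * scale

def softclip_inverse(y, scale=5.0):
    """Exact inverse of softclip, for recovering the unclipped metric."""
    return np.arctanh(y / scale) * scale

gyrification = np.array([0.5, 2.0, 8.0, -12.0])
clipped = softclip(gyrification)        # -> approx [0.50, 1.90, 4.61, -4.92]
recovered = softclip_inverse(clipped)   # -> back to the original values
```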

jordandekraker · Jul 21 '25 16:07

Sounds good -- I agree #503 is a good solution (once we confirm it), as it is the least "destructive".

For posterity: the laminar_coords_res 0.1 approach did work to solve the holes problem in the test case (it uses this upsampled res for the shape injection target, so the template topology shines through and is not discretized out), but it is overkill in my opinion and seems to lead to other issues.

Yes, I'm good with tanh as the clipping approach, and I do prefer a gentler clipping to avoid being too heavy-handed, even for curvature. I'm also thinking the unclipped (pre-tanh) metrics will be useful for QC purposes (e.g. reporting the # of vertices out of range, or a map of them), so I'd like to implement the clipping as an additional rule (e.g. metric_tanh), so we keep access to the unmodified metrics for an eventual QC rule. That should be a separate PR to keep things tidy; I'll see if I can do that before end of day.

akhanf · Jul 21 '25 19:07

Ahh -- looks like this doesn't resolve the holes in the test dataset (HCP sub-129028); one hole still remains. I'm now trying to see whether a combination of upsampled resolution and this patch resolves the issue in this subject, before testing more broadly.

akhanf · Jul 22 '25 15:07

Damn, I'd really hoped this would be a catch-all solution. Is there anything funny about sub-129028 near the hole? E.g. is it near an edge of the surface, or is there a cyst underneath or something? Perhaps slapping #504 on as well would patch it.

jordandekraker · Jul 22 '25 15:07

Having a closer look at it, it seems the holes may actually be a red herring (in this case, and perhaps others too) -- it looks like the underlying problem is in the src/sink boundary vertex identification.

E.g. you can notice a few stray sink vertices in the src:

Image Image

(Note: here I ran with laminar coords at 0.2mm, so it ran to completion, but it still ended up with a few misplaced vertices.)

Can you confirm in your problem cases that it is the src/sink labels too, and not just the holes?

When we were working on the src/sink labelling, there were some additional steps we were thinking of adding to improve robustness (e.g. taking connected components of the boundary labels), or maybe some other kind of regularization; I'm thinking we need to do that now.
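
One form that regularization could take (purely a sketch of the connected-components idea, not necessarily how it would be implemented): keep, for each label, only its largest connected component over the mesh edge graph and drop the stray vertices:

```python
import numpy as np
from scipy.sparse import coo_matrix
from scipy.sparse.csgraph import connected_components

def keep_largest_component(faces, label_mask, n_vertices):
    """Remove everything but the largest connected patch of a vertex label.

    faces: (m, 3) triangle indices; label_mask: boolean array marking e.g.
    the src (or sink) vertices on the surface.
    """
    # all mesh edges, then only those whose endpoints both carry the label
    edges = np.vstack([faces[:, [0, 1]], faces[:, [1, 2]], faces[:, [2, 0]]])
    keep = label_mask[edges[:, 0]] & label_mask[edges[:, 1]]
    edges = edges[keep]
    graph = coo_matrix((np.ones(len(edges)), (edges[:, 0], edges[:, 1])),
                       shape=(n_vertices, n_vertices))
    _, comp = connected_components(graph, directed=False)
    # component id that contains the most labelled vertices
    labelled = np.flatnonzero(label_mask)
    largest = np.bincount(comp[labelled]).argmax()
    return label_mask & (comp == largest)
```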

akhanf · Jul 22 '25 19:07

Correction from my last message -- the holes and the src/sink discontinuities are two separate issues; what was confusing things in my test case is that the changes to fix the former also introduced the latter.

Anyway, the src/sink discontinuities could still generally be an issue, so I've made that fix too in another PR (#507).

So to summarize: combining the new surface generation approach in #503 with upsampled coords (0.2mm seems sufficient; the CLI required a bugfix, merged in #506) and the src/sink contiguity fix (#507) seems to do the trick. There is also the tanh normalization in #505. Now I'm going to run that on all 100 before we merge these all in.

akhanf · Jul 23 '25 03:07

Ok, some updates so far:

I've run all 100 hcpUR100 subjects with the aforementioned PRs, but with the original 0.3mm resampling. Out of these, only 2 structures failed: sub-129028_hemi-L_label-hipp (the one I've been testing with) and sub-190031_hemi-L_label-dentate.

The run with 0.2mm resampling is still going (I started it later; running the above was actually a mistake -- I forgot to include the CLI arg -- but I let it run since it makes a good comparison). So far zero errors, but it's only 55% done.

akhanf · Jul 23 '25 20:07

Ok, the run with 0.2mm resampling ended up with just the single error (sub-190031_hemi-L_label-dentate), as hoped (dentate was unchanged, so I didn't expect that one to be fixed).

I think we're safe to merge the PRs (all except pymeshfix, which I haven't been using here). I won't change the default laminar coords res for now, since increasing the resolution has the side effect of producing many more native vertices. Perhaps we can revisit once we do some robustness testing on older adults or patient cases with smaller hippocampi.

akhanf · Jul 24 '25 12:07

Awesome! I'm glad we've understood the issue in depth. I just had a look at #507 and it makes good sense to me.

Thinking about this some more, I'm glad that an error is thrown when the Laplace-Beltrami coordinates are catastrophically wrong (as long as it remains a small proportion of cases that fail). This might help prevent confusion like we had early on in Nima's project, where the outputs were bad but it wasn't obvious to beginners.

Hopefully in the future we'll see fewer of these failures due to holes once new segmentation NNs are implemented.

jordandekraker · Jul 24 '25 12:07

Yes, new NNs should hopefully help.

Another addition I was about to make is to check for holes as a QC output -- in the get_boundary_vertices.py script we just about get there already: we obtain the largest connected component of boundary vertices as the main boundary. If we simply report the number of connected components (minus 1), we get the number of holes.
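
As a sketch of what that QC check could look like (illustrative only, and independent of how get_boundary_vertices.py actually finds the boundary):

```python
import numpy as np
from scipy.sparse import coo_matrix
from scipy.sparse.csgraph import connected_components

def count_holes(faces, n_vertices):
    """Count boundary loops of a triangle mesh; anything beyond the single
    'main' boundary is reported as a hole."""
    # boundary edges are edges that belong to exactly one triangle
    edges = np.sort(np.vstack([faces[:, [0, 1]],
                               faces[:, [1, 2]],
                               faces[:, [2, 0]]]), axis=1)
    uniq, counts = np.unique(edges, axis=0, return_counts=True)
    boundary_edges = uniq[counts == 1]
    if len(boundary_edges) == 0:
        return 0  # closed surface: no boundary, no holes by this definition
    graph = coo_matrix((np.ones(len(boundary_edges)),
                        (boundary_edges[:, 0], boundary_edges[:, 1])),
                       shape=(n_vertices, n_vertices))
    _, comp = connected_components(graph, directed=False)
    n_loops = len(np.unique(comp[np.unique(boundary_edges)]))
    return n_loops - 1  # one loop is the main boundary; the rest are holes
```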

Re: src/sink labels, we can also calculate some quality metrics based on the final number of connected components, as well as the underlying src/sink sdt values.

Hoping those two will catch most of the errors that pop up.

Finally, at some point I'd like to see whether we can really crank up the laminar coords res (e.g. 50 micron) while adjusting the isosurface generation to keep the number of native vertices in check (so it doesn't blow up). That upsampled resolution really only comes into play for laynii and isosurface generation (i.e. injected registration is still performed at the usual resolution), so there isn't much downside to heavy upsampling, since laynii seems to be quite computationally efficient even with large images.

akhanf · Jul 24 '25 12:07