skan icon indicating copy to clipboard operation
skan copied to clipboard

Potential problem in branch-type classification of cycle

Open kushaangupta opened this issue 3 years ago • 6 comments

The following minimal skeleton is not being classified as cycle in summary. I was wondering if it's a problem or a compromise of the current algorithm for branch-type classification.

import numpy as np
import matplotlib.pyplot as plt
from skan import csr, draw
skel = csr.Skeleton(np.array([
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 1]
]))
fig, ax = plt.subplots(figsize=(4,4))
draw.overlay_skeleton_networkx(skel.graph, skel.coordinates, axis=ax)
csr.summarize(skel)

90d86b01-25f0-47e9-993f-3e662324e249

skeleton-id node-id-src node-id-dst branch-distance branch-type mean-pixel-value stdev-pixel-value image-coord-src-0 image-coord-src-1 image-coord-dst-0 image-coord-dst-1 coord-src-0 coord-src-1 coord-dst-0 coord-dst-1 euclidean-distance
0 2 3 4.242641 2 1.0 0.0 1 2 2 1 1 2 2 1 1.414214
0 2 3 1.414214 2 1.0 0.0 1 2 2 1 1 2 2 1 1.414214
0 2 3 2.000000 2 1.0 0.0 1 2 2 1 1 2 2 1 1.414214

At present only paths with src == dst are classified as cycles, but what about the case where two paths have same src, dst nodes?

kushaangupta avatar Mar 22 '22 09:03 kushaangupta

@kushaangupta can you please include complete imports, screenshots, and outputs when making issues?

At any rate, wow that is not what I expected to see:

   skeleton-id  node-id-src  node-id-dst  branch-distance  ...  coord-src-1  coord-dst-0  coord-dst-1  euclidean-distance
0            0            2            3         4.242641  ...            2            2            1            1.414214
1            0            2            3         1.414214  ...            2            2            1            1.414214
2            0            2            3         2.000000  ...            2            2            1            1.414214

What I expected to happen is that the "junction" at the bottom right would be cleared by the MST method, and then there would be a single loop, going from 4 to 4. So this indeed appears to be a major bug. I don't think it's at the classification step, but at the graph simplification and traversal steps.

jni avatar Mar 22 '22 12:03 jni

can you please include complete imports, screenshots, and outputs when making issues?

Done

I don't think it's at the classification step, but at the graph simplification and traversal steps.

I understand. So, is the problem in scipy.sparse.csgraph.minimum_spanning_tree?

kushaangupta avatar Mar 22 '22 13:03 kushaangupta

So, is the problem in scipy.sparse.csgraph.minimum_spanning_tree

More likely, in how we use it, here:

https://github.com/jni/skan/blob/f606888282bc255ee3d816b197927721def8732f/src/skan/csr.py#L774-L814

jni avatar Mar 23 '22 01:03 jni

This function performs the operation in place.

:point_up: that's super-suspicious :joy: I wonder if that factor, combined with the new traversal algorithm (#152) have created an error. If I remember correctly, the algorithm sets the entries to 0. But it might actually need to delete the entries (relatively expensive, as it requires copying the whole csr graph), or we need to change the algorithm to ignore 0 entries.

jni avatar Mar 23 '22 01:03 jni

Actually, I think copying the graph is better than having an additional if expression for every edge.

jni avatar Mar 23 '22 02:03 jni

@kushaangupta and I just had a meeting where we tried to hash this out, and we realised that this is not really a bug at all, it's only a bug in our expectations of what should happen in such a graph. In fact, the node 4 is not a junction node, so skan's current interpretation is correct: there are two junction nodes, 2 and 3, and there are three ways to get from one to the other: via the direct edge, via 0 and 1, and via 4.

One implication of this is that the junction graph will need to support being a multi-graph. This is difficult/impossible with a csr graph. 😞

jni avatar Mar 29 '22 08:03 jni