Potential problem in branch-type classification of cycle
The following minimal skeleton is not being classified as cycle in summary. I was wondering if it's a problem or a compromise of the current algorithm for branch-type classification.
import numpy as np
import matplotlib.pyplot as plt
from skan import csr, draw
skel = csr.Skeleton(np.array([
[0, 1, 0],
[1, 0, 1],
[0, 1, 1]
]))
fig, ax = plt.subplots(figsize=(4,4))
draw.overlay_skeleton_networkx(skel.graph, skel.coordinates, axis=ax)
csr.summarize(skel)

| skeleton-id | node-id-src | node-id-dst | branch-distance | branch-type | mean-pixel-value | stdev-pixel-value | image-coord-src-0 | image-coord-src-1 | image-coord-dst-0 | image-coord-dst-1 | coord-src-0 | coord-src-1 | coord-dst-0 | coord-dst-1 | euclidean-distance |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2 | 3 | 4.242641 | 2 | 1.0 | 0.0 | 1 | 2 | 2 | 1 | 1 | 2 | 2 | 1 | 1.414214 |
| 0 | 2 | 3 | 1.414214 | 2 | 1.0 | 0.0 | 1 | 2 | 2 | 1 | 1 | 2 | 2 | 1 | 1.414214 |
| 0 | 2 | 3 | 2.000000 | 2 | 1.0 | 0.0 | 1 | 2 | 2 | 1 | 1 | 2 | 2 | 1 | 1.414214 |
At present only paths with src == dst are classified as cycles, but what about the case where two paths have same src, dst nodes?
@kushaangupta can you please include complete imports, screenshots, and outputs when making issues?
At any rate, wow that is not what I expected to see:
skeleton-id node-id-src node-id-dst branch-distance ... coord-src-1 coord-dst-0 coord-dst-1 euclidean-distance
0 0 2 3 4.242641 ... 2 2 1 1.414214
1 0 2 3 1.414214 ... 2 2 1 1.414214
2 0 2 3 2.000000 ... 2 2 1 1.414214
What I expected to happen is that the "junction" at the bottom right would be cleared by the MST method, and then there would be a single loop, going from 4 to 4. So this indeed appears to be a major bug. I don't think it's at the classification step, but at the graph simplification and traversal steps.
can you please include complete imports, screenshots, and outputs when making issues?
Done
I don't think it's at the classification step, but at the graph simplification and traversal steps.
I understand. So, is the problem in scipy.sparse.csgraph.minimum_spanning_tree?
So, is the problem in scipy.sparse.csgraph.minimum_spanning_tree
More likely, in how we use it, here:
https://github.com/jni/skan/blob/f606888282bc255ee3d816b197927721def8732f/src/skan/csr.py#L774-L814
This function performs the operation in place.
:point_up: that's super-suspicious :joy: I wonder if that factor, combined with the new traversal algorithm (#152) have created an error. If I remember correctly, the algorithm sets the entries to 0. But it might actually need to delete the entries (relatively expensive, as it requires copying the whole csr graph), or we need to change the algorithm to ignore 0 entries.
Actually, I think copying the graph is better than having an additional if expression for every edge.
@kushaangupta and I just had a meeting where we tried to hash this out, and we realised that this is not really a bug at all, it's only a bug in our expectations of what should happen in such a graph. In fact, the node 4 is not a junction node, so skan's current interpretation is correct: there are two junction nodes, 2 and 3, and there are three ways to get from one to the other: via the direct edge, via 0 and 1, and via 4.
One implication of this is that the junction graph will need to support being a multi-graph. This is difficult/impossible with a csr graph. 😞