MCP max_cost and max_cummulative_cost parameters do nothing
Description:
I was hoping to speed up MCP by confining the paths using max_cost and max_cumulative_cost. Neither parameter displays the expected effects (output is not effected by either).
I expect the max_cost to either a) clip the cost to some maximum value or b) do not allow paths through nodes with a cost value above max_cost. The latter ofcourse being the most logical, as I could clip the input myself. I expect the max_cumulative_cost to restrict paths up to some total cost. I can simply apply the threshold afterwards, but I expect the algorithm to converge quicker if this is taken into account.
Way to reproduce:
# original code from Juan: https://stackoverflow.com/questions/62135639/mcp-geometrics-for-calculating-marketsheds
import numpy as np
from skimage import graph
from skimage.graph import _mcp
from scipy import sparse
image = np.array(
[[1, 1, 2, 2],
[2, 1, 1, 3],
[3, 2, 1, 2],
[2, 2, 2, 1]]
)
destinations = [[0, 0], [3, 3]]
mcp = graph.MCP(2*image)
costs, traceback = mcp.find_costs(destinations, max_cumulative_cost=3., find_all_ends=False)
print(costs)
costs, traceback = mcp.find_costs(destinations, max_cost=3., find_all_ends=False)
print(costs)
# both return:
#[[ 2. 4. 8. 10.]
# [ 6. 4. 6. 10.]
# [10. 8. 4. 6.]
# [12. 8. 6. 2.]]
# Finding the corresponding endpoints:
offsets = _mcp.make_offsets(2, True)
offsets.append(np.array([0, 0]))
offsets_arr = np.array(offsets)
indices = np.indices(traceback.shape)
offset_to_neighbor = offsets_arr[traceback]
neighbor_index = indices - offset_to_neighbor.transpose((2, 0, 1))
ids = np.arange(traceback.size).reshape(image.shape)
neighbor_ids = np.ravel_multi_index(
tuple(neighbor_index), traceback.shape
)
g = sparse.coo_matrix((
np.ones(traceback.size),
(ids.flat, neighbor_ids.flat)),
shape=(ids.size, ids.size),
).tocsr()
n, components = sparse.csgraph.connected_components(g)
basins = components.reshape(image.shape)
print(basins)
# both (or without max_cost/max_cumulative_cost) return
#[[0 0 0 0]
# [0 0 0 1]
# [0 0 1 1]
# [1 1 1 1]]
### Version information:
```Shell
3.10.12 (main, Sep 11 2024, 15:47:36) [GCC 11.4.0]
Linux-6.1.85+-x86_64-with-glibc2.35
scikit-image version: 0.24.0
numpy version: 1.26.4
Looking at the code, it seems to me that the parameters max_cumulative_cost and max_cost of find_costs are not used at all. Furthermore, the parameter max_coverage is missing documentation.
https://github.com/scikit-image/scikit-image/blob/707d937c43041737b562c3fb8d1fa88a5d7e13af/skimage/graph/_mcp.pyx#L395L396
The parameters have been introduced with the pull request https://github.com/scikit-image/scikit-image/pull/854. The pull request was quite big so it might just have been forgotten.
@stefanv Do you still remember anything from this pull request and whether there was any intention behind it? I know it was already 10 years ago 😲
Hrm, wow, no but maybe @almarklein does 😅
Looking at the history of _mcp.pyx, it looks like I added the args to find_cost() here, but never implemented them.
From what I remember, I had an implementation of the MCP algorithm that I had used locally in my research. I refactored it a bit as I added the code to skimage. I think that these args represent a feature that I had in my own version, but these were not really necessary because the new class had hooks to implement things like that.
As to their intended meaning, I think max_cost would be a cost above which the front would refuse to move to that voxel. It allows you to basically create walls. The max_cumulative_cost was likely a way to stop the front from growing when it failed to find an end-point. This can be achieved by overloading goal_reached().
Funny how this was found only now. I'm sorry if I confused users by having introduced arguments that do nothing.
@helvince everyone else, thanks for catching and diagnosing this. It sounds unlikely that anyone is planning to fill the unused parameters with life anytime soon. So I suggest to just deprecate them.
[...] because the new class had hooks to implement things like that.
@almarklein, that sounds intriguing. In case anyone is interested in things like that, how would they go about using these class hooks? A few pointers could be helpful in documenting this or even implementing this if someone wants to add that feature. :)
There's goal_reached, examine_neighbor, update_node, travel_cost, create_connection. They are in principal documented: https://scikit-image.org/docs/stable/api/skimage.graph.html, just search for "overload". But perhaps this can be made more explicit, e.g. the class docstring can list the methods that can be overloaded.
As to their intended meaning, I think
max_costwould be a cost above which the front would refuse to move to that voxel. It allows you to basically create walls.
It seems to me that this condition can be easily checked right here to skip that voxel:
https://github.com/scikit-image/scikit-image/blob/3a82599073a9272cab0c370eccf678210a59f709/skimage/graph/_mcp.pyx#L590
@almarklein Do you agree?
I can open a merge request to implement max_cost and to deprecate max_cumulative_cost because it is not needed.
Yeah, probably just replace new_cost == inf with new_cost > max_cost.
I opened a merge request for the discussed changes https://github.com/scikit-image/scikit-image/blob/a854ec56aeea68f0b45c6b1ff835fdd950bcb458/skimage/graph/_mcp.pyx#L612
What still irritates me is that find_costs returns the internal state of cumulative_costs and traceback and not a copy. I think we should change np.asarray to np.array in:
https://github.com/scikit-image/scikit-image/blob/a854ec56aeea68f0b45c6b1ff835fdd950bcb458/skimage/graph/_mcp.pyx#L612
@almarklein Is there a special reasoning behind not copying the cumulative_costs and traceback? I would assume to always get a new object.
What still irritates me is that find_costs returns the internal state of cumulative_costs and traceback and not a copy.
That does seem weird. It makes sense that the method modifies the attribute in place but might be a different story for arrays returned by the method. Doesn't seem like a case that should be especially memory sensitive.
@almarklein Is there a special reasoning behind not copying the cumulative_costs and traceback?
I don't remember, maybe performance. If returning a copy (or a readonly view?) is more skimage-like, go ahead! 👍
@almarklein Thanks for the info.
Another question, the parameter max_coverage is missing documentation
https://github.com/scikit-image/scikit-image/blob/aa7042ccd227e7f49b73ac466542088ec5a9ad63/skimage/graph/_mcp.pyx#L396
I wanted to add some but it seems like it was only useful while developing:
https://github.com/scikit-image/scikit-image/blob/aa7042ccd227e7f49b73ac466542088ec5a9ad63/skimage/graph/_mcp.pyx#L492
@almarklein Do you agree that we can deprecate that one, too? Or might it be useful for some users?
I guess it could still be useful, e.g. if you know that the explored area should never make up more than 20% of the total image, it allows you to stop failure-cases 5x faster. Definitely a bit of a niche, but you never know if someone is using it, and the cost for maintaining is is near-zero 🤷
near-zero
Well, except for writing a docstring 😉
Hello scikit-image core devs! There hasn't been any activity on this issue for more than 180 days. I have marked it as "dormant" to make it easy to find.
To our contributors, thank you for your contribution and apologies if this issue fell through the cracks! Hopefully this ping will help bring some fresh attention to the issue. If you need help, you can always reach out on our forum
If you think that this issue is no longer relevant, you may close it, or we may do it at some point (either way, it will be done manually).