pyxem Add center-of-mass algorithm to get_direct_beam

name: Center-of-mass in get_direct_beam_position about: Adds center-of-mass algorithm to get_direct_beam_position for Diffraction2D

Checklist

[x] Updated CHANGELOG.md
[x] (if finished) Requested a review (from pc494 if you are unsure who is most suitable)

What does this PR do? Please describe and/or link to an open issue. This adds the center-of-mass algorithm to get_direct_beam_position in Diffraction2D. It reuses the center_of_mass method in Diffraction2D, then adjusts the result to comply with the existing get_direct_beam_position return values. Also adds a test for the functionality.

Describe alternatives you've considered To reimplement center-of-mass or use a method outside the class. However, it is more efficient to reuse existing functionality.

Are there any known issues? Do you need help?

Work in progress? If you know there is more to do, make a checklist here:

May 26 '22 16:05 sivborg

For reviewers of this, the failing tests are unrelated to this PR.

May 30 '22 20:05 hakonanes

@sivborg thanks for the addition. I've looked through the code and it seems fairly self explanatory. I might take another pass later tonight.

Using the COM to center the direct beam is possible but was curious as to the situation where this might be better than the cases already provided? Might be something that we could document a little bit better. In general the doc strings are pretty sparse for the get_direct_beam_position function so it might be good to add something like this:

method : str,
            Must be one of "cross_correlate", "blur", "interpolate" or "center_of_mass".

           "cross_correlate": Center finding using cross-correlation of circles of `radius_start` to `radius_finish`
           "blur": Center finding by blurring each frame with a Gaussian kernel with standard deviation `sigma` and finding the maximum
           "interpolate":  Finding the center by summing along X/Y and finding the peak for each axis independently. Data is blurred first using a Gaussian kernel with standard deviation "sigma"
           "center_of_mass": The center is found using a calculation of the center of mass.  Optionally a `mask` can be applied to focus on just the center of some dataset.

It might be also good to include something about speed of each of them. I image each of these is pretty fast but I haven't done a great comparison ever.

@magnunor I would imagine that changing the COM to a map_iterate like function would slow things down pretty sufficiently? That being said it might be good to write it in the scope of the map function. I think the function could easily take an axes parameter and then use themap function but then call the _map_all function in hyperspy.

https://github.com/hyperspy/hyperspy/blob/5f15a5497d27718adda17a7e89b79fff88207e69/hyperspy/signal.py#L4919-L4936

The underlying function wouldn't change any but then the function could be called using the map function.

It might just clean up some of the existing code making things a little more consistent across the board. That being said, it might be more trouble than it is worth at the moment.

May 31 '22 18:05 CSSFrancis

Using the COM to center the direct beam is possible but was curious as to the situation where this might be better than the cases already provided? Might be something that we could document a little bit better. In general the doc strings are pretty sparse for the get_direct_beam_position function so it might be good to add something like this:

Yeah, I agree. As you say, currently (based on the docstring) it is a bit difficult to figure out which one to use. I use CoM for pretty much all my "beam centering" needs, but in the current version it is a bit of a hassle.

@magnunor I would imagine that changing the COM to a map_iterate like function would slow things down pretty sufficiently? That being said it might be good to write it in the scope of the map function. I think the function could easily take an axes parameter and then use themap function but then call the _map_all function in hyperspy.

This specific function is implemented using dask functions directly (https://github.com/pyxem/pyxem/blob/master/pyxem/utils/dask_tools.py#L912), which I think would be superior to using map in most situations. However, I haven't really tested this.

Jun 01 '22 07:06 magnunor

This specific function is implemented using dask functions directly (https://github.com/pyxem/pyxem/blob/master/pyxem/utils/dask_tools.py#L912), which I think would be superior to using map in most situations. However, I haven't really tested this.

Yes I would agree that is the case. Hyperspy does allow these kind of direct dask functions called from the map function. If you pass an axes parameter it will call map_all rather than map_iterate. Map_all basically does what you are doing here which calls a function on the whole dataset and let's dask/numpy handle everything.

Jun 01 '22 11:06 CSSFrancis

I just tested switching things over to map and it seems like there are a couple of bugs I might try to sort out in hyperspy before trying to make the switch to using the map function. See https://github.com/hyperspy/hyperspy/issues/2951 if you are interested.

I think that this PR is good to go if some additional information is added to the doc strings and then we can merge things.

I'll work on the upstream code when I get the chance and hopefully we can eventually clean things up downstream.

Jun 01 '22 19:06 CSSFrancis

@CSSFrancis thank you for the thorough feedback!

Just to get an idea of the different choices, I did some basic profiling on an Ubuntu machine by running each algorithm 20 times (including a copy) with a navigation space of 11x11 and a varying signal space length and width: download In these tests, center-of-mass in a quite efficient algorithm, but the outlier here is definitely cross_correlation. I am not sure whether this is worth mentioning in the doc string.

I have pushed the more descriptive doc string, and I also removed the progressbar to make it consistent with the other algorithms. But now you cannot use keyword arguments to make it reappear, if the user wished to.

Jun 02 '22 09:06 sivborg

@magnunor I have corrected the center_of_mass mask functionality when using half_square_width, although perhaps using both at the same time would be somewhat redundant. This was done by directly manipulating the kwargs dictionary as it gave me the least trouble, but let me know if that's not the preferred way. I also reformatted the docstring.

Jul 08 '22 08:07 sivborg

I'm going to merge this for now! I've continued some changes that I wanted to add to the center_direct_beam function in #886.

Jan 09 '23 17:01 CSSFrancis

Coverage: 96.014% (-0.02%) from 96.03% when pulling 452f636377097cd1a6d69c204c872e5bb4902825 on sivborg:decorator into b55e54b491cba86a22622a4b770a25125a4c05e6 on pyxem:master.

Jan 09 '23 18:01 coveralls

pyxem
pyxem copied to clipboard

Add center-of-mass algorithm to get_direct_beam_position

pyxem pyxem copied to clipboard

Add center-of-mass algorithm to get_direct_beam_position

pyxem
pyxem copied to clipboard