pyxem
pyxem copied to clipboard
pyxem-1.0.0
I was discussing with @pc494 that the overall architecture of pyxem has now settled down quite a lot and this may mean we are now getting towards the stage where releasing pyxem-1.0.0
should be on the roadmap.
There will be some more minor releases before then, probably at least v0.13.0
and v0.14.0
, and that these would deal with currently open issues/PRs including adding XRD support and tidying up quite many things. There is not very much that is qualitatively new open right now though so the idea would be to consolidate all the existing workflows and refine things ready for v1.0.0
.
I would expect that this therefore would happen at some point in 2021.
Does anyone have any comments or objections to working towards this?
Sounds good!
I'll have quite a bit more time to contribute when the teaching settles down here, especially to the tidy up part :)
I think that is a good idea to work towards. I should have some more time coming up soon as well. It might be worth thinking about what we need to do before that release. I'd say that we need to work on some documentation and maybe publish a paper alongside the release if that is still the goal?
I think that 1.0.0
would be accompanied by a paper, yes. Having an agreed documentation strategy would be good too, I'll raise a discussion issue.
One of the goals of Semantic Versioning, is that the API should not change within a major release. So within 1.x.x, the public "front end" functions which the users are expected to interact with should ideally not change. Getting to that stage with pyxem
would be really nice, and I do agree that having a stable, non-breaking, API is very good!
However, I think we're still a bit away from that, since currently I don't feel the different functionalities have a coherent "user experience". I think this stems partly from different design preferences between pyxem
and pixstem
.
I don't think this is something which is very difficult to sort out, but will require a little bit of thinking on how to best design a coherent "user experience". I think it should be fairly close to how HyperSpy
does things, so that users can fairly seamlessly move between the different libraries.
I'm thinking about doing a pretty large clean-up of (especially) the functionality from pixstem
, as soon as the map
functionality has been improved in HyperSpy (https://github.com/hyperspy/hyperspy/pull/2703). Essentially making most of the functions rely on s.map
.
It is very tempting to do the coding related to the aforementioned "user experience" changes simultaneously or after the changes connected to s.map
. However, the "design principles" can be agreed on before the actual coding part.
So there are several things which should be done:
- Figure out which function "user designs" are the best. Should
inplace
be default? How shouldlazy_results
be handled, what should the default be? ...? - Figure out which functions should be part of the "public" API. For example, a number of functions (especially in
radial.py
) could probably be private. - Sort the functions into a coherent structures. Should users have to import the commonly used functions, or should they be available directly via
import pyxem as pxm
? - Maybe the best way of doing this, is finding common data processing "pipelines", and trying to make these work fairly smoothly?
- Prune duplicate functions. First step: add
DeprecationWarning
. - One specific functionality to sort out is the peak finding, which currently have two different implementations. Including a reworking of
DiffractionVectors
?
I think all of this can be done incrementally, inching towards the 1.0.0 version number.
Also, a bigger question is the HyperSpy 2.0.0 release. I'm not sure when that is likely to be released. In theory it might be nice for that to be released first, but due to the uncertain timeline, I'm not sure if it is a good idea to wait.
Just to mention a couple of other things that should be consistent:
- Consistent Masking (are 0's or 1's ignored)
- Elliptical distortion as affine transformations rather than major,minor, rotation
My feeling is that 1.0.0
is a huge undertaking, and we would be best to slowly improve the code base, all the while being more careful with our design decisions. Hopefully those actions over time will reduce the total workload. In the short term the new map
implementation is going to dramatically reduce the volume of code we need to keep track of and that's likely to be our next step.
Planning for a Release 1.0 I feel like much of the API has been stabilizing within the last year or two to the point where a solid 1.0 release should be possible and worth the effort. I have some time coming up in the next month or two and hope to clean up a lot of my code and contribute what I can to move this along.
It should also reduce the maintenance that the package requires going forward.
Describe the solution you'd like
I think that there are a few things that I would like to clean up before the release, namely:
- [ ] Improving the Calibration options #830
- [ ] Improve diffraction vectors #https://github.com/hyperspy/hyperspy/pull/2876 (and a refactoring of the diffraction vectors class in pyxem)
- [ ] Removing some generators. #776
- [ ] Remove old map and plotting code #759 #721
Describe alternatives you've considered In addition it would be nice to publish something along side the 1.0 release. There are a lot of people who don't spend quite as much time online who I think miss many of the good features of pyxem/hyperspy. For example, the map function is often not fully appreciated and it would be good to describe what sets us apart from other packages out there.
We could work towards a 0.15 release with a 1.0 release after that as another option, but I would rather focus on getting to a 1.0 release by the end of the year if possible rather than trying to do this incrementally and not fixing some of the bigger issues in the package currently.
Additional context I'd love to hear any input that people might have. I realize this might be a fairly large undertaking but I would like to commit to at least trying to finish this before the end of this year.
I can contribute improvements to the current documentation, as it is lacking. The changes I'd like to make I've already done for kikuchipy (link to the docs):
- Automatic generation of API reference (not listing things manually as is done now) following numpydoc, with one page per module (overview), class (overview), function/method, and attribute. Easier to read, maintain, and extend.
- Separate the current user guide into Tutorials built from Jupyter notebooks using nbsphinx (same as before) and Examples built as a gallery from scripts. Tutorials are in-depth guides (existing user guides?) while examples are short recipies to common tasks (like scikit-image's gallery or Matplotlib's examples). This structure is an attempt to follow the Diátaxis framework for documentation. I believe this structure will make it easier to extend the documentation, typically with more examples.
I think a complete API reference is required for a 1.0 as it will be easier to identify when to raise deprecation warnings, which should be used more in pyxem.
@hakonanes This sounds like an excellent idea. I'm already quite jealous of kikuchipy's documentation and was definitely something I forgot to include before the 1.0 release. The Diátaxis framework seems like a good framework to work around.
I would also like to clean up the notebooks and add in some extra workflows for amorphous materials as well as add some information about how to run on a cluster/ utilize a different dask scheduler.
I am happy to contribute, will get #865 done as a first step. My only broad brush input would be that we should cut stale code fairly aggressively.
I am happy to contribute, will get #865 done as a first step. My only broad brush input would be that we should cut stale code fairly aggressively.
I would agree with that. Do we want to start with releasing #865 and then we can think about creating a 1.0 release branch?
We should probably have a 0.15 release to deprecate some of the code that is being removed but that should be fairly easy to remove completely when it comes to it. For now we might consider creating issues for which code should be removed or refactored.
Yup, I'm in favour of a 0.15 and 1.0.0 at as close as feasibly possible to one another.
Just another point that I would like to remove the io-utils code in favor of moving all of the loading/saving to Hyperspy/rosettasciio. The benefit is that that package can have much more consistent releases than us and should have faster implementations.
Just another point that I would like to remove the io-utils code in favor of moving all of the loading/saving to Hyperspy/rosettasciio. The benefit is that that package can have much more consistent releases than us and should have faster implementations.
Sounds good. Would the plan be to deprecate it all in 0.15.0
and have it gone by 1.0.0
?
Sounds good. Would the plan be to deprecate it all in
0.15.0
and have it gone by1.0.0
?
I think that depends on the time frame for release. I think if hyperspy
moves their loading to rosettasciio
before then I would argue for that. Otherwise we can make a small release after
I suggest we now close this issue and create a small number of separate, 'getting it done' issues, eg:
- Issues that need to be fixed by 1.0
- Code that needs to be removed by 1.0
- Documentation that needs to come in for 1.0
- Enhancements that are needed by 1.0
In a hope of making keeping track a little easier.
I have opened a new branch for a 1.00.0 RELEASE_next_major
. My hope is to release at the same time that hyperspy releases a 2.00.0 release as many of the improvements are dependent on that release.
Great! I think that's a clean way of taking the leap to a major version.
Thanks for making a push to streamline parts of pyxem, @CSSFrancis!
My hope is to release at the same time that hyperspy releases a 2.00.0
Do you or anyone else have an idea of when a 2.0 will be released? I need to update parts of some of the readers in kikuchipy for this...
Just scoping a couple of things for a 1.0.0 release now that hyperspy 2.0.0 has been released:
- [ ] Clean up Documentation:
- [x] #957 & #956
- [x] Add Additional Examples
- [ ] Add Real Strain Mapping Dataset
- [ ] Clean up Orientation Mapping (Correlation Based)
- [x] Use PolarSignal rather than doing unwrapping in function
- [x] Add unwrap polar function which uses CuPy in addition to the PyFAI implementation. #989
- [ ] Clean up simulation generation
- [ ] https://github.com/pyxem/diffsims/pull/201
- [x] https://github.com/pyxem/orix/pull/460
- [ ] Make it so the orientation mapping can use
dask-distributed
- [x] Add azimuthal Integral Method which can use cupy.
- [x] Use PolarSignal rather than doing unwrapping in function
- [ ] Improve Vector functionality
- [x] #969
- [x] #972
- [ ] #980
- [x] #981
- [ ] #899
- [ ] Fix up Radial/ ellipse tools
- [x] #969
- [ ] Remove excess code and simplify methods
- [ ] Other things worth doing:
- [x] Speed up Creation of VDF images
- [ ] Clean up CalibrationGenerator
- [ ] Removals
- [ ] Subpixel Generator --> Function in DiffractionVectors
- [ ] Variance Generator --> Function in Diffraction2D
- [ ] Virtual Image Generator --> Function in Diffraction2D
- [ ] DisplacementGradientTensor Generator --> Move to DiffractionVectors
- [ ] OrientationMapping --> Move to PolarDiffraction2D
Some of these will be easy additions with 0.17.0 but some will need to be done later with the jump to 1.0.0
I'd really like to do some improvement of the DPC parts for 1.0.0, specifically to streamline the processing and structure a bit.
This would require a bit of thinking what an optimal class structure would be.
I'd really like to do some improvement of the DPC parts for 1.0.0, specifically to streamline the processing and structure a bit.
This would require a bit of thinking what an optimal class structure would be.
@magnunor When do you think you might have time for this? It would be nice to get a 1.0.0 release sooner rather than later as we have been held up on this for quite some time. I can help somewhat if you have an idea of how you want things to look. We can make the BeamShift
class a subclass of DiffractionVectors1D
which might work out nicely. It would be good to have some of the methods like converting to markers etc. for the beam shift class.
Then maybe the workflow should look like:
beam_center = dp.get_direct_beam_position() # returns BeamShift
dpc_signal = beam_center.T #Gives Differential Phase contrast
dpc_signal.T # Gives beam shift
@CSSFrancis, my time is unfortunately quite limited, but I'll try to squeeze in time for this. As I agree that it would be really nice to both get this implemented for pyxem 1.0.0, and to have pyxem 1.0.0 released soonish.
I think the first step would be to figure out what is the best class "structure", class names and processing "pipeline". I'll make a new Issue to discuss this. I don't think that part of it (the API breaking) should require that much new coding, as it is mostly what things are named, and what type of signals they return.
@magnunor Sounds good! I think that the 1.0.0 release should be more of a "reorganization" than introducing new code with the new code coming after the 1.0.0 release. Of course I probably need to be better at following my own advice :)
If you make the issue then I can help with coding it as well.
@CSSFrancis, I made an Issue about this here: https://github.com/pyxem/pyxem/issues/1039
Another thing which would be nice is to remove some of the duplicate functionality from the pixstem/pyxem merger: https://github.com/pyxem/pyxem/issues/758#issuecomment-2068147741
With the rather large amount of API-breaking changes I'm doing, would it be an idea to make a new branch for the 1.0.0
release? Or whichever release (0.19.0
?) will contain the API-breaking changes?
Thinking about what we can expect to do before the HyperSpy Diamond workshop (middle of may), I think it it is a bit risky rushing a 1.0.0
release. Simply due to the 1.0.0
signifying that the API should not be breaking within the 1.x.x
.
Thus, I suggest to release a pre-1.0.0
release, maybe 0.19.0
, which contains all of the changes we want to get into the 1.0.0
release. And aim to get the 1.0.0
release out before the workshop in Trondheim (middle of June). This gives us some more time to sure the API works nicely, and we'll have a bit more eyes on the functionality, as people will be updating their notebooks for that workshop.
@magnunor That was my plan! 0.18.0 is kind of the transition. The idea is that the new and the older ways of doing stuff should be there to help make that smoother. We can probably make a 0.18.0 release relatively soon, just need to figure out if we can get https://github.com/pyxem/diffsims/pull/205 and a new version of diffsims out.