How to trigger an update of the GSAS-II binaries?

Open briantoby opened this issue 1 year ago • 2 comments

The GSAS-II binaries are installed when GSAS-II is initially installed. If the binaries are later updated, the new versions are not installed. There needs to be some way to trigger an update of the binaries, either automatically or manually.

Seems that this would be best incorporated into the git update process, but how to know that the binary files have been updated?

briantoby avatar Jul 03 '24 02:07 briantoby

I don't think it is good practice for a piece of software to auto-update itself. It goes against the FAIR principles, i.e. one should be able to reproduce old results ... and developers should be able to reproduce bugs (to fix them).

The Python environment has become much cleaner, and deploying extensions is no longer as hard as it used to be.

kif avatar Aug 23 '24 13:08 kif

All GSAS-II updates of the Python code are done only at the request of the user; never automatically. Also, we have always offered the capability to users via the GUI to regress to older versions of the software, to confirm that results are unchanged by version. I'm proud of this.

The focus of this issue has to do with the small amount of non-Python code that most people will install from binary files that we supply (source code is always distributed, of course). This code does not change very much, but the compiled images used to be part of the svn repo and thus were synced to the versioning. They are now placed in a release area on GitHub and for most users will be installed only once, when GSAS-II is initially downloaded. There is no mechanism to trigger updating of these routines (or regression to older versions).

The issue here is a desire to sync the binaries to the user-selected version of the Python and source code files. The intent here is that when we add a new capability (or bug fix) via a change in non-Python code, we want to get that out. An example would be that we recently added a capability for searching for magnetic k-vectors that can be greatly sped up via a Cython routine, but just doing a normal git update (from command line or GUI) will only get the source code for that. At present, for most users, to get the compiled version installed, one must do a fresh install.

briantoby avatar Aug 23 '24 16:08 briantoby

This needs to be considered as part of the binary build process, which is addressed in PR https://github.com/AdvancedPhotonSource/GSAS-II/pull/40, which is waiting for input from @tacaswell

briantoby avatar Oct 31 '24 03:10 briantoby

I agree with @kif that integrating self-updating into the main source code is a hugely complex problem that is already being solved by projects whose goal is to do package management. I propose thinking of this in three layers:

  • the gsas-ii source repository that has all the source and build system needed to go from source -> a fully functioning gsas-ii assuming that the dependencies and compilers are provided
  • versioned conda-packages from conda-forge that are generated from the previous step
  • helper tools (maybe just pixi!) that use the conda-packages (or maybe the git repo) to build from source whatever version people want

I think this will give both the workflow that gsas-ii currently has and a good fit with the standard workflow for Python packages. A conda package that is a stub that bootstraps another installation into the conda environment it is in is clever and a nice bit of technical work, but it is very counter to expectations about how conda packages work. For example, people expect to be able to ask conda what version of something is installed, and the gsas-ii version would then be hidden from conda. Another big upside of getting gsas-ii packaged on conda-forge is that as new Pythons come out (or the Fortran compiler is updated) things get mostly just taken care of!

tacaswell avatar Jan 01 '25 04:01 tacaswell

I agree that a more standard approach to packaging would make this issue moot. One would install GSAS-II's Python & binary files with appropriate versions of Python and the packages that need to be version-specific (mostly numpy, though occasionally MPL throws us a curve). The reason that we provide an alternate mechanism for installation is that:

  1. Most of our users are not all that savvy when it comes to software development and they need a "Ph.D." (push here dummy) approach to getting the software onto their computer.
  2. We have always tried to make the latest improvements to GSAS-II available to users and with that, alas, comes the latest crop of bugs. We want to make bug fixes available immediately and with minimal needs for download bandwidth (we have lots of users in places where speedy/cheap internet access is not to be taken for granted). Yes, git is not intended for use the way we use it, but it works really well for this.

As long as we are supporting a mode where people can use git to get the latest Python-code updates, how can the code determine that the binaries also need an update? I think the solution for this is that one of the Fortran routines (or perhaps the one C routine) should have an option to return the version number of the code at compilation time. Then it becomes relatively easy for the Python code to recognize that newer binaries are needed.
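A minimal sketch of the check described above, under stated assumptions: `get_binary_version` is a hypothetical stand-in for a compiled routine that reports the version it was built from, and the dotted version format is assumed for illustration only. Nothing here reflects how GSAS-II actually implements this.

```python
# Sketch only. `get_binary_version` is a hypothetical callable standing in for
# a Fortran/C routine that returns the code version at compilation time.
# The dotted "major.minor.patch" string format is an assumption.

def parse_version(text):
    """Turn a dotted version string such as '5.1.3' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def binaries_need_update(get_binary_version, minimum_required):
    """Return True if the compiled code is missing or older than required."""
    try:
        installed = parse_version(get_binary_version())
    except (ImportError, AttributeError, ValueError):
        # No binaries, no version routine, or an unparsable answer:
        # all of these are treated as "an update is needed".
        return True
    # Tuples compare element by element, so (5, 1, 0) < (5, 2, 0).
    return installed < parse_version(minimum_required)
```

The Python code would carry the minimum binary version it needs, so a git update that brings in new Python code could flag stale binaries without any change to the git workflow itself.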

Perhaps someday we can move to a normal development/release cycle and get away from our git-based update process (except for developers who want the latest and buggiest), but even then our average user is still unlikely to have compilers installed, but it will be nice to have a streamlined installation/build process that can be automated.

briantoby avatar Jan 02 '25 00:01 briantoby

Hi Brian,

I agree with you about the decrease in the average level of computing knowledge among our users, but on the other hand the tools used to distribute code have made deployment so much simpler that it does not matter anymore. Conda comes with compilers packaged. Just declare the C/C++/Fortran compilers as build-time dependencies and they will be installed automatically, regardless of the platform used by your users.

> As long as we are supporting a mode where people can use git to get the latest Python-code updates, how can the code determine that the binaries also need an update? I think the solution for this is that one of the Fortran routines (or perhaps the one C routine) should have an option to return the version number of the code at compilation time. Then it becomes relatively easy for the Python code to recognize that newer binaries are needed.

It is good practice to have a date field or similar in each of the extensions, to be able to know when the last modification was made. This mechanism should be independent of git and of the filesystem file-access time, but your IDE should be able to update the field automatically upon saving.

> Perhaps someday we can move to a normal development/release cycle and get away from our git-based update process (except for developers who want the latest and buggiest), but even then our average user is still unlikely to have compilers installed, but it will be nice to have a streamlined installation/build process that can be automated.

As Thomas mentioned, compilers are shipped as conda packages on all platforms (gcc at least). This is no longer an issue for development or for advanced users. For distribution, it is still advised to use Apple's clang and MSVC on Windows since the binaries will require fewer gcc-linked libraries, but this is a burden only for the packager, not for the end user.

Happy new year.

Jerome

kif avatar Jan 02 '25 07:01 kif

The sources/meson.build now runs tagbinaries.py, which puts an ASCII file with version information into the directory with the GSAS-II binaries. GSAS-II looks this file up when the binaries are located. At present no check is made against that information, but if at some point in the future we need to check whether the binaries are new enough, it can be done.
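If such a check is ever wanted, it might look roughly like the sketch below. The file name `binary_version.txt` and the assumption that the file begins with a single integer version are both hypothetical; the actual name and layout of the file written by tagbinaries.py are not given in this thread.

```python
from pathlib import Path

# Hypothetical file name; the real file written by tagbinaries.py may differ.
VERSION_FILE = "binary_version.txt"


def read_binary_version(binary_dir):
    """Return the integer version recorded next to the binaries, or None.

    Assumes (for illustration only) that the first whitespace-separated
    token in the ASCII file is an integer version number.
    """
    path = Path(binary_dir) / VERSION_FILE
    try:
        return int(path.read_text(encoding="ascii").split()[0])
    except (OSError, IndexError, ValueError):
        # Missing file, empty file, or non-integer/non-ASCII content.
        return None


def binaries_new_enough(binary_dir, minimum_version):
    """True only if a version was found and it meets the minimum."""
    found = read_binary_version(binary_dir)
    return found is not None and found >= minimum_version
```

Returning `None` for a missing or unreadable file lets the caller distinguish "old binaries without a tag" from "tagged but too old", should that distinction matter later.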

briantoby avatar Apr 30 '25 04:04 briantoby