xraylarch
Larch Project File in HDF5
@newville let's discuss here this development.
It should be possible to save and load all the analysis components (pre-edge, PCA, Feff fit, XRF, etc.), including fitting models and results history.
HDF5 is preferable (and/or Zarr, for XRF maps too)
From #366:
> At the moment, I'm slowly making progress on saving a "project/session file", including Feff fits, pre-edge peak fitting models, etc. At the moment, this is a custom format using JSON encoding which handles complex nested objects well. There is still some work to go, but it's looking promising. I don't yet have XAS Viewer reading in these projects and fully populating the various displays - that shouldn't be too hard, but might take some time. Getting to the point where we can use HDF5 would be more work, and may not happen immediately....
JSON encoding is a good starting point.
I had a look at how best to store/load everything done in a Larch session to/from disk. My main concern is that we currently do not have an object which is a "root container for all data, parameters and settings" and is independent of the GUI. My understanding is that the GUI has the `controller`, which in turn stores the `symboltable` from the Larch interpreter. But it is rather cryptic to me how to find where all the data/parameters are kept for each group.
What about creating a "container" object independent of the GUI which holds:
- the Larch interpreter and main session
- list of data groups
- common parameters (e.g. store/handle the selection of groups, journal, log of commands)
- methods to get data/parameters from selected groups (stored per group)
- the logic to save/load everything done during a Larch session
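
To make the proposal above concrete, here is a minimal sketch of such a container in plain Python. Everything in it is hypothetical: `SessionContainer`, its attributes and its methods are illustrative names, not existing Larch API, and the Larch interpreter handle from the first bullet is left out so that the example stays self-contained.

```python
import json
from dataclasses import dataclass, field


@dataclass
class SessionContainer:
    """Hypothetical GUI-independent container; not part of Larch."""
    groups: dict = field(default_factory=dict)    # group name -> dict of data/parameters
    selected: list = field(default_factory=list)  # names of the currently selected groups
    journal: list = field(default_factory=list)   # log of commands

    def add_group(self, name, **params):
        """Register a data group and record the action in the journal."""
        self.groups[name] = dict(params)
        self.journal.append(f"add_group {name}")

    def get_selected(self, key):
        """Return {group name: value} for `key` across the selected groups."""
        return {n: self.groups[n].get(key)
                for n in self.selected if n in self.groups}

    def save(self, path):
        """Write the whole session state to a JSON file."""
        state = {"groups": self.groups,
                 "selected": self.selected,
                 "journal": self.journal}
        with open(path, "w") as fh:
            json.dump(state, fh, indent=2)

    @classmethod
    def load(cls, path):
        """Rebuild a container from a file written by `save`."""
        with open(path) as fh:
            return cls(**json.load(fh))
```

A GUI would then hold a reference to one such object and only read from or write to it, which is roughly the "model" part of a model-view-controller split.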
I do not know if all this makes sense. I am trying to disentangle the current design into a model-view-controller pattern, but I may be wrong.
@newville any ideas, comments, thoughts to better understand how to disentangle the GUI part and the Larch interpreter from the data model are welcome.
@maurov
> JSON encoding is a good starting point.
Yeah, I sort of think converting the deeply nested objects and dicts to HDF5 will take some effort.
> I had a look at how best to store/load everything done in a Larch session to/from disk. My main concern is that we currently do not have an object which is a "root container for all data, parameters and settings" and is independent of the GUI. My understanding is that the GUI has the `controller`, which in turn stores the `symboltable` from the Larch interpreter. But it is rather cryptic to me how to find where all the data/parameters are kept for each group.
Well, the interpreter always has a symbol table that is exactly a Group, called `_main`. That is the analog of HDF5's `root` or `/`. Currently "save session" saves the "non-builtin" groups in that `_main` group. As a kind of cheat, the `_sys` group will have run-time info (config settings) that will not be saved into the "session file".
Currently (work in progress), "reading" a session file returns all the saved data. OTOH, "Loading" a session file could just overwrite any existing session variables. My intention is to have XAS Viewer be more cautious, merging some values so as to "add data" and not "replace data", especially the list of data groups in `_xasgroups`. But it does mean all the saved data can be read into Python without automatically installing into the top-level Larch namespace.
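
One possible reading of the "add data, not replace data" merge is sketched below with plain dicts. It assumes the session has already been read into a dict and that `_xasgroups` can be treated as a list of group names; both are simplifications for the example, and `merge_session` is a hypothetical helper, not a Larch function. The conflict policy (keep the existing entry when a name is already present) is just one choice.

```python
def merge_session(current, saved, list_key="_xasgroups"):
    """Return a merged copy of `current` with entries from `saved` added.

    Existing entries win on a name clash; the list stored under
    `list_key` is extended with names it does not already contain.
    """
    merged = dict(current)  # shallow copy, so `current` itself is not modified
    for name, value in saved.items():
        if name != list_key and name not in merged:
            merged[name] = value

    # Extend the list of data groups without duplicating names.
    names = list(current.get(list_key, []))
    names += [n for n in saved.get(list_key, []) if n not in names]
    merged[list_key] = names
    return merged


current = {"_xasgroups": ["a"], "a": {"e0": 1.0}}
saved = {"_xasgroups": ["a", "b"], "a": {"e0": 2.0}, "b": {"e0": 3.0}}
merged = merge_session(current, saved)
```

After the call, `merged` lists both `a` and `b`, keeps the in-memory version of `a`, and gains `b` from the file.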
There is a pile of top-level objects for data fitting that are really sort of "current working set of data", and all of the results and fit histories go into the "data groups". So, I think it is reasonable to overwrite these "current working objects" as long as the history and results are preserved (and presented when asked).
Hope that helps ... but yeah this needs some careful documentation.