xraylarch
Jupyter example how to export groups to an Athena project and a Larix file
For exchanging data with users still working with Athena/Artemis, it would be nice to show how to export groups to an Athena project file programmatically.
Furthermore, we should also show how to create a Larix session and save data in it without the GUI.
For information, to export groups to Athena from Larix:
@newville I realize that I am not able by myself to programmatically (pure Python):
- create a Larch session from scratch
- add groups to the session
- save the session to a .larix file
Please, could you give here a minimal working example?
(this may be related to #411)
@maurov Yes, I'll include this with #411.
@newville I have seen 430603de3f2a72ba62b6f07690557239bf725cb8, thanks for including this example, it is great, but what I was looking for is an example of how to initialize a Larix session programmatically without the GUI, add groups to it, and then save it to a .larix file. We need this because we would like to convert our raw data from the beamline to a Larix project file (currently we use the Athena project file as the exchange format for Larix, but I would prefer to use the Larix format).
@maurov Yep, I understand, I just have not gotten that done yet. But also, if you look at that example and the save_session code at https://github.com/xraypy/xraylarch/blob/master/larch/io/save_restore.py#L69, it is probably not that hard, though maybe we want to break that function apart.
Making a session by hand would "just be" creating a "config" section, a "command history" section (that could be empty), and then a "symbol table": a Group of datasets and Groups, including the important _xasgroups group that Larix uses to map "displayed file name" to "group name". And then using the "encode4js" function as in save_session. Again, we could think about breaking that up so it did not assume a Larch session. For example, currently "Session" is just a namedtuple, but it could be turned into a class with load/save methods.
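As a rough illustration of the layout described above, here is a sketch that uses plain Python dictionaries in place of Larch objects. The build_session_layout helper, the section keys, and the example names are assumptions made for illustration only. The real file is written through encode4js in save_session, so this sketch does not produce a valid .larix file.

```python
def build_session_layout(datasets):
    """Arrange datasets into the three sections named in the thread.

    `datasets` maps a displayed file name to a (group_name, data) pair.
    This helper is hypothetical and is not part of the Larch API.
    """
    symbols = {"_xasgroups": {}}
    for display_name, (group_name, data) in datasets.items():
        # Group names are expected to be valid Python variable names.
        if not group_name.isidentifier():
            raise ValueError(f"invalid group name: {group_name!r}")
        symbols[group_name] = data
        # _xasgroups maps the displayed file name to the group name.
        symbols["_xasgroups"][display_name] = group_name
    return {
        "config": {},            # session configuration (left empty here)
        "command_history": [],   # may be empty
        "symbols": symbols,      # the "symbol table"
    }


layout = build_session_layout(
    {"Fe foil scan 1.dat": ("fe_foil_1", {"energy": [7100.0, 7110.0],
                                          "mu": [0.1, 0.9]})}
)
```

The point of the sketch is only the mapping direction: the human-readable name is the key in _xasgroups, and the Python-safe group name is the value.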
For Larix to be able to work with a Group, it is probably important to check that it has arrays called "xdat", "ydat", "energy" and "mu". It might also assume some other data that would normally be generated with the "install group" method...
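A check along these lines could look like the sketch below. It uses types.SimpleNamespace as a stand-in for a Larch Group so that it runs without Larch installed; the missing_arrays helper is hypothetical, and the list of required names is taken only from the comment above, so Larix may well expect more.

```python
from types import SimpleNamespace

# Array names mentioned in the thread; Larix may expect additional data.
REQUIRED_ARRAYS = ("xdat", "ydat", "energy", "mu")


def missing_arrays(group, required=REQUIRED_ARRAYS):
    """Return the names from `required` that are absent or None on `group`."""
    return [name for name in required if getattr(group, name, None) is None]


# Stand-in for a Group that only has energy and mu set.
g = SimpleNamespace(energy=[7100.0, 7110.0], mu=[0.1, 0.9])
missing = missing_arrays(g)

# Fill the generic x/y arrays from the XAS-specific ones.
if "xdat" in missing:
    g.xdat = g.energy
if "ydat" in missing:
    g.ydat = g.mu
```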
I may have time to work on this today, but I'm not certain.
@newville thanks for explaining it. This is not urgent, so I propose to postpone this to later release.
@newville I am having a hard time correctly reading Athena project files (in Larix or via read_athena) that I have created programmatically with Python. Sometimes (I do not know how to reproduce this!) the names of the groups read from the Athena project file appear as a hash key (5 random lower-case letters).
Here a minimal/conceptual example to show what I am doing:
    # to write
    from larch import Group
    from larch.io import AthenaProject, read_athena

    apj = AthenaProject(fname_out)
    for something in my_list_of_data:
        gname = "my_group_name_that_can_be_a_long_string_but_UNIQUE_123"
        g = Group(id=gname,
                  name=gname,
                  groupname=gname,
                  filename=gname,
                  xdat=my_energy_array,
                  ydat=my_mu_array,
                  energy=my_energy_array,
                  mu=my_mu_array,
                  datatype='xas')
        apj.add_group(g)
    apj.save()

    # to read back
    prj = read_athena(fname_out)
    print(prj.groups.keys())
    # sometimes like: dict_keys(['fhigt', 'quxkx', 'gamrg', 'rbqec', 'upiyt'])
Do you have a recommendation how to correctly write groups to an Athena project file programmatically in order that they will be correctly read back by Larix, avoiding the hash key names?
I think it would be much easier if a group could have only a name (unique identifier, e.g. the hash key) and a label for the human that is shown in the group list.
@maurov Hmm, I'm not 100% certain.
A five-letter random string is assigned to each dataset in an Athena Project. That hash key is how Athena keys the data, so that's needed to have Athena reliably read these files.
Athena also keeps a long dictionary of attributes. One of these is called "label".
But here is the basic route:
On saving a Larch group to an Athena file, if the group has a filename attribute, that is used for the Athena attribute 'label'. Whereas the 5-letter key and the "groupname" are required to be valid Python variable names, this filename / label is not. For better or worse, "filename" is used throughout Larix as the label to use for a group.
On reading the Athena project, if there is a label in the attributes, that will be used as the "filename" and group name.
Well maybe "should" if not "will" ;)
The intention is to set group.filename.
If the group does not have a filename (or maybe it is blank), the 5-letter hash will be used...
But something like:
    for i, dat in enumerate(data_list):
        g = Group(filename=f'fdataset_{i}', energy=dat.energy, mu=dat.mu, datatype='xas')
        apj.add_group(g)
should work.
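The naming rule described above can be summarized in a small sketch. This is not Larch's implementation: label_on_write and filename_on_read are hypothetical helpers, SimpleNamespace stands in for a Larch Group, and the fallback behaviour follows the "should, if not will" description in this thread rather than the actual code.

```python
from types import SimpleNamespace


def label_on_write(group, hashkey):
    """Label stored in the Athena attributes: the filename if set, else the hash key."""
    filename = getattr(group, "filename", None)
    return filename if filename else hashkey


def filename_on_read(attrs, hashkey):
    """Filename restored on reading: the label if present, else the hash key."""
    label = attrs.get("label")
    return label if label else hashkey


named = SimpleNamespace(filename="Fe foil, scan 1")
unnamed = SimpleNamespace()

# The label may be any string; the hash key must be a valid Python name.
label_named = label_on_write(named, "fhigt")
label_unnamed = label_on_write(unnamed, "quxkx")
```

Under this reading, a group written without a filename would come back named by its 5-letter key, which matches the symptom reported earlier in the thread.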
@newville thanks for the detailed explanation! Now it is much clearer to me. I will use filename only.