sherpa icon indicating copy to clipboard operation
sherpa copied to clipboard

Automatically split a model expression into additive components and plot the results

Open DougBurke opened this issue 2 years ago • 19 comments

Summary

Add plot_source_components and plot_model_components calls that will split up a model like gal * (pl + line) into two lines: one for gal * pl and one for gal * line. There are also corresponding get_source_components_plot and get_model_components_plot calls, and the "source_components" and "model_components" arguments can be used with the plot call.

Details

This is a simplified #1022 (which used to be larger but had been split up).

The aim is to let users write models like gal * (pl + line) but to be able to recognize this as two terms, gal * pl and gal * line. We can then plot these component terms separately.

After adding some tests and minor code-level tweaks, there's two basic parts

  1. the sherpa.models.model.model_deconstruct routine which does the model split
  2. the plot code

Whilst model_deconstruct should just be simple grade school PEMDAS/BODMAS/... work, it's a little tricky to do effectively. I have chosen to implement this relatively simply, at the expense of not catching all possible situations. For example

  • this explicitly only looks for numpy operations (add, multiply, divide, subtract) and does not worry about things like exponentiation, reminder, or "integer divide"
  • we are not trying to simplify "a / a - b" to "1 - b", but to two terms "a / a" and "- b" (this follows the general approach of #1972 that we want to try and replicate the original expression as much as possible, but in this case it's obviously not as clear cut).
  • there are some oddities, since "-(a + b * c + d)" will not get changed but "x - (a + b * c + d)" will get expanded to "x", "-(a)", "-(b * c)", and "-(d)", but I do not expect users to be writing such models (e.g. use a unary operator on a model expression) so I would rather not try to address this issue (it is documented and tested).

In fact, we don't actually want to simplify the terms, since we still want to reflect the actual computation (so if a user says "a - a" then we don't want to just say "0" because the code will compute the model a and we don't want to hide this.

Since this is a recursive algorithm I add explicit checks for a recursion error - if found we just stop simplifying - even though it is hard to imagine most users of this functionality are likely to care. I do have tests for this (but since this depends on Python then we can never guarantee the recursion limit is not going to change, so at some point we might want to tweak this, but at present it seems like 1000 iterations is going to trigger the recursion error).

I have simplified the plot changes here - instead those changes have become

  • #1973
  • #1977
  • #1984
  • #1988

Here we now start of with returning a list of plots and then adding a "MultiPlot" class which handles this case, since the list-of-plot case doesn't handle the plot title well. Of course, this means that I've added a new method to the plot backend: set_title. I needed this to be able to over-ride the titles from the individual plot objects, and there didn't' seem to be a way to do this with the existing code. I think it makes sense, but comments are welcome. I also haven't tried the bokeh backend to see if my implementation actually works there.

I thought about how we could use the MultiPlot code in the existing code base. The two obvious choices are

  • for fit plots (i.e. rather than have explicit dataplot and modelplot components we could just send in the plots)
    • unfortunately the user experience would be significantly different (although I guess we could add a subclass to try and paper over the differences; I just don't see it as worth it)
  • the OrderPlot class
    • there's enough I don't understand about this class that I would only want to work on this in a later PR

However, nothing that I feel can or should be done in this PR.

Notes

The plot_model_components call (and the source one) is a great example where the label concept would be useful - e.g.

  • #938

It's not obvious to me if there's other places than plot_xxx_components where it would be useful to be able to separate models. Flux calculations comes to mind. Anything? How do we add this capability to such a piece of the API?

I have tried several different ways of doing this over the years, and this is by-far the most successful, as well as being surprisingly self-contained. For fun I just tried an approach where we track the additive components as the model expression is built up (i.e. within BinaryOpModel.__init__ we can replicate much of the logic of model_deconstruct) but

  • as suggested, it doesn't really make the code simpler than the logic being in model_deconstruct
  • why should every user pay for this feature compared to the current approach of only those wanting it have to spend that extra time
  • perhaps most importantly, doing it during model creation leads to recursion errors and it just doesn't seem to be worth the effort to avoid this (and the issue of recursion errors is where I've come stuck in previous attempts).

Examples

These are from an older version of the code, but they should still behave the same, except that as this is now based on #1972 the model expressions will have a lot-less unnecessary brackets.

xsphabs * (powlaw1d + gauss1d)

Here the parameter values are chosen to not be crazy, given the data, but it's not a good fit to the data so they are just for show

>>> load_pha("3c273.pi")
>>> notice(0.5, 6)
>>> set_source(xsphabs.gal * (powlaw1d.pl + gauss1d.g1)
>>> gal.nh = 0.1
>>> pl.gamma = 1.7
>>> pl.ampl = 2e-4
>>> g1.pos = 4
>>> g1.fwhm = 0.5
>>> g1.ampl = 5e-5

With this we have

>>> plot_data(ylog=True)
>>> plot_model_components(overplot=True, alpha=0.7)
>>> plt.ylim(1e-5, 4e-2)

sherpa1

and

>>> plot_model()
>>> plot_model_components(overplot=True, alpha=0.7)

sherpa2

If we do plot_model_components first we can see that the title isn't great (it is for the first component, and is something to be fixed). We can see how the components compare to the unabsorbed powerlaw with:

>>> plot_model_components(color='black', alpha=0.5)
>>> plot_model_component(pl, alpha=0.5, overplot=True)

sherpa3

IACHEC style model

In [27]: get_source()
Out[27]: <BinaryOpModel model instance '((xsconstant.m1 * xstbabs.m2) * ((((((((((((((((((((xsbremss.m3 + xsbremss.m4) + xsbremss.m5) + xsbremss.m6) + xsgaussian.m7) + xsgaussian.m8) + xsgaussian.m9) + xsgaussian.m10) + xsgaussian.m11) + xsgaussian.m12) + xsgaussian.m13) + xsgaussian.m14) + xsgaussian.m15) + xsgaussian.m16) + xsgaussian.m17) + xsgaussian.m18) + xsgaussian.m19) + xsgaussian.m20) + xsgaussian.m21) + xsgaussian.m22) + xsgaussian.m23))'>

Here we get

plot_model(ylog=True)
plot_model_components(alpha=0.2, overplot=True)
plt.xlim(0, 12)
plt.ylim(1e-4, 1e-2)

which produces

sherpa4

and we can check the terms with

In [28]: from sherpa.models.model import model_deconstruct

In [29]: src = get_source()

In [30]: for i, term in enumerate(model_deconstruct(src), 1):
    ...:     print(f"{i:02d} - {term.name}")
    ...: 
01 - ((xsconstant.m1 * xstbabs.m2) * xsbremss.m3)
02 - ((xsconstant.m1 * xstbabs.m2) * xsbremss.m4)
03 - ((xsconstant.m1 * xstbabs.m2) * xsbremss.m5)
04 - ((xsconstant.m1 * xstbabs.m2) * xsbremss.m6)
05 - ((xsconstant.m1 * xstbabs.m2) * xsgaussian.m7)
06 - ((xsconstant.m1 * xstbabs.m2) * xsgaussian.m8)
07 - ((xsconstant.m1 * xstbabs.m2) * xsgaussian.m9)
08 - ((xsconstant.m1 * xstbabs.m2) * xsgaussian.m10)
09 - ((xsconstant.m1 * xstbabs.m2) * xsgaussian.m11)
10 - ((xsconstant.m1 * xstbabs.m2) * xsgaussian.m12)
11 - ((xsconstant.m1 * xstbabs.m2) * xsgaussian.m13)
12 - ((xsconstant.m1 * xstbabs.m2) * xsgaussian.m14)
13 - ((xsconstant.m1 * xstbabs.m2) * xsgaussian.m15)
14 - ((xsconstant.m1 * xstbabs.m2) * xsgaussian.m16)
15 - ((xsconstant.m1 * xstbabs.m2) * xsgaussian.m17)
16 - ((xsconstant.m1 * xstbabs.m2) * xsgaussian.m18)
17 - ((xsconstant.m1 * xstbabs.m2) * xsgaussian.m19)
18 - ((xsconstant.m1 * xstbabs.m2) * xsgaussian.m20)
19 - ((xsconstant.m1 * xstbabs.m2) * xsgaussian.m21)
20 - ((xsconstant.m1 * xstbabs.m2) * xsgaussian.m22)
21 - ((xsconstant.m1 * xstbabs.m2) * xsgaussian.m23)

DougBurke avatar Jan 21 '23 03:01 DougBurke

Codecov Report

Attention: Patch coverage is 97.10145% with 2 lines in your changes are missing coverage. Please review.

Project coverage is 83.17%. Comparing base (a47ff14) to head (37a4a12).

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #1679      +/-   ##
==========================================
+ Coverage   83.09%   83.17%   +0.07%     
==========================================
  Files          81       81              
  Lines       28191    28258      +67     
  Branches     4343     4355      +12     
==========================================
+ Hits        23425    23503      +78     
+ Misses       4656     4645      -11     
  Partials      110      110              
Files Coverage Δ
sherpa/models/model.py 94.45% <100.00%> (+0.19%) :arrow_up:
sherpa/plot/__init__.py 94.01% <100.00%> (+0.06%) :arrow_up:
sherpa/plot/pylab_backend.py 90.66% <100.00%> (+0.08%) :arrow_up:
sherpa/ui/utils.py 97.23% <100.00%> (+0.22%) :arrow_up:
sherpa/plot/backends.py 96.26% <50.00%> (-0.39%) :arrow_down:
sherpa/plot/bokeh_backend.py 92.50% <80.00%> (-0.27%) :arrow_down:

... and 1 file with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update a47ff14...37a4a12. Read the comment docs.

codecov[bot] avatar Jan 21 '23 04:01 codecov[bot]

@anetasie - this adds plot_model_components and plot_source_components calls that will automatically "break apart" a model. So that

set_source(xsphabs.gal * (powlaw1d.pl + xsgausian.line))
...
plot_data(ylog=True)
plot_model(overplot=True)
plot_model_components(overplot=True)

will be like you had manually said

plot_model_component(gal * pl, overplot=True)
plot_model_component(gal * line, overplot=True)

There's some engineering work to be done, but is this useful as is (or missing stuff or doesn't return useful results)?

DougBurke avatar Jan 23 '23 15:01 DougBurke

@DougBurke Thanks. This looks good. I just tried a few plots with the absorption in the model expression. Is this going to be applied in the case of more complex models, with other multiplicative factors such as constant, as for example in Stephanie's paper. What would be a result of 'plot_model_components'?

model = const1 × phabs×
(zpow1 × etable{mytorus Ezero v00.fits}+
const2 × atable{mytorus scatteredH200 v00.fits}+ const3 ×
atable{mytl V000010nEp000H200 v00.fits})

anetasie avatar Jan 25 '23 00:01 anetasie

The source model was created with

set_source(1, m1g1 * m2g1 * (m3g1 * m4g1 + m5g1 * m6g1 + m7g1 * m8g1 + m9g1 * m10g1))

and you get

In [15]: from sherpa.models.model import model_deconstruct

In [16]: mdl = get_source(1)

In [17]: mdl.name
Out[17]: '((xsconstant.m1g1 * xsphabs.m2g1) * ((((xszpowerlw.m3g1 * xstablemodel.m4g1) + (xsconstant.m5g1 * xstablemodel.m6g1)) + (xsconstant.m7g1 * xstablemodel.m8g1)) + (xsconstant.m9g1 * xszpowerlw.m10g1)))'

In [18]: for i, term in enumerate(model_deconstruct(mdl), 1):
    ...:     print(f"{i} - {term.name}")
    ...: 
1 - ((xsconstant.m1g1 * xsphabs.m2g1) * (xszpowerlw.m3g1 * xstablemodel.m4g1))
2 - ((xsconstant.m1g1 * xsphabs.m2g1) * (xsconstant.m5g1 * xstablemodel.m6g1))
3 - ((xsconstant.m1g1 * xsphabs.m2g1) * (xsconstant.m7g1 * xstablemodel.m8g1))
4 - ((xsconstant.m1g1 * xsphabs.m2g1) * (xsconstant.m9g1 * xszpowerlw.m10g1))

If we

plot_fit(1, xlog=True, ylog=True)
plot_model_components(1, overplot=True, alpha=0.5)

then we get (after messing with the ylim to avoid values ~ 1e-10)

sherpa1

For a NuSTAR dataset we get

sherpa2

As I mentioned above, one of the issues is identifying the different lines - ie which line corresponds to which table model - but a user can use the plot_model_component call to try and find out.

DougBurke avatar Jan 25 '23 14:01 DougBurke

This looks great!

anetasie avatar Jan 25 '23 15:01 anetasie

I had @anetasie to review this for the "needs SDS testing" tag. As this has been done, I'm removing that tag and also Aneta as the reviewer. I have to look at some of the implementation issues now, so will leave as a draft.

DougBurke avatar Jan 27 '23 14:01 DougBurke

@DougBurke OK

anetasie avatar Jan 27 '23 22:01 anetasie

@DougBurke Did you plan to include this in the 4.16 release?

anetasie avatar Sep 27 '23 18:09 anetasie

No. I like the behavior but I am not-at-all happy with the code I wrote to get this to work, and it would just be a potential minefield. I had hoped to get an "outside consultant" to go through the logic with me but that fell through. It would require significant testing at the very least, which would have had to be several months ago.

There are a lot of tests involved in this PR but they were written be me and so may be missing not-obvious-to-me problem cases.

DougBurke avatar Sep 27 '23 19:09 DougBurke

Rebased to address minor conflict. I took the time to clean up the commits and tweak how the plot_xxx_components call works, and so "fixing" the plot title (but it is rather anodyne at the moment).

DougBurke avatar Feb 23 '24 22:02 DougBurke

Rebased as #1973 was merged and this changed the use of requires_pylab (and makes those tests actually run, which triggered more changes that needed changing).

It was this PR that led me to finding creating #1971 in the first place ...

DougBurke avatar Feb 29 '24 21:02 DougBurke

Rebasing onto #1972 since they interact (the text of the models changes thanks to the de-bracketification).

DougBurke avatar Mar 03 '24 04:03 DougBurke

Rebased because #1977 was merged

DougBurke avatar Mar 03 '24 16:03 DougBurke

Moving some code to

  • #1984
  • #1988

DougBurke avatar Mar 22 '24 11:03 DougBurke

Rebased because #1984 was merged. I've also removed those commits which now live in #1988

DougBurke avatar Mar 22 '24 13:03 DougBurke

Aha - looks like the bokeh change fails - plot_model_components() ends up raising

...
   3596 if not overplot:
-> 3597     backend.set_title(self.title)
   3599 overplot = True
   3600 clearwindow = False

File /lagado.real/sherpa/sherpa-main/sherpa/plot/bokeh_backend.py:783, in BokehBackend.set_title(self, title)
    773 def set_title(self, title: str) -> None:
    774     """Change the display title.
    775 
    776     Parameters
   (...)
    780 
    781     """
--> 783     self.current_fig.title = title

File /lagado2.real/local/anaconda/envs/sherpa-main/lib/python3.11/site-packages/bokeh/core/has_props.py:344, in HasProps.__setattr__(self, name, value)
    341 if isinstance(descriptor, property): # Python property
    342     return super().__setattr__(name, value)
--> 344 self._raise_attribute_error_with_matches(name, properties)

File /lagado2.real/local/anaconda/envs/sherpa-main/lib/python3.11/site-packages/bokeh/core/has_props.py:379, in HasProps._raise_attribute_error_with_matches(self, name, properties)
    376 if not matches:
    377     matches, text = sorted(properties), "possible"
--> 379 raise AttributeError(f"unexpected attribute {name!r} to {self.__class__.__name__}, {text} attributes are {nice_join(matches)}")

AttributeError: unexpected attribute 'title' to GridPlot, possible attributes are align, aspect_ratio, children, cols, context_menu, css_classes, css_variables, disabled, elements, flow_mode, height, height_policy, js_event_callbacks, js_property_callbacks, margin, max_height, max_width, min_height, min_width, name, resizable, rows, sizing_mode, spacing, styles, stylesheets, subscribed_events, syncable, tags, toolbar, toolbar_location, visible, width or width_policy

DougBurke avatar Mar 22 '24 18:03 DougBurke

That should be as easy as changing self.current_fig.title = title to self.current_axis.title = title if that's what we want. If there is only one plot, that will work. I have to think more about what to do in a multi panel plot. MAybe we should always assign the title to the top let plot. That would be

self.current_fig.children[0][0].title = title

hamogu avatar Mar 22 '24 18:03 hamogu

@hamogu - I'd based my code on the code before #1975 issue and #1976 fix.

DougBurke avatar Mar 22 '24 19:03 DougBurke

This should fix the bokeh issue, but there are questions about what we mean by "title".

DougBurke avatar Mar 22 '24 20:03 DougBurke

Rebased because it conflicted with #1976 - took the advantage to add a bokeh test (to go with the matplotlib test), although it is not very in-depth.

DougBurke avatar Mar 29 '24 13:03 DougBurke

Rebased to pick up changes in #1972 (doing in two commits just to check the individual commits).

DougBurke avatar Apr 02 '24 15:04 DougBurke

Rebased as #1972 has been merged.

DougBurke avatar Apr 04 '24 13:04 DougBurke

First let me say that is very useful and very welcome functionality. I've monkeyed way to do just that in essentially every paper I've ever done with Sherpa. In it's most basic form:

mymodel = xsphaba.abs1 * (xsvapec.v1 + xsvpac.v1)
set_source(mymodel)
fit()
temp = v1.norm.val
v1.norm.val =0
plot_fit()
v1.norm.val = temp
temp = v2.norm.val
v2.norm.val = 0
plot_fit(overplot=True)
v2.norm.val = temp

# and thus do matplotlib stuff to get the colors right

(Of course, I often would have more than 2 additive model components.)

I'm spelling this out here for two reasons:

  • This PR makes it so much better and it's such a common thing to do in preparing publications with Sherpa plots.
  • From the API point of view, I wonder if we need to do something to homogenize or control line colors better, for example plot the summed model in the same color as the data, pass default labels to the lines so one can just call plt.legend() to see what is what. I'm not saying that all this needs to be solved in this PR. I'm just saying it's something not to loose sight of for practical uses. Instead of in code that could also be addressed in e.g. a thread showing example on setting colors or using matplotlib to modify colors after the fact.

hamogu avatar Apr 07 '24 23:04 hamogu

So, riffing on the above comment https://github.com/sherpa/sherpa/pull/1679#issuecomment-2041645871

One thing we could try to identify (but not necessarily implement in this PR) would be what other plots/calls could take advantage of this feature. I can think of

  • residual style plots, where you might want to see the residuals per "component"
  • the various flux calculation codes (in particular the error analysis) where it might be interesting

but maybe there are more.

We may decide that the user would just have to do this manually for these cases (in particular, the flux case has a lot of fun thinking to do with error analysis related to what models are / are not included). For the residual case we don't actually make it easy to say "only use this set of models for the residual calculation", which could spawn a PR or two.

DougBurke avatar Apr 08 '24 14:04 DougBurke

I worry a bit about the complexity of our plotting system. Adding this plot here is fine, but I'm not sure we are in a state to re-wire it for other plots. We might be - I'm just saying that I'm personally not sure we are. The system we are using is somewhat complex to use requiring us to add a new calss and several new functions for every new plot type. In y plotting plans, I had ideas to simplify that, but I'm not sure any of those will pan out in practice.

It is true though that our syntax to combine plots in new ways complex compared to e.g. the syntax of holoviews where plots are overlayed with a simple "*" and put in separate panels with shared axes with "+". With a similar syntax, there would be no need for e.g. "plot_fit_resid" any more, but instead a "fit_plot + res_plot". That would reduce the number of plot types a user needs to know and remember because he/she could easily make up the others by combining exiting plots. But I digress, that's beyond this PR and beyond this discussion and, might in fact not be achievable given the range of plots that we support.

However, I do want to say that the "Multiplot" is what's needed for datastacks. The old (currently mostly broken) datastack plots made one panel per dataset, but often one might want all the datasets in the same plot so that they are easy to compare (e.g. how the break in a powerlaw moves with distance form the central source).

hamogu avatar Apr 08 '24 15:04 hamogu

Yay conflicts!

DougBurke avatar Apr 12 '24 15:04 DougBurke

Argh - stupid patch-resolution failure....

DougBurke avatar Apr 12 '24 15:04 DougBurke

I've merged in the current main branch to fix the conflict and to pick up #2000 - as well as to make it easier for reviewers - but I'd prefer to rebase things once @hamogu has checked my changes out as I'm always a bit wary the merge could have messed things up.

DougBurke avatar Apr 12 '24 16:04 DougBurke

All my concerns have been answered. Feel free to rebase any time you are ready.

hamogu avatar Apr 12 '24 17:04 hamogu

And I'll give it a final approval when the tests pass after that.

hamogu avatar Apr 12 '24 17:04 hamogu