gammapy icon indicating copy to clipboard operation
gammapy copied to clipboard

Problem of defining a SkyModel for simple 1D analysis from the high level interface.

Open JouvinLea opened this issue 4 years ago • 11 comments

Hi all, As I mentioned to Regis, I tried the high level interface for all of my Gammapy analysis. I am for the moment only doing 1D spectral analysis. When I try to adapt the notebook of doing a 3D analysis with the high level interface to perform a very basic 1D analysis, It was really strange for me to have to define a spatial component for the model.... I had to go in the test to understand that including for 1D analysis, we have to define a spatial component since in the high level interface it is using a SkyModel for the analysis.model.

Honestly, I don't think it is a good idea and I guess that a lot of standard user would find it also strange to define a spatial component for 1D analysis. Furthermore, when we have our fit performed and we want to inspect a bit the model from the analysis object, we have to write that to access the spectral part analysis.model.skymodels[0].spectral_model whereas for 1D fitting I am only interested in this component.

What do you think?

JouvinLea avatar Nov 01 '19 11:11 JouvinLea

There's pros and cons, but overall I think having a 3D model for 1D spectral analysis is a good thing. For Galactic sources it's very common that there's overlap with emission from other components, and that the source of interest is an extended source. As far as I can see, having a 3D model is the only sane way to keep track of contamination and containment fractions in the 1D region-based spectral analysis.

So basically, I think the power for analyses and the simplicity to have one standard way is the way to go, instead of to invent a special limited second way where only a spectral model is needed. Note that very soon we'll add time-dependent model component, which is then needed for AGN or binaries or pulsars. Again: the general model class will have that optional component anyways, and re-implementing time-dependence on just the spectral model objects will be painful for us, and possibly also more complex for users to learn a second way to compose models. And yet another thing: we'll want to account for exposure and background variations across an on region eventually, and for that a spatial model component is needed.

I think there's some things we can do to make this easier for users: Number 1 is documentation, just explain how it works and how to get the results for 1D analysis. Then maybe add some conveniences, such as e.g. we could consider to auto-add a point source model at the center of the region, if a users starts a 1D spectral analysis without a spatial model set.

Adding "high-priority" label, v0.15, and @adonath - assigning to you.

cdeil avatar Nov 03 '19 21:11 cdeil

Another comment about the high-level interface: I think the current Analysis class is mostly useful for data reduction, and our high-level interface for modeling & fitting is gammapy.modeling and datasets. I would invest effort to make that better, not to expose modeling & fitting via the YAML config file. Modeling & fitting is something best done iteratively, using a Python API, like in Fermipy. Note that also in HESS & HAP we just do data reduction via the config file. For modeling & fitting, you run a separate tool, such as FitSpectrum, where you try different models.

I'm not sure if this view what the high-level interface is is agreed or controversial, and as far as I know there's no open issues describing the remaining work to be done for the Analysis & Datasets & Fit classes. A good way to go might be if someone writes a short docs section outlining roughly where we're going in the next 2-3 weeks that we have left with the high-level interface. @adonath @Bultako - Thoughts?

cdeil avatar Nov 03 '19 21:11 cdeil

@cdeil I undertand your point but I am just saying that for the moment with MAGIC data we are far from a 3D analysis and a lot of people are only interested in 1D spectrum and LC. " This "Then maybe add some conveniences, such as e.g. we could consider to auto-add a point source model at the center of the region, if a users starts a 1D spectral analysis without a spatial model set." could be a good solution. I do think also that in the case you produce only a 1D spectrum, analysis.model should be directly a spectralModel not a SkyModel.

But maybe as you said if at least there is a clear notebook, it will be easier!!!

JouvinLea avatar Nov 05 '19 11:11 JouvinLea

A good way to go might be if someone writes a short docs section outlining roughly where we're going in the next 2-3 weeks that we have left with the high-level interface.

I've added issues and notes into the gammapy.scripts project.

Then maybe add some conveniences, such as e.g. we could consider to auto-add a point source model at the center of the region, if a users starts a 1D spectral analysis without a spatial model set.

I'm +1 to add this.

Bultako avatar Nov 05 '19 12:11 Bultako

I actually want to start working on this issue this week on a lower level. So the idea would be to make SpectrumDataset.model and SpectrumDatasetOnOff.model a SkyModels object as well. From our side there are a few advantages:

  • Uniform API between the datasets
  • Uniform model serialisation
  • You can fit the sum of multiple spectral models

The difference in API becomes:

spectral_model = PowerLawSpectralModel()
model = SkyModel(spectral_model, name="crab")
dataset = SpectrumDataset(model=[model])
dataset.model["crab"].spectral_model

Compared to what we have now:

spectral_model = PowerLawSpectralModel()
dataset = SpectrumDataset(model=spectral_model)
dataset.model

I think all in all it's not that bad. But I agree users have to get used to the way of accessing the spectral model via model["crab"].spectral_model instead of directly getting the spectral model. But the nice thing is, that it will be uniform for 1D and 3D.

The remaining question for me is, whether we should make the spatial model completely optional in the SkyModel or whether we just define a "pseudo" spatial model for spectral analysis. Making the spatial model completely optional introduces a lot of code like if spatial_model is None: in our SkyModel class (during evaluation, parameter concatenation, model serialisation in both directions), which I don't like that much.

Alternatively we could introduce a "pseudo "spatial model that does nothing and also does not have any parameters, in case e.g. you don't know anything (or don't care) about the morphology of the source. Basically the ConstantSpatialModel, but with no norm parameters either. Maybe a NoneSpatialModel() or NoSpatialModel() placeholder?

adonath avatar Nov 05 '19 13:11 adonath

So the YAML would look like:

components:
- name: crab
  type: SkyModel
  spatial:
    type: NoneSpatialModel
  spectral:
    type: PowerLawSpectralModel
    parameters:
    - name: amplitude
      value: 1.0e-12
      unit: cm-2 s-1 TeV-1
    - name: index
      value: 2.0
      unit: ''
    - name: reference
      value: 1.0
      unit: TeV
      frozen: true

Or we ignore it completely during I/O and the resulting YAML file looks like:

components:
- name: crab
  type: SkyModel
  spectral:
    type: PowerLawSpectralModel
    parameters:
    - name: amplitude
      value: 1.0e-12
      unit: cm-2 s-1 TeV-1
    - name: index
      value: 2.0
      unit: ''
    - name: reference
      value: 1.0
      unit: TeV
      frozen: true

adonath avatar Nov 05 '19 13:11 adonath

I agree with the proposal by @adonath, for the reasons he outlined already. There are many benefits to making the model for 1D analysis a 3D model, but in terms of analysis use cases, and in terms of simplicity and maintainability of Gammapy. Personally I did HGPS, and from that experience know that extended and overlapping sources are very common in the Galactic plane, and having a framework that supports contamination and containment corrections like what we get by putting 3D models would have been very valuable (see section 4.11.2 in HGPS paper, he had to hack the spectral analysis and apply all the right factors manually in all the right places, not a sustainable solution for 100s of CTA users IMO). For lightcurves it's the same: the Galactic binaries aren't isolated, they have underlying diffuse emission and nearby overlapping extended pulsar wind nebulae that need to be modeled to get a good binary measurement. As @adonath showed with code examples, the cost and extra complexity for a normal spectral analysis of an isolated source is very small.


I actually want to start working on this issue this week on a lower level. So the idea would be to make SpectrumDataset.model and SpectrumDatasetOnOff.model a SkyModels object as well.

Should be models (plural), no? See also #2102

NoneSpatialModel or None

+1 for using None, for the spatial and temporal model part.

At least I don't see any advantage to introducing a NoneSpatialModel or NoneTemporalModel. Is it really true that you can avoid the if model.spatial_model is None in the 2-3 places (e.g. evaluation) where it's needed if using None? Would NoneSpatialModel be the same as ConstantSpatialModel with a norm of 1 sr-1 or how would you make it work during model evaluation?

Or we ignore it completely during I/O

+1 for that solution: if model.spatial_model is None, don't store a spatial model in the output file. Yes, this requires an if / else on read and write, but I think that's acceptable, preferred to having to come up with some hack to make model evaluation work without if/else for that case.

cdeil avatar Nov 06 '19 08:11 cdeil

@JouvinLea - One more argument: I think the long-term solution for Gammapy might be to unify the 1D and 3D datasets, where 1D just has a single spatial bin. So already putting a 3D model on the 1D dataset is moving the codebase and interfaces into that direction, to have everything more uniform.

cdeil avatar Nov 06 '19 08:11 cdeil

After #2620 and #2619, the spatial model in the SkyModel is optional on serialisation and evaluation. The functionality is shown in the second level API, where we now use model = SkyModel(spectral_model=PowerLawSpectralModel()) consistently, but we don't have any example yet, that shows that a model.yaml can be defined without spatial model as well. This should be done as part of the effort to improve the docs for v1.0.

adonath avatar Dec 02 '19 09:12 adonath

@adonath @JouvinLea

I have tested the 1D fitting using a model with no spatial component, with the high-level interface. Everything seems to work nicely. Should we close this issue?

config = AnalysisConfig.read("../gammapy/analysis/config/example-1d.yaml")
analysis.get_observations()
analysis.get_datasets()

model_config = """
components:
- name: crab
  type: SkyModel
  spectral:
    type: PowerLawSpectralModel
    parameters:
    - name: amplitude      
      value: 1.0e-12
      unit: cm-2 s-1 TeV-1
    - name: index
      value: 2.0
      unit: ''
    - name: reference
      value: 1.0
      unit: TeV
      frozen: true
"""

analysis.set_models(model_config)
analysis.run_fit()


Fitting datasets.
OptimizeResult

	backend    : minuit
	method     : minuit
	success    : True
	message    : Optimization terminated successfully.
	nfev       : 117
	total stat : -289.10

Bultako avatar Mar 05 '20 19:03 Bultako

@Bultako In #2817, I have modified the analysis 1d test to read a YAML file that specifies the spectral model only. I can confirm it works.

The remaining action item here is to add an analysis tutorial for a 1D analysis with the HLI.

adonath avatar Mar 09 '20 08:03 adonath

Addressed by #4044.

adonath avatar Sep 28 '22 17:09 adonath