specutils [WIP] Continuum normalization

This is addressing #231. To start, two continuum fitting functions have been defined:

fit_continuum_generic takes in a spectrum and fits a user-specified model (from astropy.modeling) with iterative sigma clipping. Exclusion regions can be specified. This is the most minimal implementation of the interface suggested here
fit_continuum_linetools that is a direct port of the algorithm in linetools. The only change is making it compatible with specutils.Spectrum1D and changing the Akima spline to scipy's version. This does not yet satisfy the expected interface.

Needed before PR is complete:

[ ] unit tests
[ ] examples in docs
[ ] ideally an astropy.modeling subclass that implements spline-like models (and possibly a corresponding fitter class?)
[ ] proper implementation of quantities (I have just turned everything into an array ignoring units)

Advice/examples of how to organize unit tests/testing data would be very helpful! I have also been unable to write/run tests because most of my spectra cannot be read by the current spectrum readers. These are spectra that are compatible with IRAF (multispec format).

Aug 27 '18 06:08 alexji

Great work @alexji !

Shall I try to add a script like lt_continuum and give it a test drive?

ps. Testing is going to be a challenge here, no doubt..

pps. There is an extensive module in linetools for I/O: readspec.py Maybe it is time I try to port a bunch of it. That said, IRAF is a defunct code ;)

Aug 28 '18 12:08 profxj

This is important to do, but right now I don't think anything will have changed because it is almost literally a direct port.

However after the next step (wrapping the output spline into an astropy model) I think this would definitely be worth it to make sure all the interfaces still make sense.

Sep 05 '18 17:09 alexji

A few things on this:

This has collided a bit with the work in #210 (which is based on the notebook that I linked but has evolved a bit as things do upon implementation). @brechmos-stsci has made some very basic (even more "generic" than yours I think) start on continuum fitting as part of that that has a similar interface to this, but slightly different - I'll let @brechmos-stsci speak to that in more detail.
That said, this is definitely more thorough/useful for many science cases/etc. The question is more one of naming: I'm not a fan of linetools being in the name because that's not very descriptive of the algorithm and/or use case. What about something like fit_continuum_qso (although that actually seems too specific to me, but I had trouble coming up with something better...)? Similarly, perhaps fit_continuum_generic should be fit_continuum_sigmaclip? (The one in #210 is somewhat similar but uses median smoothing... So maybe it should be fit_continuum_median?)
We are trying to sprint to get a release of specutils out at the end of this week. That might be a bit agressive for getting this in unless you have lots of time the rest of this week, @alexji - do you mind if we aim this for 0.5 (which doesn't yet have a specific timeline, but should at least be before the end of the year)

Sep 05 '18 19:09 eteq

@eteq and @alexji Yes, my continuum fitting is super simple. The PR #210 that @eteq references provides more of a framework for the continuum fitting (and is based on a basic model fitting). I think this PR could reasonably easily be fit into that framework once #210 is merged. I am happy to help with guidance in any way.

Sep 05 '18 19:09 brechmos-stsci

@eteq yes, 0.5 sounds good to me! @brechmos-stsci the #210 framework looks good and I will adapt this to match that.

fit_continuum_sigmaclip sounds good. I realized later it is somewhat duplicating functionality provided in astropy.modeling.fitting.FittingWithOutlierRemoval link, but if we allow the user to specify the fitter then reimplementing this seems inevitable.

Sep 05 '18 20:09 alexji

@alexji - one other thing that just occurred to me: I think moving forward we will want to support spline fitting using an astropy.modeling if we can. It may be ok to sort of "hack" it for this particular application, but do you think it might be possible to prototype out a modeling-compatible fitter and matched model? The hope would be that we could include that here so that it doesn't block further development, but then try to port it back to astropy if possible?

Sep 06 '18 20:09 eteq

Yeah, I agree this is a good idea!

The only way I have thought to do this is to have spline-specific models and fitters that are meant to work together, and will error if the wrong model/fitter combination are used. Otherwise I think we would need to reimplement the highly optimized spline-specific fitters.

If that sounds good, then I will make relevant subclasses and include them here in a way that hopefully can be easily ported to astropy.

(Edit: timescale for this is probably about 2 weeks)

Sep 06 '18 20:09 alexji

spline-specific models and fitters that are meant to work together

Yes, exactly what I was thinking, @alexji . It's a bit of a violation of the idea that they are mix-and-match but I think that makes sense for splines. Although now that I think about it, in principal you could use any arbitrary fitter as long as the model exposes all the parameters (knot locations + coeffs), you just probably wouldn't want to use anything other than the spline-specific one.

Sep 06 '18 21:09 eteq

For what it's worth, I've had the most success with scipy.interpolate.UniverariateSpline, although it generally takes several iterations of fiddling with s and k before I am happy or fed up.

That routine has get_coeffs and get_knots methods, which might be handy. It has other methods as well, which I suppose one might like to have available within the Astropy version.

Sep 06 '18 21:09 hcferguson

@eteq I have put in a very basic spline fitter version and a simple test (rebased onto current version). Basically I have just rewrapped scipy's splines into the astropy modeling framework.

One thing I realized is there's basically two modes of spline fitting: you can either specify knots, or you can have it automatically determine knots when fitting. The former makes it straightforward to expose knots etc in the modeling framework. The latter is often convenient to do, but mixes model definition and fitting. For now I have just allowed both, but I wonder if we should scrap automatic knot determination from data (or separate it into a separate function)?

Sep 09 '18 00:09 alexji

@alexji this is back on our radar as something we want to get merged in, but it will likely need to be updated to match up with the current API. I'll be digging into what exactly needs to be changed soon, but I wanted to check in with you first and see if you have the availability/desire to do the updates. I'm happy to take this on if not, just let me know either way.

Apr 28 '20 15:04 rosteen

@rosteen it sounds like you have a plan in mind for this, so you should go ahead! I'm happy to contribute in any way that is helpful, but I haven't been paying attention to the other API changes so would need some time to get back up to speed.

Apr 28 '20 19:04 alexji

@alexji I actually haven't had time to dig in and figure out exactly what needs to be updated/formulate a plan yet - I think that @nmearl or @eteq have a good idea of what updates are needed here and can chime in with the answer sooner than I could. This isn't super time sensitive, so as long as you're willing it would be awesome if you could do the updates. I just wanted to offer to step in if you were too busy or no longer had any interest.

Apr 30 '20 14:04 rosteen

Yeah I am happy to do it, especially if not time sensitive. I can finish this before the end of May. But it would be very helpful to be pointed to any resource laying out the anticipated API.

May 01 '20 00:05 alexji

@alexji Do you mind if I rebase and push directly to your branch with some low-level fixes?

May 06 '20 17:05 nmearl

Not at all!

On Wed, May 6, 2020 at 10:15 AM Nicholas Earl [email protected] wrote:

@alexji https://github.com/alexji Do you mind if I rebase and push directly to your branch with some low-level fixes?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/astropy/specutils/pull/291#issuecomment-624776856, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAXQRDXLQ7JISXEQGL6Q3E3RQGLKHANCNFSM4FRU6LMQ .

May 06 '20 17:05 alexji

It looks like there have been updates to the API for astropy.modeling since I wrote the SplineModel class. I will go through this and fix it up.

For now, we can put it into specutils.fitting.spline, but ultimately this should probably go into astropy.modeling (cf https://github.com/astropy/astropy/issues/7522)

I will also see if I can implement the linetools algorithm as an astropy fitter. My concern is the number of knots (i.e. number of parameters in the spline model) will change after running the fitter, and I think changing the number of parameters in an astropy model is not good.

May 07 '20 22:05 alexji

OK I have fixed up the SplineModel so it passes the test (at least locally, travis is running). For the fitter, I am not sure what methods have to be implemented.

It is not possible to define a fixed set of parameters for all classes of spline models that the fitter than determines, because by default a spline will automatically determine knot locations from the data. However it is possible to do this if you force the user to define knots as part of the model. This is probably the most common use case. Should I restrict the SplineModel to only be this case?

May 08 '20 00:05 alexji

I should have time to pull down the latest changes and think more about the knot question today.

My initial thought is that while it would be nice to be able to specify the number of knots without explicitly defining their location, if that's a big effort it would be ok to require the user to explicitly define the knots as part of the model in the interest of having at least some usable spline fitting. We can iterate later to add functionality.

It's been quite a while since I was using splines for science - @eteq do you want to chime in to agree/disagree that user-defined knots are the most common use case anyway?

May 11 '20 13:05 rosteen

@rosteen I'm used to specifying the number of knots rather than their locations for this kind of application. Also, my recollection is that there are huge performance advantages to having data on a regular grid, at least for the 2D case in scipy (RectBivariateSpline is much faster than BivariateSpline). Not sure if the same goes for 1D splines.

Also, I've not seen it applied in the context of continuum fitting, but Gaussian Process Regression might be an interesting alternative to splines, potentially even making use of the uncertainties. There's a package in scikit-learn. [https://scikit-learn.org/stable/modules/gaussian_process.html](Gaussian Process Regression).

May 11 '20 14:05 hcferguson

@hcferguson 's common use case of specifying knot number is easy to include. It just involves a helper class to automatically generate evenly spaced knots over the spectrum's dispersion axis and pass that into the other model [and probably creating a convenience class for this type of model].

May 11 '20 16:05 alexji

@alexji I was thinking pretty much the same thing about auto-generating the N knots. @hcferguson would that solution fit with your needs? We'd certainly document what exactly specifying the number of knots does, in terms of them being fixed rather than free-floating.

May 12 '20 16:05 rosteen

specutils specutils copied to clipboard

[WIP] Continuum normalization

specutils
specutils copied to clipboard