Implement "read_many" or similar for SpectrumList

Open eteq opened this issue 4 years ago • 3 comments

Right now SpectrumList uses the same I/O scheme as all the other loaders, where the implicit assumption is generally that a single file is loaded into a single object. However, this is not the case for some applications of SpectrumList. A concrete example is the JWST MIRI spectrograph in MRS mode, whose pipeline outputs are 12 different 1D spectra that all have different shapes but are all extracted from the same raw data exposure (effectively, for this conversation, it's 12 separate detectors that cover different wavelength ranges). So it seems logical to stick them into a single SpectrumList.

So my proposal is to add a method to SpectrumList along the lines of read_many, which would work pretty much the same as read but explicitly expects a list of file-like objects or a list of file names; if it's given a single string, that is interpreted as either a directory name or a glob pattern. Each file then gets loaded individually with Spectrum1D (which I guess read_many would pass kwargs into) and stuffed into the SpectrumList.
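To make the proposal concrete, here is a rough sketch of the dispatch logic, not an implementation against specutils itself. The names `resolve_sources`, `loader`, and `container` are hypothetical and exist only for illustration; the loader is passed in as a plain callable so the sketch runs without specutils installed and sidesteps the open question about the unified I/O machinery.

```python
import glob
import os


def resolve_sources(sources):
    """Turn the argument into a list of things to load (hypothetical helper).

    A single string (or path-like) is treated as a directory name if it
    names an existing directory, otherwise as a glob pattern. Anything
    else is assumed to be an iterable of file names or file-like objects
    and is passed through unchanged.
    """
    if isinstance(sources, (str, os.PathLike)):
        path = os.fspath(sources)
        if os.path.isdir(path):
            # Directory: take every regular file in it, in sorted order.
            return sorted(
                os.path.join(path, name)
                for name in os.listdir(path)
                if os.path.isfile(os.path.join(path, name))
            )
        # Not a directory: interpret the string as a glob pattern.
        return sorted(glob.glob(path))
    return list(sources)


def read_many(sources, loader, container=list, **kwargs):
    """Load each source individually and collect the results.

    `loader` is any callable taking (source, **kwargs); `container` is
    any list-like type that can be built from an iterable.
    """
    return container(loader(src, **kwargs) for src in resolve_sources(sources))


# Assumed usage with specutils (not executed here):
#   from specutils import Spectrum1D, SpectrumList
#   spectra = read_many("mrs_outputs/*.fits", loader=Spectrum1D.read,
#                       container=SpectrumList)
```

In an actual implementation this would presumably be a classmethod on SpectrumList with Spectrum1D.read as the fixed loader; the sorted ordering of matched files is also just a choice made for this sketch.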

What I do not know is whether that runs afoul of some of the unified I/O machinery. Someone will have to try it and see, I think. It may not even be critical to use the unified I/O machinery at all if we are just deferring all the file-specific loading to Spectrum1D, but it's worth at least checking.

(Note: this might eventually also be desirable for SpectrumCollection, since one can imagine having a pile of files that all have the same shape and wanting to put them into one SpectrumCollection. But I think that's probably best implemented instead with some way to easily "collapse" a SpectrumList into a SpectrumCollection, since there's not much of a performance advantage to loading them directly into a SpectrumCollection if each file has to be parsed individually anyway. At any rate, that's definitely a follow-on, since the SpectrumList use case is much more concrete.)

eteq avatar Jul 01 '21 15:07 eteq

Does this assume that there's a loader for Spectrum1D (rather than only for SpectrumList, which may be the case when there are multiple spectra in a single file)? What would be the distinction between read_many and joining a SpectrumList for each file?

aragilar avatar Jul 05 '21 07:07 aragilar