jdaviz
jdaviz copied to clipboard
[Feature Request] let imviz load jwst data model objects directly
Jdaviz component
Imviz
Description
The documentation for Imviz.load_data implies that it should be possible to load asdf-in-fits files from jwst directly. However, when I try to do that (see "how to reproduce"), I get the error below.
---------------------------------------------------------------------------
NotImplementedError Traceback (most recent call last)
Cell In[59], line 1
----> 1 imviz.load_data(im)
File ~/REDACTED/python3.12/site-packages/jdaviz/configs/imviz/helper.py:174, in Imviz.load_data(self, data, data_label, show_in_viewer, **kwargs)
172 if data_label:
173 kwargs['data_label'] = data_label
--> 174 self.app.load_data(data, parser_reference='imviz-data-parser', **kwargs)
176 # find the current label(s) - TODO: replace this by calling default label functionality
177 # above instead of having to refind it
178 applied_labels = []
File ~/REDACTED/python3.12/site-packages/jdaviz/app.py:851, in Application.load_data(self, file_obj, parser_reference, **kwargs)
848 parser = data_parser_registry.members.get(data_parser)
850 if parser is not None:
--> 851 parser(self, file_obj, **kwargs)
852 else:
853 self._application_handler.load_data(file_obj)
File ~/REDACTED/python3.12/site-packages/jdaviz/configs/imviz/plugins/parsers.py:149, in parse_data(app, file_obj, ext, data_label, parent, cache, local_path, timeout)
147 _parse_image(app, pf, data_label, ext=ext, parent=parent)
148 else:
--> 149 _parse_image(app, file_obj, data_label, ext=ext, parent=parent)
File ~/REDACTED/python3.12/site-packages/jdaviz/configs/imviz/plugins/parsers.py:227, in _parse_image(app, file_obj, data_label, ext, parent)
225 if data_label is None:
226 data_label = app.return_data_label(file_obj, ext, alt_name="image_data")
--> 227 data_iter = get_image_data_iterator(app, file_obj, data_label, ext=ext)
229 # Save the SCI extension to this list:
230 sci_ext = None
File ~/REDACTED/python3.12/site-packages/jdaviz/configs/imviz/plugins/parsers.py:217, in get_image_data_iterator(app, file_obj, data_label, ext)
214 data_iter = _roman_asdf_2d_to_glue_data(file_obj, data_label, ext=ext)
216 else:
--> 217 raise NotImplementedError(f'Imviz does not support {file_obj}')
219 return data_iter
NotImplementedError: Imviz does not support <ImageModel(1994, 2404) from contents>
Note if I load using stdatamodels.asdf_in_fits instead of jwst.datamodels I get a different error but still an error:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
Cell In[61], line 1
----> 1 imviz.load_data(im)
File ~/casA_light_echos/.pixi/envs/default/lib/python3.12/site-packages/jdaviz/configs/imviz/helper.py:174, in Imviz.load_data(self, data, data_label, show_in_viewer, **kwargs)
172 if data_label:
173 kwargs['data_label'] = data_label
--> 174 self.app.load_data(data, parser_reference='imviz-data-parser', **kwargs)
176 # find the current label(s) - TODO: replace this by calling default label functionality
177 # above instead of having to refind it
178 applied_labels = []
File ~/REDACTED/python3.12/site-packages/jdaviz/app.py:851, in Application.load_data(self, file_obj, parser_reference, **kwargs)
848 parser = data_parser_registry.members.get(data_parser)
850 if parser is not None:
--> 851 parser(self, file_obj, **kwargs)
852 else:
853 self._application_handler.load_data(file_obj)
File ~/REDACTED/python3.12/site-packages/jdaviz/configs/imviz/plugins/parsers.py:149, in parse_data(app, file_obj, ext, data_label, parent, cache, local_path, timeout)
147 _parse_image(app, pf, data_label, ext=ext, parent=parent)
148 else:
--> 149 _parse_image(app, file_obj, data_label, ext=ext, parent=parent)
File ~/REDACTED/python3.12/site-packages/jdaviz/configs/imviz/plugins/parsers.py:232, in _parse_image(app, file_obj, data_label, ext, parent)
229 # Save the SCI extension to this list:
230 sci_ext = None
--> 232 for data, data_label in data_iter:
233
234 # if the science extension hasn't been identified yet, do so here:
235 if sci_ext is None and data.ndim == 2 and ('[DATA' in data_label or '[SCI' in data_label):
236 sci_ext = data_label
File ~/REDACTED/python3.12/site-packages/jdaviz/configs/imviz/plugins/parsers.py:440, in _roman_asdf_2d_to_glue_data(file_obj, data_label, ext)
437 ext_list = (ext, )
439 roman = file_obj.tree.get('roman')
--> 440 meta = roman.get('meta', {})
441 coords = meta.get('wcs', None)
443 for cur_ext in ext_list:
AttributeError: 'NoneType' object has no attribute 'get'
How to Reproduce
from jdaviz import Imviz
import jwst.datamodels
image = jwst.datamodels.open('https://archive.stsci.edu/hlsps/jwst-ero/hlsp_jwst-ero_jwst_miri_stephansquintet_f770w_v1_i2d.fits')
imviz = Imviz()
imviz.load_data(image)
can substitute stdatamodels.asdf_in_fits for jwst.datamodels and I still get an error
Expected behavior
The image should load without complaint.
Browser
Latest Stable Chrome
Jupyter
import sys; print("Python", sys.version) Python 3.12.7 | packaged by conda-forge | (main, Oct 4 2024, 16:05:46) [GCC 13.3.0] import numpy; print("Numpy", numpy.version) Numpy 2.1.2 import astropy; print("astropy", astropy.version) astropy 6.1.4 import matplotlib; print("matplotlib", matplotlib.version) matplotlib 3.9.2 import scipy; print("scipy", scipy.version) scipy 1.14.1 import skimage; print("scikit-image", skimage.version) scikit-image 0.24.0 import asdf; print("asdf", asdf.version) asdf 3.5.0 import stdatamodels; print("stdatamodels", stdatamodels.version) stdatamodels 1.10.1 import gwcs; print("gwcs", gwcs.version) gwcs 0.21.0 import regions; print("regions", regions.version) regions 0.10 import specutils; print("specutils", specutils.version) specutils 1.18.0 import specreduce; print("specreduce", specreduce.version) specreduce 1.4.1 import photutils; print("photutils", photutils.version) photutils 2.0.1 import astroquery; print("astroquery", astroquery.version) astroquery 0.4.7 import yaml; print("pyyaml", yaml.version) pyyaml 6.0.2 import asteval; print("asteval", asteval.version) asteval 1.0.5 import idna; print("idna", idna.version) idna 3.10 import traitlets; print("traitlets", traitlets.version) traitlets 5.14.3 import bqplot; print("bqplot", bqplot.version) bqplot 0.12.43 import bqplot_image_gl; print("bqplot-image-gl", bqplot_image_gl.version) bqplot-image-gl 1.5.0 import glue; print("glue-core", glue.version) glue-core 1.21.1 import glue_jupyter; print("glue-jupyter", glue_jupyter.version) glue-jupyter 0.23.0 import glue_astronomy; print("glue-astronomy", glue_astronomy.version) glue-astronomy 0.10.0 import echo; print("echo", echo.version) echo 0.9.0 import ipyvue; print("ipyvue", ipyvue.version) ipyvue 1.11.1 import ipyvuetify; print("ipyvuetify", ipyvuetify.version) ipyvuetify 1.10.0 import ipysplitpanes; print("ipysplitpanes", ipysplitpanes.version) ipysplitpanes 0.2.0 import ipygoldenlayout; print("ipygoldenlayout", ipygoldenlayout.version) ipygoldenlayout 0.4.0 import ipypopout; print("ipypopout", ipypopout.version) ipypopout 2.0.0 import jinja2; print("Jinja2", jinja2.version) Jinja2 3.1.4 import solara; print("solara", solara.version) solara 1.40.0 import vispy; print("vispy", vispy.version) vispy 0.14.3 import sidecar; print("sidecar", sidecar.version) sidecar 0.7.0 import jdaviz; print("Jdaviz", jdaviz.version) Jdaviz 4.0.0
Software versions
import platform; print(platform.platform()) import sys; print("Python", sys.version) import numpy; print("Numpy", numpy.version) import astropy; print("astropy", astropy.version) import matplotlib; print("matplotlib", matplotlib.version) import scipy; print("scipy", scipy.version) import skimage; print("scikit-image", skimage.version) import asdf; print("asdf", asdf.version) import stdatamodels; print("stdatamodels", stdatamodels.version) import gwcs; print("gwcs", gwcs.version) import regions; print("regions", regions.version) import specutils; print("specutils", specutils.version) import specreduce; print("specreduce", specreduce.version) import photutils; print("photutils", photutils.version) import astroquery; print("astroquery", astroquery.version) import yaml; print("pyyaml", yaml.version) import asteval; print("asteval", asteval.version) import idna; print("idna", idna.version) import traitlets; print("traitlets", traitlets.version) import bqplot; print("bqplot", bqplot.version) import bqplot_image_gl; print("bqplot-image-gl", bqplot_image_gl.version) import glue; print("glue-core", glue.version) import glue_jupyter; print("glue-jupyter", glue_jupyter.version) import glue_astronomy; print("glue-astronomy", glue_astronomy.version) import echo; print("echo", echo.version) import ipyvue; print("ipyvue", ipyvue.version) import ipyvuetify; print("ipyvuetify", ipyvuetify.version) import ipysplitpanes; print("ipysplitpanes", ipysplitpanes.version) import ipygoldenlayout; print("ipygoldenlayout", ipygoldenlayout.version) import ipypopout; print("ipypopout", ipypopout.version) import jinja2; print("Jinja2", jinja2.version) import solara; print("solara", solara.version) import vispy; print("vispy", vispy.version) import sidecar; print("sidecar", sidecar.version) import jdaviz; print("Jdaviz", jdaviz.version)
@eteq , is still a problem on your end? I cannot reproduce the error. The image loads fine with jdaviz 4.1.dev (current main).
@eteq you can load this file just fine, by passing the filename to imviz.load_data() rather than passing the opened datamodels object. Our documentation says that you need extra steps to load the actual datamodels object, as described here: https://jdaviz.readthedocs.io/en/stable/imviz/import_data.html#jwst-datamodels/. I just checked and that method does still work in this case.
First I just want to confirm: @pllim , when you say you got it to work, do you mean you did it the https://jdaviz.readthedocs.io/en/stable/imviz/import_data.html#jwst-datamodels way, or the way my snippet shows above?
@rosteen - thanks for the pointer, I didn't notice that snippet! That said, it has two major problem that makes it not work for my use case:
- It copies the data into a new nddata, which is both a problem from a "what if I want to modify it" perspective and a "run out of memory" perspective. The fix is probably just one or two
copy=False's, though, right? - The metadata isn't present, which in practice is a huge limitation when you load multiple files and want to figure out which is which in the UI
And one minor problem:
- It's a lot of code for what should be one of the most basic user cases!
Is there a reason why a more complete version of that snippet (e.g. with the copy=False's in place and the metadata copied over) can't be implemented? I forget now if imviz uses unified-I/O or similar, but that would seem to be the way to do it...
Yes sorry, I was only trying the ways we officially support, not the exact snippet using objects.
I wouldn't call loading model objects into Imviz a basic use case. Most users would pass in the filename and be done with it.
I disagree! There are many science use cases where the objects are the only available object. It is very not safe to assume a file is even present, and the regularity of which that's assumed is probably my single top "minor-to-moderate" annoyance when interacting with code as a scientist (not jdaviz specifically, which is much better than most, to be fair! I'm talking about the whole Python ecosystem).
At any rate, to be clear, I'm willing to spin up a PR myself to do this since I think I know the API-level interface of relevance. What I don't know is where the best place is to put more specific code like this - i.e., is there a unified I/O hook somewhere, or would I need to add a case to load_data? (and if the latter, is that an argument for trying to implement some sort of unified I/O or plugin approach?)
There are many science use cases where the objects are the only available object. It is very not safe to assume a file is even present
Can you please clarify? Did someone send you a pickled object instead of the JWST data product?
Checking for object at the load_data level is problematic because you need to import the object class at app level, which makes the package that holds the class a required dependency (in this case, it might be the entire pipeline). Unless you are willing to settle for static cls.__name__ checks only, which is pretty rigid because it does not account for subclasses or mixins.
And this is sounds like a feature request, not a bug.
Ah, point taken this is a feature request. Will re-label!
But still, how does a scientist only has access to an object and not the actual file?
Can you please clarify? Did someone send you a pickled object instead of the JWST data product?
Some examples:
- Data streamed from an online source such that you have an object in-memory but not on-disk
- Data objects that you load in a notebook, then modify in some way before displaying (this is my use case here, which of course is possible with the various snippets, just in convenient)
- Data objects that can be loaded from a file, but in a workflow one is exploring the data so needs to load them into python first, then do some exploration before knowing how/what you want to view. Then it's more natural to load the data object rather than the file by name (since otherwise you have to track the filename since some dataobjects, like jwst datamodels, don't carry an absolute path with them, which is error-prone and annoying).
In practice, the majority of my image use cases (probably of order 80%) fall under one of those three. Additionally, 1 will be much more common in a science platform setting, which will be mandatory usage for certain future astronomy settings (e.g. Rubin, and Roman)
(to be clear: in all of those cases, one can simply write to disk and then load that data, but that's pretty annoying and sometimes problematic from a storage perspective. In practice with code where I'm forced to do that, I regularly say "not worth it, I'll find some other package or roll my own")
For 1, streaming is currently not supported. It is an entirely different ticket/epic/whatever.
For 2 and 3, as Ricky pointed out, you can still stuff it into NDData. I am pretty sure it does not make copies but given @bmorris3 is also astropy.nddata maintainer, maybe he can confirm.
If you insist on native support for specific objects, please list them and point us to where they are documented. Thanks.
@pllim could jwst datamodels be an optional dependency and be recognize at load_data like roman data are (if roman datamodels are installed)? calling on @bmorris3 too here
For 1, streaming is currently not supported. It is an entirely different ticket/epic/whatever.
Sorry, miscommunication there. I mean something like:
>>> res = requests.get('http://something')
>>> image = PrimaryHDUOrWhatever.from_bytes(res.read())
>>> ... maybe do something with image ...
>>> imviz.load_data(image)
This is a CORE WORKFLOW for some major science cases. Of course you're right if there's an NDData adapter available that is already included, but I was trying to answer your question of what the science cases are and why you can't just use a file.
I thought the NDData had the problem of the missing header? or did i misunderstand?
The snippet in the docs indeed are missing the metadata. so you are understanding that right (although there's surely a way to do that, it's annoying that I have to do that with the datamodel object instead of it loading as straightforwardly as loading the file directly is)
(Oh, and I'm not insisting on specific objects to support, but rather trying to understand how to plug in support for specific objects - i.e., like how unified I/O is used in specutils to let lots of people make PRs to support various data objects.)
optional
Not pretty but I guess... still need to know what models to support though. There are many available in JWST, right?
https://github.com/spacetelescope/jdaviz/blob/1f57249e8daef6e12ef5ecb0dbc3a56b5ffe6662/jdaviz/configs/imviz/plugins/parsers.py#L22-L27
This is a CORE WORKFLOW
Yes, I understand and I agree. I was just saying that is already covered in a separate ticket somewhere (for Roman, I think).
missing header
That is a small doc update to fix. You can pass in meta= to set it. In ASDF, it should be accessible via .tree["meta"]. The Metadata would automatically flatten it.
https://github.com/spacetelescope/jdaviz/blob/1f57249e8daef6e12ef5ecb0dbc3a56b5ffe6662/jdaviz/configs/default/plugins/metadata_viewer/metadata_viewer.py#L134
plug in support for specific objects
As you pointed out, spectra are handled via specutils I/O. I don't think unified I/O for imaging should be in jdaviz and we don't really have imutils under Astropy, so I do not have a good answer if you are looking for generic custom I/O for arbitrary image formats. Unless I am forgetting something, @larrybradley ?
generic custom I/O for arbitrary image formats
that's a good point, but just to be clear I'm realizing I was being a bit confusing in my wording above so will clarify here:
What I'm thinking about is something akin to the unified I/O in that it's pluggable and external players might contribute their own if they want support, but it's different in that it's not about file I/O but rather sort of a "translation layer" for data objects. I.e., some short snippets that translate a jwst datamodel (with certain limited assumptions) into an NDData like what the snippet does.
Having said that it's debatable where that code best lives: in jwst, astropy.nddata, or jdaviz? I see your point that I/O doesn't belong in jdaviz, but I'm less sure about the "API translation" layer...
"API translation" layer
If we limit this request to ASDF models (instead of arbitrary format), I think it makes sense to have that layer in asdf-astropy. There is already a precedent with astropy.table.Table support, so why not astropy.nddata.NDData? https://asdf-astropy.readthedocs.io/en/latest/asdf-astropy/table.html
Ah, that might be a good direction to consider, @pllim . I think the key, though, is ensuring it carries enough contextual information over. I think I worked out the magic incantation to do what I need, and I'll make a PR shortly to fix the doc section about this.
Note, though, that does not close this, because I still think it's really annoying to do it that way vs the above, but I'm with you that this might in practice than be an upstream fix needed to do it.
Alright, I checked and at least for my science case the change I'm suggesting in #3279 makes an NDData object that seems to have no copying (copy=False by default in the NDData initializer). So I think that's a fine workaround for now. But I do think it would be helpful if I didn't have to explicitly make the NDData object myself every time given it's always the same for this specific data model. (But if I had to do something like imviz.load_data(mydatamodel.to_nddata()), that would be in practice as good as imviz.load_data(mydatamodel)!)
If you open a feature request for to_nddata upstream, please cross link here. Thanks!