jdaviz [Feature Request] let imviz load jwst data model objects directly

Jdaviz component

Imviz

Description

The documentation for Imviz.load_data implies that it should be possible to load asdf-in-fits files from jwst directly. However, when I try to do that (see "how to reproduce"), I get the error below.

---------------------------------------------------------------------------
NotImplementedError                       Traceback (most recent call last)
Cell In[59], line 1
----> 1 imviz.load_data(im)

File ~/REDACTED/python3.12/site-packages/jdaviz/configs/imviz/helper.py:174, in Imviz.load_data(self, data, data_label, show_in_viewer, **kwargs)
    172     if data_label:
    173         kwargs['data_label'] = data_label
--> 174     self.app.load_data(data, parser_reference='imviz-data-parser', **kwargs)
    176 # find the current label(s) - TODO: replace this by calling default label functionality
    177 # above instead of having to refind it
    178 applied_labels = []

File ~/REDACTED/python3.12/site-packages/jdaviz/app.py:851, in Application.load_data(self, file_obj, parser_reference, **kwargs)
    848         parser = data_parser_registry.members.get(data_parser)
    850 if parser is not None:
--> 851     parser(self, file_obj, **kwargs)
    852 else:
    853     self._application_handler.load_data(file_obj)

File ~/REDACTED/python3.12/site-packages/jdaviz/configs/imviz/plugins/parsers.py:149, in parse_data(app, file_obj, ext, data_label, parent, cache, local_path, timeout)
    147             _parse_image(app, pf, data_label, ext=ext, parent=parent)
    148 else:
--> 149     _parse_image(app, file_obj, data_label, ext=ext, parent=parent)

File ~/REDACTED/python3.12/site-packages/jdaviz/configs/imviz/plugins/parsers.py:227, in _parse_image(app, file_obj, data_label, ext, parent)
    225 if data_label is None:
    226     data_label = app.return_data_label(file_obj, ext, alt_name="image_data")
--> 227 data_iter = get_image_data_iterator(app, file_obj, data_label, ext=ext)
    229 # Save the SCI extension to this list:
    230 sci_ext = None

File ~/REDACTED/python3.12/site-packages/jdaviz/configs/imviz/plugins/parsers.py:217, in get_image_data_iterator(app, file_obj, data_label, ext)
    214     data_iter = _roman_asdf_2d_to_glue_data(file_obj, data_label, ext=ext)
    216 else:
--> 217     raise NotImplementedError(f'Imviz does not support {file_obj}')
    219 return data_iter

NotImplementedError: Imviz does not support <ImageModel(1994, 2404) from contents>

Note if I load using stdatamodels.asdf_in_fits instead of jwst.datamodels I get a different error but still an error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[61], line 1
----> 1 imviz.load_data(im)

File ~/casA_light_echos/.pixi/envs/default/lib/python3.12/site-packages/jdaviz/configs/imviz/helper.py:174, in Imviz.load_data(self, data, data_label, show_in_viewer, **kwargs)
    172     if data_label:
    173         kwargs['data_label'] = data_label
--> 174     self.app.load_data(data, parser_reference='imviz-data-parser', **kwargs)
    176 # find the current label(s) - TODO: replace this by calling default label functionality
    177 # above instead of having to refind it
    178 applied_labels = []

File ~/REDACTED/python3.12/site-packages/jdaviz/app.py:851, in Application.load_data(self, file_obj, parser_reference, **kwargs)
    848         parser = data_parser_registry.members.get(data_parser)
    850 if parser is not None:
--> 851     parser(self, file_obj, **kwargs)
    852 else:
    853     self._application_handler.load_data(file_obj)

File ~/REDACTED/python3.12/site-packages/jdaviz/configs/imviz/plugins/parsers.py:149, in parse_data(app, file_obj, ext, data_label, parent, cache, local_path, timeout)
    147             _parse_image(app, pf, data_label, ext=ext, parent=parent)
    148 else:
--> 149     _parse_image(app, file_obj, data_label, ext=ext, parent=parent)

File ~/REDACTED/python3.12/site-packages/jdaviz/configs/imviz/plugins/parsers.py:232, in _parse_image(app, file_obj, data_label, ext, parent)
    229 # Save the SCI extension to this list:
    230 sci_ext = None
--> 232 for data, data_label in data_iter:
    233 
    234     # if the science extension hasn't been identified yet, do so here:
    235     if sci_ext is None and data.ndim == 2 and ('[DATA' in data_label or '[SCI' in data_label):
    236         sci_ext = data_label

File ~/REDACTED/python3.12/site-packages/jdaviz/configs/imviz/plugins/parsers.py:440, in _roman_asdf_2d_to_glue_data(file_obj, data_label, ext)
    437     ext_list = (ext, )
    439 roman = file_obj.tree.get('roman')
--> 440 meta = roman.get('meta', {})
    441 coords = meta.get('wcs', None)
    443 for cur_ext in ext_list:

AttributeError: 'NoneType' object has no attribute 'get'

How to Reproduce

from jdaviz import Imviz
import jwst.datamodels

image = jwst.datamodels.open('https://archive.stsci.edu/hlsps/jwst-ero/hlsp_jwst-ero_jwst_miri_stephansquintet_f770w_v1_i2d.fits')

imviz = Imviz()
imviz.load_data(image)

can substitute stdatamodels.asdf_in_fits for jwst.datamodels and I still get an error

Expected behavior

The image should load without complaint.

Browser

Latest Stable Chrome

Jupyter

import sys; print("Python", sys.version) Python 3.12.7 | packaged by conda-forge | (main, Oct 4 2024, 16:05:46) [GCC 13.3.0] import numpy; print("Numpy", numpy.version) Numpy 2.1.2 import astropy; print("astropy", astropy.version) astropy 6.1.4 import matplotlib; print("matplotlib", matplotlib.version) matplotlib 3.9.2 import scipy; print("scipy", scipy.version) scipy 1.14.1 import skimage; print("scikit-image", skimage.version) scikit-image 0.24.0 import asdf; print("asdf", asdf.version) asdf 3.5.0 import stdatamodels; print("stdatamodels", stdatamodels.version) stdatamodels 1.10.1 import gwcs; print("gwcs", gwcs.version) gwcs 0.21.0 import regions; print("regions", regions.version) regions 0.10 import specutils; print("specutils", specutils.version) specutils 1.18.0 import specreduce; print("specreduce", specreduce.version) specreduce 1.4.1 import photutils; print("photutils", photutils.version) photutils 2.0.1 import astroquery; print("astroquery", astroquery.version) astroquery 0.4.7 import yaml; print("pyyaml", yaml.version) pyyaml 6.0.2 import asteval; print("asteval", asteval.version) asteval 1.0.5 import idna; print("idna", idna.version) idna 3.10 import traitlets; print("traitlets", traitlets.version) traitlets 5.14.3 import bqplot; print("bqplot", bqplot.version) bqplot 0.12.43 import bqplot_image_gl; print("bqplot-image-gl", bqplot_image_gl.version) bqplot-image-gl 1.5.0 import glue; print("glue-core", glue.version) glue-core 1.21.1 import glue_jupyter; print("glue-jupyter", glue_jupyter.version) glue-jupyter 0.23.0 import glue_astronomy; print("glue-astronomy", glue_astronomy.version) glue-astronomy 0.10.0 import echo; print("echo", echo.version) echo 0.9.0 import ipyvue; print("ipyvue", ipyvue.version) ipyvue 1.11.1 import ipyvuetify; print("ipyvuetify", ipyvuetify.version) ipyvuetify 1.10.0 import ipysplitpanes; print("ipysplitpanes", ipysplitpanes.version) ipysplitpanes 0.2.0 import ipygoldenlayout; print("ipygoldenlayout", ipygoldenlayout.version) ipygoldenlayout 0.4.0 import ipypopout; print("ipypopout", ipypopout.version) ipypopout 2.0.0 import jinja2; print("Jinja2", jinja2.version) Jinja2 3.1.4 import solara; print("solara", solara.version) solara 1.40.0 import vispy; print("vispy", vispy.version) vispy 0.14.3 import sidecar; print("sidecar", sidecar.version) sidecar 0.7.0 import jdaviz; print("Jdaviz", jdaviz.version) Jdaviz 4.0.0

Software versions

import platform; print(platform.platform()) import sys; print("Python", sys.version) import numpy; print("Numpy", numpy.version) import astropy; print("astropy", astropy.version) import matplotlib; print("matplotlib", matplotlib.version) import scipy; print("scipy", scipy.version) import skimage; print("scikit-image", skimage.version) import asdf; print("asdf", asdf.version) import stdatamodels; print("stdatamodels", stdatamodels.version) import gwcs; print("gwcs", gwcs.version) import regions; print("regions", regions.version) import specutils; print("specutils", specutils.version) import specreduce; print("specreduce", specreduce.version) import photutils; print("photutils", photutils.version) import astroquery; print("astroquery", astroquery.version) import yaml; print("pyyaml", yaml.version) import asteval; print("asteval", asteval.version) import idna; print("idna", idna.version) import traitlets; print("traitlets", traitlets.version) import bqplot; print("bqplot", bqplot.version) import bqplot_image_gl; print("bqplot-image-gl", bqplot_image_gl.version) import glue; print("glue-core", glue.version) import glue_jupyter; print("glue-jupyter", glue_jupyter.version) import glue_astronomy; print("glue-astronomy", glue_astronomy.version) import echo; print("echo", echo.version) import ipyvue; print("ipyvue", ipyvue.version) import ipyvuetify; print("ipyvuetify", ipyvuetify.version) import ipysplitpanes; print("ipysplitpanes", ipysplitpanes.version) import ipygoldenlayout; print("ipygoldenlayout", ipygoldenlayout.version) import ipypopout; print("ipypopout", ipypopout.version) import jinja2; print("Jinja2", jinja2.version) import solara; print("solara", solara.version) import vispy; print("vispy", vispy.version) import sidecar; print("sidecar", sidecar.version) import jdaviz; print("Jdaviz", jdaviz.version)

🐱

Oct 23 '24 21:10 eteq

@eteq , is still a problem on your end? I cannot reproduce the error. The image loads fine with jdaviz 4.1.dev (current main).

Screenshot 2024-11-01 172447

Nov 01 '24 21:11 pllim

@eteq you can load this file just fine, by passing the filename to imviz.load_data() rather than passing the opened datamodels object. Our documentation says that you need extra steps to load the actual datamodels object, as described here: https://jdaviz.readthedocs.io/en/stable/imviz/import_data.html#jwst-datamodels/. I just checked and that method does still work in this case.

Nov 04 '24 19:11 rosteen

First I just want to confirm: @pllim , when you say you got it to work, do you mean you did it the https://jdaviz.readthedocs.io/en/stable/imviz/import_data.html#jwst-datamodels way, or the way my snippet shows above?

@rosteen - thanks for the pointer, I didn't notice that snippet! That said, it has two major problem that makes it not work for my use case:

It copies the data into a new nddata, which is both a problem from a "what if I want to modify it" perspective and a "run out of memory" perspective. The fix is probably just one or two copy=False's, though, right?
The metadata isn't present, which in practice is a huge limitation when you load multiple files and want to figure out which is which in the UI

And one minor problem:

It's a lot of code for what should be one of the most basic user cases!

Is there a reason why a more complete version of that snippet (e.g. with the copy=False's in place and the metadata copied over) can't be implemented? I forget now if imviz uses unified-I/O or similar, but that would seem to be the way to do it...

Nov 11 '24 15:11 eteq

Yes sorry, I was only trying the ways we officially support, not the exact snippet using objects.

Nov 11 '24 15:11 pllim

I wouldn't call loading model objects into Imviz a basic use case. Most users would pass in the filename and be done with it.

Nov 11 '24 15:11 pllim

I disagree! There are many science use cases where the objects are the only available object. It is very not safe to assume a file is even present, and the regularity of which that's assumed is probably my single top "minor-to-moderate" annoyance when interacting with code as a scientist (not jdaviz specifically, which is much better than most, to be fair! I'm talking about the whole Python ecosystem).

At any rate, to be clear, I'm willing to spin up a PR myself to do this since I think I know the API-level interface of relevance. What I don't know is where the best place is to put more specific code like this - i.e., is there a unified I/O hook somewhere, or would I need to add a case to load_data? (and if the latter, is that an argument for trying to implement some sort of unified I/O or plugin approach?)

Nov 11 '24 15:11 eteq

There are many science use cases where the objects are the only available object. It is very not safe to assume a file is even present

Can you please clarify? Did someone send you a pickled object instead of the JWST data product?

Nov 11 '24 15:11 pllim

Checking for object at the load_data level is problematic because you need to import the object class at app level, which makes the package that holds the class a required dependency (in this case, it might be the entire pipeline). Unless you are willing to settle for static cls.__name__ checks only, which is pretty rigid because it does not account for subclasses or mixins.

And this is sounds like a feature request, not a bug.

Nov 11 '24 15:11 pllim

Ah, point taken this is a feature request. Will re-label!

Nov 11 '24 15:11 eteq

But still, how does a scientist only has access to an object and not the actual file?

Nov 11 '24 15:11 pllim

Can you please clarify? Did someone send you a pickled object instead of the JWST data product?

Some examples:

Data streamed from an online source such that you have an object in-memory but not on-disk
Data objects that you load in a notebook, then modify in some way before displaying (this is my use case here, which of course is possible with the various snippets, just in convenient)
Data objects that can be loaded from a file, but in a workflow one is exploring the data so needs to load them into python first, then do some exploration before knowing how/what you want to view. Then it's more natural to load the data object rather than the file by name (since otherwise you have to track the filename since some dataobjects, like jwst datamodels, don't carry an absolute path with them, which is error-prone and annoying).

In practice, the majority of my image use cases (probably of order 80%) fall under one of those three. Additionally, 1 will be much more common in a science platform setting, which will be mandatory usage for certain future astronomy settings (e.g. Rubin, and Roman)

Nov 11 '24 15:11 eteq

(to be clear: in all of those cases, one can simply write to disk and then load that data, but that's pretty annoying and sometimes problematic from a storage perspective. In practice with code where I'm forced to do that, I regularly say "not worth it, I'll find some other package or roll my own")

Nov 11 '24 15:11 eteq

For 1, streaming is currently not supported. It is an entirely different ticket/epic/whatever.

For 2 and 3, as Ricky pointed out, you can still stuff it into NDData. I am pretty sure it does not make copies but given @bmorris3 is also astropy.nddata maintainer, maybe he can confirm.

If you insist on native support for specific objects, please list them and point us to where they are documented. Thanks.

Nov 11 '24 15:11 pllim

@pllim could jwst datamodels be an optional dependency and be recognize at load_data like roman data are (if roman datamodels are installed)? calling on @bmorris3 too here

Nov 11 '24 18:11 camipacifici

For 1, streaming is currently not supported. It is an entirely different ticket/epic/whatever.

Sorry, miscommunication there. I mean something like:

>>> res = requests.get('http://something')
>>> image = PrimaryHDUOrWhatever.from_bytes(res.read())
>>> ... maybe do something with image ...
>>> imviz.load_data(image)

This is a CORE WORKFLOW for some major science cases. Of course you're right if there's an NDData adapter available that is already included, but I was trying to answer your question of what the science cases are and why you can't just use a file.

Nov 11 '24 18:11 eteq

I thought the NDData had the problem of the missing header? or did i misunderstand?

Nov 11 '24 18:11 camipacifici

The snippet in the docs indeed are missing the metadata. so you are understanding that right (although there's surely a way to do that, it's annoying that I have to do that with the datamodel object instead of it loading as straightforwardly as loading the file directly is)

Nov 11 '24 18:11 eteq

(Oh, and I'm not insisting on specific objects to support, but rather trying to understand how to plug in support for specific objects - i.e., like how unified I/O is used in specutils to let lots of people make PRs to support various data objects.)

Nov 11 '24 18:11 eteq

optional

Not pretty but I guess... still need to know what models to support though. There are many available in JWST, right?

https://github.com/spacetelescope/jdaviz/blob/1f57249e8daef6e12ef5ecb0dbc3a56b5ffe6662/jdaviz/configs/imviz/plugins/parsers.py#L22-L27

This is a CORE WORKFLOW

Yes, I understand and I agree. I was just saying that is already covered in a separate ticket somewhere (for Roman, I think).

missing header

That is a small doc update to fix. You can pass in meta= to set it. In ASDF, it should be accessible via .tree["meta"]. The Metadata would automatically flatten it.

https://github.com/spacetelescope/jdaviz/blob/1f57249e8daef6e12ef5ecb0dbc3a56b5ffe6662/jdaviz/configs/default/plugins/metadata_viewer/metadata_viewer.py#L134

plug in support for specific objects

As you pointed out, spectra are handled via specutils I/O. I don't think unified I/O for imaging should be in jdaviz and we don't really have imutils under Astropy, so I do not have a good answer if you are looking for generic custom I/O for arbitrary image formats. Unless I am forgetting something, @larrybradley ?

Nov 11 '24 19:11 pllim

generic custom I/O for arbitrary image formats

that's a good point, but just to be clear I'm realizing I was being a bit confusing in my wording above so will clarify here:

What I'm thinking about is something akin to the unified I/O in that it's pluggable and external players might contribute their own if they want support, but it's different in that it's not about file I/O but rather sort of a "translation layer" for data objects. I.e., some short snippets that translate a jwst datamodel (with certain limited assumptions) into an NDData like what the snippet does.

Having said that it's debatable where that code best lives: in jwst, astropy.nddata, or jdaviz? I see your point that I/O doesn't belong in jdaviz, but I'm less sure about the "API translation" layer...

Nov 11 '24 19:11 eteq

"API translation" layer

If we limit this request to ASDF models (instead of arbitrary format), I think it makes sense to have that layer in asdf-astropy. There is already a precedent with astropy.table.Table support, so why not astropy.nddata.NDData? https://asdf-astropy.readthedocs.io/en/latest/asdf-astropy/table.html

Nov 11 '24 19:11 pllim

Ah, that might be a good direction to consider, @pllim . I think the key, though, is ensuring it carries enough contextual information over. I think I worked out the magic incantation to do what I need, and I'll make a PR shortly to fix the doc section about this.

Note, though, that does not close this, because I still think it's really annoying to do it that way vs the above, but I'm with you that this might in practice than be an upstream fix needed to do it.

Nov 11 '24 19:11 eteq

Alright, I checked and at least for my science case the change I'm suggesting in #3279 makes an NDData object that seems to have no copying (copy=False by default in the NDData initializer). So I think that's a fine workaround for now. But I do think it would be helpful if I didn't have to explicitly make the NDData object myself every time given it's always the same for this specific data model. (But if I had to do something like imviz.load_data(mydatamodel.to_nddata()), that would be in practice as good as imviz.load_data(mydatamodel)!)

Nov 11 '24 20:11 eteq

If you open a feature request for to_nddata upstream, please cross link here. Thanks!

Nov 11 '24 20:11 pllim

jdaviz jdaviz copied to clipboard

[Feature Request] let imviz load jwst data model objects directly

Jdaviz component

Description

How to Reproduce

Expected behavior

Browser

Jupyter

Software versions

jdaviz
jdaviz copied to clipboard