radis
Consider upstreaming exomol access to astroquery
🔖 Feature description
astroquery has a growing number of modules for querying line and molecular databases (some of the others may be useful to you, too: https://astroquery.readthedocs.io/en/latest/#line-list-services). Exposing more of them by upstreaming would be very welcome.
I fully acknowledge that this would mean refactoring here, and probably a more involved PR than a simple copy into astroquery, so I'm opening this issue just as a reminder that we would very much welcome contributions like this (after all, https://github.com/astropy/astroquery/issues/1242 has been open for a while...).
👉 Why you want this feature!!
No response
Perfect timing. We are currently working with the ExoJAX team on a common RADIS/ExoJAX API (https://github.com/radis/radis/pull/478) for line database management (for line-by-line spectrum computations). A question that arose (https://github.com/HajimeKawahara/exojax/discussions/257#discussioncomment-2756469) was where to eventually host the API. We decided on a RADIS module for the short term, but astroquery may be better suited in the future.
@HajimeKawahara any thoughts?
I think it may need to be both, e.g., astroquery for the interaction with the upstream databases, while you can still build a common RADIS API on top of astroquery.
I'm pinging @keflavich, too, as our current line lists mostly focus on services that serve his science interests, but OTOH I would love to see them expanded to other ones, as well as getting more usage and therefore real-life testing for robustness.
(the benefit of starting with it in astroquery is that not much refactoring will be needed when it's moved upstream).
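The two-layer split suggested above can be sketched as follows. This is only an illustration under stated assumptions: `LineListQuery`, `CommonLineAPI`, and `fake_fetch` are hypothetical names, not astroquery or RADIS classes, and the records are made-up placeholder values rather than real line data.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class LineListQuery:
    """Hypothetical low-level layer: only talks to the upstream database."""

    fetch: Callable[[str], List[dict]]  # molecule name -> raw records

    def query_lines(self, molecule):
        return self.fetch(molecule)


class CommonLineAPI:
    """Hypothetical high-level layer: spectroscopy logic built on the query layer."""

    def __init__(self, backend):
        self.backend = backend

    def lines_in_range(self, molecule, wn_min, wn_max):
        # Delegate the download to the backend, then filter locally.
        records = self.backend.query_lines(molecule)
        return [r for r in records if wn_min <= r["wavenumber"] <= wn_max]


def fake_fetch(molecule):
    """Stand-in for a network query; returns placeholder records only."""
    table = {
        "CO": [
            {"wavenumber": 2100.0, "A": 10.0},
            {"wavenumber": 4300.0, "A": 1.0},
        ]
    }
    return table.get(molecule, [])


api = CommonLineAPI(LineListQuery(fake_fetch))
selected = api.lines_in_range("CO", 2000.0, 2200.0)
```

With this shape, the query layer could move upstream later without touching the code that depends on `CommonLineAPI`.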
As @bsipocz said, the molecular line databases implemented in astroquery (CDMS, JPLSPEC, and Splatalogue) are mm-focused, but it would be great to have IR databases included as well. I've made use of ExoMol in the past, but I had to download and mangle the files myself; I didn't come up with a consistent query interface. So I'm happy to support database-querying elements in astroquery.
We already have a working version that handles ExoMol recommended databases (parsing the static website to infer the recommended database), unzips and parses the files, and returns a Pandas / Vaex DataFrame (example here). So it should be fairly easy to adapt it to astroquery's architecture and query parameters. The only thing we absolutely need is to keep a full Vaex approach (writing a local file and returning the DataFrame), because ExoMol is a HUGE database and anything in-RAM ends up exhausting memory.
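To illustrate the "write a local file instead of loading everything in RAM" constraint, here is a minimal standard-library sketch. It is not the RADIS implementation: the function names are hypothetical, and it assumes a bz2-compressed, whitespace-separated transitions file whose first three columns are upper state ID, lower state ID, and Einstein A coefficient.

```python
import bz2
import csv
import itertools


def iter_transitions(trans_path):
    """Yield (upper_id, lower_id, einstein_a) tuples one line at a time."""
    with bz2.open(trans_path, "rt") as fh:
        for line in fh:
            fields = line.split()
            if len(fields) < 3:
                continue  # skip blank or malformed lines
            yield int(fields[0]), int(fields[1]), float(fields[2])


def cache_transitions(trans_path, cache_path, chunk_size=100_000):
    """Stream a compressed transitions file into a local CSV cache.

    At most chunk_size parsed rows are held in memory at any time.
    Returns the number of rows written.
    """
    rows = iter_transitions(trans_path)
    n_written = 0
    with open(cache_path, "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(["i_upper", "i_lower", "A"])
        while True:
            chunk = list(itertools.islice(rows, chunk_size))
            if not chunk:
                break
            writer.writerows(chunk)
            n_written += len(chunk)
    return n_written
```

The resulting local file could then be opened lazily by an out-of-core DataFrame library such as Vaex, so that only the local path and a lazy handle are passed around. That step is left out here to keep the sketch dependency-free.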
Hosting the common API in astroquery might be a good idea, but there are a couple of things to consider.
One is Vaex support, as @erwanp already mentioned: memory management is a real challenge when using ExoMol in practice. Another is the scope of the common API. I'm an astronomer, so I'm OK with it, but shouldn't the common API be open to non-astronomy fields as well? In that case, hosting it in astroquery might cause problems (I don't know much about this).
I'm not really concerned about the second point, given that, for instance, RADIS users already use astropy.units although most of them come from the plasma physics or radiative transfer communities. Bridges are good.
Nice. Then the only issue looks to be Vaex support. I chatted with @ykawashima a bit about this today. The current plan is that ExoJAX receives the DataFrame structure from the common API, and we will separate JAX support from the common API and move it to the ExoJAX interface.