astroquery
astroquery copied to clipboard
heasarc: Adding extra parsing for non-fits standard units
This is to make the module compatible with the more strict unit parsing in astropy 5.1.
See details and example code in #2333
I'm still running into an ERG related warning at writing out time:
WARNING: UnitsWarning: 'ERG/S/CM^2' did not parse as fits unit: At col 0, Unit 'ERG' not supported by the FITS standard. Did you mean EG, ER, Eg or erg? If this is meant to be a custom unit, define it with 'u.def_unit'. To have it recognized inside a file reader or other code, enable it with 'u.add_enabled_units'. For details, see https://docs.astropy.org/en/latest/units/combining_and_defining.html [astropy.units.core]
And unfortunately, it seems that some of the parsing gone very wrong, too, e.g. S
understandably parsed as u.S
rather than u.second
.
I'm copying the first row here for troubleshooting:
<Row index=0>
NAME RA DEC COUNT_RATE COUNT_RATE_ERROR FLUX FLUX_ERROR HARDNESS_RATIO ALT_NAME AP_EXPOSURE BII COUNTS COUNTS_ERROR ERROR_RADIUS EXPOSURE HARDNESS_RATIO_NEG_ERR HARDNESS_RATIO_POS_ERR HB_AP_EXPOSURE HB_COUNT_RATE HB_COUNT_RATE_ERROR HB_COUNTS HB_COUNTS_ERROR HB_EXPOSURE HB_FLUX HB_FLUX_ERROR HB_SNR LII SB_AP_EXPOSURE SB_COUNT_RATE SB_COUNT_RATE_ERROR SB_COUNTS SB_COUNTS_ERROR SB_EXPOSURE SB_FLUX SB_FLUX_ERROR SB_SNR SNR SOURCE_NUMBER SEARCH_OFFSET_
deg deg ct / S ct / S ERG/S/CM^2 ERG/S/CM^2 S deg ct ct arcsec S S ct / S ct / S ct ct S ERG/S/CM^2 ERG/S/CM^2 deg S ct / S ct / S ct ct S ERG/S/CM^2 ERG/S/CM^2
bytes21 float64 float64 float64 float64 float64 float64 float64 bytes15 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 float64 int64 bytes47
--------------------- ---------- -------- ---------- ---------------- ---------- ---------- -------------- --------------- ----------- --------- ------- ------------ ------------ -------- ---------------------- ---------------------- -------------- ------------- ------------------- --------- --------------- ----------- ---------- ------------- ------- ---------- -------------- ------------- ------------------- --------- --------------- ----------- ---------- ------------- ------- ------- ------------- -----------------------------------------------
CXOC J100036.9+022614 150.154020 2.437240 3.52e-04 4.96e-05 4.74e-15 6.69e-16 0.00 [ECV2009] 989 0 42.284067 0.0 0.0 1.09 162070 0.00 0.00 0 0.00e+00 2.72e-04 0.0 0.0 0 0.00e+00 7.13e-15 0.00 236.594835 0 3.28e-04 4.45e-05 0.0 0.0 161300 1.79e-15 2.43e-16 7.37 7.09 989 16.648 (150.01000622572633,2.1999950476067367)
I'm cc-ing @mhvk from the units side of the problem and @taldcroft as Tables and Chandra expert to see whether there are better approaches than add_enabled_aliases
for this problem. I'm certain similar issues may come up in other modules, after all hardly any data providers follow the fits standards.
add_enabled_aliases
was added in astropy 4.3, so we'll need a different approach for older version, as I don't think it's a serious enough reason to drop support just yet.
@bsipocz - I don't know of any better approaches. I can only apologize for the original catalog not being FITS compliant since I was involved in that project.
I am surprised that HEASARC does not validate input tables at the point of ingest since that would be the right place to make sure their archive is clean, but clearly that is water under the bridge at this point.
Hmm, another approach is to ditch the fits file format and switch to votable. The units come up properly in that case, so after all this might be a heasarc API issue during the process of converting the DB response to the desired output format, rather than an issue with the original data holding and ingestion.
That would not address the desire in #2333 to write out the table in fits format. 1) we either fix all the issues to support that or 2) advocate for astronomers to drop the usage of fits for catalogues in favour of votable.
Option 2 would certainly make life much much easier 😅
(Issues with saving as fits are:
- name columns are parsed as
object
dtype, so we need to convert them to str manually - description is too long, so we both get a complaint about the usage of HIERARCH cards, but then a ValueError, too for it's being too long
- multiple warnings about e.g. multiple slashes in units
- non-parsing units
cc @volodymyrss and @zoghbi-a to see whether they have any recommendations. Long term I suppose it would be nice to switch to a VO backend, but for now, a workaround PR like the one proposed would fix the annoyances that came yup in the fornax use case.
I think also switch to VOTable of heasarc is also not too complex: https://github.com/astropy/astroquery/pull/2353 The problem is only that fields get renamed, mostly capitalization changes, but not only. This can break things for the users. But maybe it's acceptable at some point.
This PR was a deadend and as the module is now being refactored, I'm closing it without merging. The issue for the bug remains open and we should check if it's still present when the refactoring is done.