MetPy icon indicating copy to clipboard operation
MetPy copied to clipboard

GempakSounding.snxarray fails to return list of soundings

Open wx4stg opened this issue 1 year ago • 1 comments

What went wrong?

Reading this sounding file: https://github.com/wx4stg/LRC/blob/main/data/240713.ua with GempakSounding and attempting to call snxarray() on it gives an error

Operating System

Linux

Version

1.6.2

Python Version

3.12

Code to Reproduce

from metpy.io import GempakSounding
gs = GempakSounding('240713.ua')
gs.snxarray()

Errors, Traceback, and Logs

Cell In[21], line 1
----> 1 gs.snxarray()

File ~/micromamba/envs/LRC/lib/python3.12/site-packages/metpy/io/gempak.py:2196, in GempakSounding.snxarray(self, station_id, station_number, date_time, state, country)
   2192     raise KeyError('No stations were matched with given parameters.')
   2194 sndno = [(s.DTNO, s.SNDNO) for s in matched]
-> 2196 data = self._unpack_merged(sndno) if self.merged else self._unpack_unmerged(sndno)
   2198 soundings = []
   2199 for snd in data:

File ~/micromamba/envs/LRC/lib/python3.12/site-packages/metpy/io/gempak.py:1554, in GempakSounding._unpack_unmerged(self, sndno)
   1549             for iprm, param in enumerate(parameters['name']):
   1550                 sounding[part.name][param] = (
   1551                     np.array(packed_buffer[iprm::nparms], dtype=np.float32)
   1552                 )
-> 1554     soundings.append(self._merge_sounding(sounding))
   1555 return soundings

File ~/micromamba/envs/LRC/lib/python3.12/site-packages/metpy/io/gempak.py:2066, in GempakSounding._merge_sounding(self, parts)
   2063     _interp_logp_pressure(merged, self.prod_desc.missing_float)
   2065 # Interpolate missing data
-> 2066 _interp_logp_data(merged, self.prod_desc.missing_float)
   2068 # Add below ground MAN data
   2069 if merged['PRES'][0] != self.prod_desc.missing_float and bgl > 0:

File ~/micromamba/envs/LRC/lib/python3.12/site-packages/metpy/io/gempak.py:241, in _interp_logp_data(sounding, missing)
    239         bdata[param] = val[iabove]
    240 vlev = sounding['PRES'][i]
--> 241 outdata = _interp_parameters(vlev, adata, bdata, missing)
    242 sounding[var1][i] = outdata[var1]
    243 if var2 is not None:

File ~/micromamba/envs/LRC/lib/python3.12/site-packages/metpy/io/gempak.py:458, in _interp_parameters(vlev, adata, bdata, missing)
    452 between = (((pres1 < pres2) and (pres1 < vlev)
    453            and (vlev < pres2))
    454            or ((pres2 < pres1) and (pres2 < vlev)
    455            and (vlev < pres1)))
    457 if not between:
--> 458     raise ValueError('Current pressure does not fall between levels.')
    459 elif pres1 <= 0 or pres2 <= 0:
    460     raise ValueError('Pressure cannot be negative.')

ValueError: Current pressure does not fall between levels.

wx4stg avatar Jul 14 '24 16:07 wx4stg

I'll debug this later...

wx4stg avatar Jul 14 '24 16:07 wx4stg

It took some time and effort, but I believe I know what is happening. snxarray() fails on the first sounding it finds with an issue. Since these are observed soundings, it is the sounding merging process that fails. The particular sounding in this case is the 12Z sonde from the station NZWP (Whenuapai, New Zealand). Running the file through GEMPAK snlist and processing the data for that time/location, things work just fine. At first I thought there might be some issues with the interpolation that occurs, but further investigation revealed the issue stemmed from the how merging of the WMO encoded groups was implemented.

This particular sounding had a PPCC (Mandatory winds above 100 mb) group, but did not have a TTCC (mandatory data above 100 mb) group. The current implementation of the _merge_sounding() method will insert the missing levels from the PPCC group (the same is done for the PPAA group). GEMPAK, on the other hand, will only merge data from the PPAA/PPCC groups if the levels in those groups already exist within the sounding data structure (coming from the TTAA/TTCC groups, respectively). Some upper air stations only report mandatory wind data. In that case, GEMPAK does not attempt to do interpolation on those levels without temperature/dewpoint data.

The result of all this was that you have missing height data where you would otherwise not (i.e., interp_moist_height() does not have temp/dewpoint data on a few MAN levels). Interpolation then fails due to using bisect_left() with the missing data (-9999) and you get the wrong index for insertion. Whether the use of bisect_left() has a small part to play in the bug is not entirely known. I suspect that it is just not extremely common to find upper air data with a PPCC group but not a TTCC group. Ultimately, adapting the MAN wind merging code to behave like GEMPAK does allows the file to process properly. That would suggest that the usage of bisect_left() works fine as long as the portions of the merging code behave as they should.

In my testing so far, I can read the problem file attached to this issue and all tests still pass. I can draw up a PR to fix this issue fairly quickly. The only question I have before moving forward with that is whether or not to implement any error handling that would warn users of a problematic station within a file but still allow processing of other data. My vote would be to keep error handling as is and just fix as needed. Let me know if you think otherwise.

nawendt avatar Apr 08 '25 19:04 nawendt