Make WPC parser more fault tolerant

Open dopplershift opened this issue 5 months ago • 3 comments

Description Of Changes

When parsing a part of the file fails, allow parsing to continue with the remaining parts.
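
A minimal sketch of the general pattern, not the actual change in this PR: each part is parsed inside its own `try` block, so one malformed part is logged and skipped instead of aborting the whole file. The names `parse_parts` and `parse_one` are hypothetical stand-ins and are not MetPy functions.

```python
import logging

log = logging.getLogger(__name__)


def parse_parts(parts, parse_one):
    """Parse each part independently, skipping the ones that fail.

    `parts` and `parse_one` are hypothetical stand-ins, not MetPy names.
    """
    parsed = []
    for index, part in enumerate(parts):
        try:
            parsed.append(parse_one(part))
        except (ValueError, IndexError) as exc:
            # Record the problem and move on to the remaining parts.
            log.warning('Could not parse part %d (%r): %s', index, part, exc)
    return parsed


# Toy usage: the malformed middle entry is skipped, the rest still parse.
results = parse_parts(['1', 'oops', '3'], int)
```

Catching only the exception types seen in practice (here `ValueError` and `IndexError`) keeps unrelated bugs visible rather than silently swallowed.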

Checklist

  • [x] Closes #3921
  • [x] Tests added

dopplershift avatar Sep 25 '25 20:09 dopplershift

FWIW, here is an example product from 2000 that results in an IndexError: CODSUS.txt

  File "/home/akrherz/projects/MetPy/src/metpy/io/text.py", line 140, in parse_wpc_surface_bulletin
    boundary = LineString(boundary) if len(boundary) > 1 else boundary[0]
                                                        ~~~~~~~~^^^
IndexError: list index out of range
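
Reading the traceback, `boundary[0]` is reached when `len(boundary) > 1` is false, which includes the empty list. A hedged sketch of a guard for that case follows; `build_boundary` and `make_line` are hypothetical names, with `make_line` standing in for a geometry constructor such as shapely's `LineString`. Returning `None` for an empty boundary is an assumption here, and the real fix may prefer to skip the entry or warn.

```python
def build_boundary(points, make_line=tuple):
    """Return None for no points, the lone point for one, else a line.

    `build_boundary` and `make_line` are hypothetical; `make_line` stands in
    for a geometry constructor such as shapely's LineString.
    """
    if not points:
        # The case behind the IndexError above: nothing to index into.
        return None
    if len(points) == 1:
        return points[0]
    return make_line(points)
```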

akrherz avatar Sep 26 '25 02:09 akrherz

More FWIW, I have ~7_000 CODSUS products for 2024 and just one fails with the current MetPy main branch :) CODSUS_2024.txt

  File "/home/akrherz/projects/MetPy/src/metpy/io/text.py", line 139, in parse_wpc_surface_bulletin
    boundary = [Point(_decode_coords(point)) for point in boundary]
                      ~~~~~~~~~~~~~~^^^^^^^
  File "/home/akrherz/projects/MetPy/src/metpy/io/text.py", line 59, in _decode_coords
    lat = float(f'{lat[:2]}.{lat[2:]}') * flip
          ~~~~~^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: could not convert string to float: '.'
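
The `'.'` in the message suggests the latitude string was empty when it was formatted, though that is an inference from the traceback alone. A small sketch of validating the token before converting it, so that the error names the offending input; `decode_lat` is a hypothetical helper, and only the `float(...)` expression is taken from the traceback.

```python
def decode_lat(lat, flip=1):
    """Decode a digit string such as '395' into 39.5, rejecting bad input.

    Hypothetical helper; only the float(...) expression below comes from
    the traceback.
    """
    if not (lat.isascii() and lat.isdigit()):
        # An empty string lands here instead of producing float('.').
        raise ValueError(f'Malformed latitude token: {lat!r}')
    return float(f'{lat[:2]}.{lat[2:]}') * flip
```

This still raises `ValueError`, so a caller that skips unparseable parts can handle it the same way, but the message now shows which token was bad.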

And here's a script to run them in bulk:

import sys
import traceback
from io import BytesIO

import httpx
from metpy.io import parse_wpc_surface_bulletin
from tqdm import tqdm


def main(argv):
    """Parse every CODSUS product for a year and save the ones that fail."""
    year = int(argv[1])
    resp = httpx.get(
        "https://mesonet.agron.iastate.edu/cgi-bin/afos/retrieve.py?"
        f"sdate={year}-01-01&edate={year + 1}-01-01&limit=9999"
        "&pil=CODSUS&fmt=text",
        timeout=60,
    )
    # Stop early on an HTTP error rather than parsing an error page.
    resp.raise_for_status()
    failure = 0
    # Products in the response are separated by the ETX (\003) control character.
    progress = tqdm(resp.content.split(b"\003"))
    for prod in progress:
        progress.set_description(f"Failures: {failure}")
        bio = BytesIO(prod)
        try:
            parse_wpc_surface_bulletin(bio, year=year)
        except Exception:
            # Print the traceback and keep the offending product for debugging.
            traceback.print_exc()
            with open(f"CODSUS_fail_{failure:04d}.txt", "wb") as fh:
                fh.write(prod)
            failure += 1


if __name__ == "__main__":
    main(sys.argv)

akrherz avatar Sep 26 '25 13:09 akrherz

Thanks @akrherz! That's really helpful, and we can definitely include a few more fixes here.

dopplershift avatar Sep 26 '25 16:09 dopplershift