MetPy
Make WPC parser more fault tolerant
Description Of Changes
When parsing a part of the file fails, allow parsing to continue with the remaining parts.
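As a rough illustration of the idea (not the code in this PR), a fault-tolerant loop might look like the sketch below. `parse_parts` and `parse_one` are hypothetical names, and catching only `IndexError` and `ValueError` is an assumption based on the failures reported in this thread.

```python
import logging

log = logging.getLogger(__name__)


def parse_parts(parts, parse_one):
    """Parse each part independently, skipping (and logging) the ones that fail."""
    parsed = []
    for index, part in enumerate(parts):
        try:
            parsed.append(parse_one(part))
        except (IndexError, ValueError) as exc:
            # Record the failure and move on instead of aborting the whole file.
            log.warning("Skipping part %d (%r): %s", index, part, exc)
    return parsed


# Hypothetical usage: the second part is malformed, but the others still parse.
results = parse_parts(["12", "oops", "34"], int)
print(results)  # [12, 34]
```

Logging the skipped part keeps the failure visible to the caller while still returning everything that parsed cleanly.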
Checklist
- [x] Closes #3921
- [x] Tests added
FWIW, here is an example from 2000 that results in an IndexError:
CODSUS.txt
  File "/home/akrherz/projects/MetPy/src/metpy/io/text.py", line 140, in parse_wpc_surface_bulletin
    boundary = LineString(boundary) if len(boundary) > 1 else boundary[0]
                                                              ~~~~~~~~^^^
IndexError: list index out of range
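The `list index out of range` on `boundary[0]` means the list of points was empty for that part. A minimal sketch of a guard for that case follows. `build_boundary` is a hypothetical helper, and plain tuples stand in for the Shapely geometries so the example has no dependencies.

```python
def build_boundary(points):
    """Return None for no points, the lone point for one, else a list of points.

    Hypothetical stand-in for the LineString/Point branch in the traceback.
    """
    if not points:
        # An empty list would make points[0] raise IndexError.
        return None
    return list(points) if len(points) > 1 else points[0]


print(build_boundary([]))  # None
print(build_boundary([(-95.0, 40.0)]))  # (-95.0, 40.0)
```

Whether an empty boundary should be dropped or reported is a design choice for the parser; returning `None` here is only one option.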
More FWIW, I have ~7,000 CODSUS products for 2024 and just one fails with the current MetPy main branch :) CODSUS_2024.txt
  File "/home/akrherz/projects/MetPy/src/metpy/io/text.py", line 139, in parse_wpc_surface_bulletin
    boundary = [Point(_decode_coords(point)) for point in boundary]
                      ~~~~~~~~~~~~~~^^^^^^^
  File "/home/akrherz/projects/MetPy/src/metpy/io/text.py", line 59, in _decode_coords
    lat = float(f'{lat[:2]}.{lat[2:]}') * flip
          ~~~~~^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: could not convert string to float: '.'
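The `'.'` in that message is what the f-string produces when `lat` is an empty string, since both slices are then empty. The sketch below reproduces the message and shows one way to validate the digits first. `insert_decimal` is a hypothetical helper, not MetPy's `_decode_coords`.

```python
def insert_decimal(digits, whole=2):
    """Insert a decimal point after `whole` digits, e.g. '345' -> 34.5.

    Hypothetical helper; rejects empty or non-numeric input before float().
    """
    if not digits.isdigit():
        raise ValueError(f"expected only digits, got {digits!r}")
    return float(f"{digits[:whole]}.{digits[whole:]}")


# Reproduce the failure from the traceback with an empty latitude string.
lat = ""
try:
    float(f"{lat[:2]}.{lat[2:]}")
except ValueError as exc:
    message = str(exc)
print(message)  # could not convert string to float: '.'

print(insert_decimal("345"))  # 34.5
```

Raising a descriptive error up front makes it easier to tell a truncated coordinate group from other parsing problems.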
And here's a script to run them in bulk:
import sys
import traceback
from io import BytesIO

import httpx
from metpy.io import parse_wpc_surface_bulletin
from tqdm import tqdm


def main(argv):
    """Go Main Go."""
    year = int(argv[1])
    # Fetch the year's CODSUS products as text.
    resp = httpx.get(
        "https://mesonet.agron.iastate.edu/cgi-bin/afos/retrieve.py?"
        f"sdate={year}-01-01&edate={year + 1}-01-01&limit=9999"
        "&pil=CODSUS&fmt=text",
        timeout=60,
    )
    failure = 0
    # Products in the response are separated by the ETX control character.
    progress = tqdm(resp.content.split(b"\003"))
    for prod in progress:
        progress.set_description(f"Failures: {failure}")
        bio = BytesIO(prod)
        try:
            parse_wpc_surface_bulletin(bio, year=year)
        except Exception:
            # Print the traceback and save the failing product for inspection.
            traceback.print_exc()
            with open(f"CODSUS_fail_{failure:04d}.txt", "wb") as fh:
                fh.write(prod)
            failure += 1


if __name__ == "__main__":
    main(sys.argv)
Thanks @akrherz! That's really helpful, and we can definitely include a few more fixes here.