
FastHerbie error when reading ECMWF ensemble in xarray.

Open blaylockbk opened this issue 1 year ago • 0 comments

Discussed in https://github.com/blaylockbk/Herbie/discussions/116

Originally posted by csteele2 November 1, 2022

I was trying to use Herbie to download and process the European ensemble data. I'm not sure whether I misunderstand fast Herbie, but it appears to download the entire dataset, and one timestep seems to take far longer than my loop takes for all 6 days. The code I am using is below. The commented-out fast Herbie call ran for about an hour on what may have been a single timestep; I'm not sure, and I killed it when it started another loop, because my own loop takes 45 minutes. Even so, in the three days I have been trying, I have not been able to download a complete dataset for any cycle.

from datetime import timedelta

import xarray as xr
from herbie import Herbie  # assumed import path; older releases used herbie.archive

# model_run is assumed to be a datetime for the model cycle, defined earlier

variable = "tp"  # tp for precipitation
tp_all = []
valid_times = []
forecast_hours_qpf = range(3, 147, 3)  # every 3 h from 3 h to 144 h
model_search_string = ":" + variable + ":sfc:"

# fast Herbie attempt, commented out because it was much slower than the loop below
#ptotal = fast_Herbie_xarray(DATES=model_run.strftime('%Y-%m-%d %H:00'), model="ecmwf", product="enfo", fxx=forecast_hours_qpf, max_threads=5, search_string=model_search_string)

for t in forecast_hours_qpf:
    H = Herbie(model_run.strftime('%Y-%m-%d %H:00'), model="ecmwf", product="enfo", fxx=t)
    tp = H.xarray(model_search_string)[0]
    #tp = tp.rename({"number":"pertubation"})
    tp_all.append(tp)
    valid_times.append(model_run + timedelta(hours=t))

# Join the per-step results along the forecast step dimension
ptotal = xr.concat(tp_all, dim='step')

The most common problem is that one or more timesteps have the number (member/perturbation) coordinate as 0 instead of an array of length 50. If I go back and assign those 50 members, it's clear something weird happened, as revealed by this spot check of a single point (look at the 05-03Z column): image
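
A quick way to see which timesteps are affected before concatenating is to check the size of the number dimension for each step. The sketch below is only an illustration, not part of Herbie: the helper name find_incomplete_steps is hypothetical, it assumes the collapsed coordinate shows up as a scalar (so that "number" is missing from the dataset's .sizes mapping), and it uses stand-in objects with made-up grid sizes in place of real xarray datasets so it runs without downloading anything.

```python
from types import SimpleNamespace


def find_incomplete_steps(datasets, forecast_hours, expected_members=50):
    """Return the forecast hours whose dataset lacks a full `number` dimension.

    Hypothetical helper: `datasets` is a list of xarray-like objects, one per
    entry in `forecast_hours`.
    """
    bad = []
    for fxx, ds in zip(forecast_hours, datasets):
        # xarray's `.sizes` maps dimension names to lengths; a scalar
        # coordinate is not a dimension, so it is absent from the mapping.
        n_members = ds.sizes.get("number", 0)
        if n_members != expected_members:
            bad.append(fxx)
    return bad


# Stand-ins that mimic only the `.sizes` attribute of an xarray Dataset.
# The grid sizes are arbitrary illustration values.
full = SimpleNamespace(sizes={"number": 50, "latitude": 451, "longitude": 900})
collapsed = SimpleNamespace(sizes={"latitude": 451, "longitude": 900})

print(find_incomplete_steps([full, collapsed, full], [3, 6, 9]))  # prints [6]
```

With the loop above, the same call would be find_incomplete_steps(tp_all, forecast_hours_qpf), and any hours it returns are the ones worth downloading again or inspecting by hand.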

I have not yet attempted to download just those times separately, but I would think this has to be a problem with the processing rather than the data, right? This is probably far more data than a typical use case, but I like me my ensemble data.

Other than these issues, kudos on this though, it makes dealing with this big data so so so so so so so much easier, and really helps elevate some serious science game.

blaylockbk · Nov 30 '22 05:11