
ERA5 reanalysis data retrieval is unexpectedly slow using CDS API (takes 45 minutes to download 2.1MB data)

Open Ted456 opened this issue 5 years ago • 4 comments

Basically, the data downloaded via the CDS API is only 2.1 MB, but it takes 45 minutes. This is unusual because I can download more than 2 GB via ecmwfapi in the same amount of time.

Below is my code for retrieving ERA5 data:

code start

import calendar
import cdsapi

server = cdsapi.Client()

def retrieve_era5():
    """
    A function to demonstrate how to iterate efficiently over several years
    and months for a particular ERA5 request.

    Change the variables below to adapt the iteration to your needs. You can
    use the variable 'target' to organise the requested data in files as you
    wish. In the example below the data are organised in one file per month
    (e.g. "era5_daily_201510.grb").
    """
    yearStart = 1998
    yearEnd = 1998
    monthStart = 1
    monthEnd = 1
    for year in range(yearStart, yearEnd + 1):
        Year = str(year)
        for month in range(monthStart, monthEnd + 1):
            Month = str(month)
            numberOfDays = calendar.monthrange(year, month)[1]
            Days = [str(x) for x in range(1, numberOfDays + 1)]
            target = "era5_1h_daily_0to70S_100Eto120W_025025_quv_%04d%02d.nc" % (year, month)
            era5_request(Year, Month, Days, target)

def era5_request(Year, Month, Days, target):
    """
    An ERA5 request for reanalysis pressure-level data.

    Change the keywords below to adapt it to your needs
    (e.g. to add or remove levels, parameters, times, etc.).
    """
    server.retrieve(
        'reanalysis-era5-pressure-levels',
        {
            'product_type': 'reanalysis',
            'format': 'netcdf',
            'variable': ['specific_humidity', 'u_component_of_wind',
                         'v_component_of_wind'],
            'year': Year,
            'month': Month,
            'day': Days,
            'pressure_level': ['300', '350', '400', '450', '500', '550',
                               '600', '650', '700', '750', '775', '800',
                               '825', '850', '875', '900', '925', '950',
                               '975', '1000'],
            'time': ['00:00', '01:00', '02:00', '03:00', '04:00', '05:00',
                     '06:00', '07:00', '08:00', '09:00', '10:00', '11:00',
                     '12:00', '13:00', '14:00', '15:00', '16:00', '17:00',
                     '18:00', '19:00', '20:00', '21:00', '22:00', '23:00'],
            'area': [0, 100, -1, 101],
        },
        target)

if __name__ == '__main__':
    retrieve_era5()

code end

This code is deliberately small to start with: it tries to download specific_humidity, u_component_of_wind, and v_component_of_wind from 1998-01-01 to 1998-01-31, at 1-hour temporal resolution and 0.25° x 0.25° spatial resolution, on pressure levels from 300 hPa to 1000 hPa, over the area 1°S to 0, 100°E to 101°E.
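For context on why the output size can be misleading: even though the spatial subset is tiny, this request spans every hour of the month on 20 pressure levels for 3 variables, so the CDS backend presumably has to touch one 2-D field per (variable, level, hour, day) combination before subsetting the area. A quick count (plain Python, no CDS calls; the numbers come straight from the request above):

```python
import calendar

# Parameters taken from the request above.
n_variables = 3   # specific_humidity, u_component_of_wind, v_component_of_wind
n_levels = 20     # 300 hPa to 1000 hPa
n_hours = 24      # hourly data, 00:00 to 23:00
n_days = calendar.monthrange(1998, 1)[1]  # 31 days in January 1998

# One 2-D field per (variable, level, hour, day) combination.
n_fields = n_variables * n_levels * n_hours * n_days
print(n_fields)  # 44640 fields for this one-month request
```

So this "2.1 MB" request still asks the backend to process tens of thousands of fields, which may be part of the explanation for the slow turnaround.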

Below is a screenshot showing the result of running this code: [screenshot: CDS API download]. For comparison, below is a screenshot showing a download via ecmwfapi: 2.18 GB retrieved in 23 minutes. I'm not sure what is going on here. [screenshot: ecmwfapi download]

Could anyone give me some advice? Many thanks.

Ted456 avatar Sep 04 '20 22:09 Ted456

Also facing the same issue. Anybody know a fix/workaround for this?

zoj613 avatar Jun 03 '22 13:06 zoj613

Same here. The problem comes from downloading along the time dimension. Unfortunately, the only workaround found is to use Google Earth Engine instead of the CDS API.
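If the bottleneck really does scale with the number of time steps per request, one thing worth trying while staying on the CDS API is to split the month into one request per day, so each request covers fewer time steps. This is only a sketch of that idea, not a confirmed fix, and the helper name `era5_daily_requests` is my own:

```python
import calendar

# Pressure levels copied from the original monthly request.
PRESSURE_LEVELS = ['300', '350', '400', '450', '500', '550', '600', '650',
                   '700', '750', '775', '800', '825', '850', '875', '900',
                   '925', '950', '975', '1000']

def era5_daily_requests(year, month):
    """Build one CDS request dict per day of the given month."""
    n_days = calendar.monthrange(year, month)[1]
    requests = []
    for day in range(1, n_days + 1):
        requests.append({
            'product_type': 'reanalysis',
            'format': 'netcdf',
            'variable': ['specific_humidity', 'u_component_of_wind',
                         'v_component_of_wind'],
            'year': str(year),
            'month': str(month),
            'day': str(day),
            'pressure_level': PRESSURE_LEVELS,
            'time': ['%02d:00' % h for h in range(24)],
            'area': [0, 100, -1, 101],
        })
    return requests
```

Each returned dict can then be passed to `cdsapi.Client().retrieve('reanalysis-era5-pressure-levels', req, target)` with a per-day target filename. Whether this actually helps depends on how the CDS queue handles the extra requests, so treat it as an experiment.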

ealonsogzl avatar Jun 29 '22 08:06 ealonsogzl

Same here. The problem comes from downloading along the time dimension. Unfortunately, the only workaround found is to use Google Earth Engine instead of the CDS API.

You got an example of how to do this using Google Earth Engine?

zoj613 avatar Jun 29 '22 08:06 zoj613

Hi all, the CDS is currently suffering from some technical issues which are affecting the speed of downloads. You can follow the progress here: https://confluence.ecmwf.int/pages/viewpage.action?pageId=278543465

EddyCMWF avatar Jun 29 '22 13:06 EddyCMWF