cdsapi

metadata information from cdsapi ?

Open perrette opened this issue 4 years ago • 2 comments

Hi and thanks for the very useful API,

As others have pointed out, more fine-grained documentation of the accepted parameters and available data is missing. Of course, the online data documentation provides much of that information, but it does not match cdsapi one to one, because variable and other names (model, scenario...) are spelled differently in their full-text and API forms. One therefore has to go through the full web interface to obtain the appropriate CDS API command. That works, but it could be simplified further in my opinion.

I was wondering if that could be built in as a specific request for each dataset. For instance, cdsapi.getchoices(dataset, field, **kwargs), such as:

cdsapi.getchoices('projections-cmip5-monthly-single-levels', 'model')
cdsapi.getchoices('projections-cmip5-monthly-single-levels', 'model', variable='2m_temperature')
cdsapi.getchoices('projections-cmip5-monthly-single-levels', 'model', variable='2m_temperature', scenario='rcp_8_5')

These requests target the model field in the projections-cmip5-monthly-single-levels dataset. More specifically, the three lines would return, respectively:

  • all models in that dataset
  • all models in that dataset with the variable 2m_temperature
  • all models in that dataset with the variable 2m_temperature and the scenario rcp_8_5
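
To make the proposed semantics concrete, here is a minimal sketch in plain Python. It is not part of cdsapi: getchoices is the hypothetical function suggested above, and VALID_COMBINATIONS is a made-up table of placeholder values (model_a, other_variable, ...) standing in for whatever metadata the CDS would have to expose.

```python
from typing import Dict, List

# Hypothetical metadata: each row is one valid combination of request values.
# The model and variable names below are placeholders, not real CDS values.
VALID_COMBINATIONS: Dict[str, List[Dict[str, str]]] = {
    "projections-cmip5-monthly-single-levels": [
        {"model": "model_a", "variable": "2m_temperature", "scenario": "rcp_8_5"},
        {"model": "model_a", "variable": "2m_temperature", "scenario": "rcp_4_5"},
        {"model": "model_b", "variable": "2m_temperature", "scenario": "rcp_4_5"},
        {"model": "model_c", "variable": "other_variable", "scenario": "rcp_8_5"},
    ]
}


def getchoices(dataset: str, field: str, **constraints: str) -> List[str]:
    """Return the sorted values of `field` that are compatible with `constraints`.

    Hypothetical helper: cdsapi does not provide this function.
    """
    rows = VALID_COMBINATIONS[dataset]
    # Keep only the rows that satisfy every keyword constraint.
    matching = [
        row for row in rows
        if all(row.get(key) == value for key, value in constraints.items())
    ]
    # Collect the distinct values of the requested field.
    return sorted({row[field] for row in matching if field in row})


if __name__ == "__main__":
    ds = "projections-cmip5-monthly-single-levels"
    print(getchoices(ds, "model"))
    print(getchoices(ds, "model", variable="2m_temperature"))
    print(getchoices(ds, "model", variable="2m_temperature", scenario="rcp_8_5"))
```

Each extra keyword argument narrows the set of matching combinations, so the returned list can only stay the same or shrink as constraints are added.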

Similarly, one could request available scenarios and variables:

cdsapi.getchoices('projections-cmip5-monthly-single-levels', 'variable')
cdsapi.getchoices('projections-cmip5-monthly-single-levels', 'scenario')

(one could also have cdsapi.getfields(dataset, required=False) for a list of all fields associated with one dataset, or optionally a list of all required fields)
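
A similar sketch for the getfields idea, again purely illustrative: getfields is the hypothetical function named above, and the DATASET_FIELDS table, including which fields are marked as required, is invented for the example and does not describe the real dataset.

```python
from typing import Dict, List

# Hypothetical per-dataset schema: field name -> whether the field is required.
# The required/optional flags are made up for illustration.
DATASET_FIELDS: Dict[str, Dict[str, bool]] = {
    "projections-cmip5-monthly-single-levels": {
        "variable": True,
        "model": True,
        "scenario": True,
        "area": False,
    }
}


def getfields(dataset: str, required: bool = False) -> List[str]:
    """List the fields of `dataset`; with required=True, only the required ones.

    Hypothetical helper: cdsapi does not provide this function.
    """
    fields = DATASET_FIELDS[dataset]
    return [
        name for name, is_required in fields.items()
        if is_required or not required
    ]


if __name__ == "__main__":
    ds = "projections-cmip5-monthly-single-levels"
    print(getfields(ds))                 # all fields
    print(getfields(ds, required=True))  # required fields only
```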

That would clearly come in handy in offline scripts.

perrette, May 12 '20 05:05

Another good API client that implements such functions: https://github.com/mwouts/world_bank_data (wb.get_indicators(), wb.get_countries() etc... before actual download wb.get_series(indicator=..., countries=...)). Very useful.

perrette, May 12 '20 05:05

I found what I was looking for: https://cp-availability.ceda.ac.uk

I suggest the cdsapi module could interface with that database somehow, to avoid having to enter the variables one by one, but this is less pressing now. Thanks.

EDIT: it seems that there is no 100% match between the variable names in that database and those in the CDS API, which makes it difficult to use together with the CDS API.

perrette, May 18 '20 12:05

Hi @perrette,

Thanks for your comments and suggestions. We agree completely, but this is not possible with the current web API that underlies the Python-based cdsapi. We are currently modernising our systems and hope to include such functionality in the future. We hope that this will be available for public use in 2024; until then, we will have to make do with using the web interface to gather the values for the requests.

Many thanks, Eddy

EddyCMWF, Feb 28 '23 09:02