
Regional & Metropolitan vars

Open antaldaniel opened this issue 6 years ago • 3 comments

I was wondering if you could include in the vignette an example of filtering regional or metropolitan variables? Neither filtering strategy seems to work for these. The SDMX query is several pages long and I cannot really work with it, and the other filtering method does not seem to work for me either. However, without effective filtering the data table exceeds the API limit.

Is there a way to access, for example, any of the regional GDP variables with the package?

antaldaniel, Oct 25 '19 18:10

You probably have to get the regional IDs first and then do a bit of string interpolation. The example below retrieves disposable income (on a PPP basis) for all Swedish regions in 2015 and 2016.

Let me know if this still doesn't solve your issue.

library(tidyverse)
library(OECD)

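# Download the dataset's structure; struc$REG_ID lists region IDs and labels
struc <- get_data_structure("REGION_ECONOM")

# Keep the IDs containing "SE" and join them with "+", as the query expects
se_regions <- struc$REG_ID %>% 
  filter(str_detect(id, "SE")) %>% 
  pull(id) %>% 
  paste0(collapse = "+")

# Insert the region IDs into the second position of the dimension key
query <- sprintf("2.%s.SNA_2008.INCOME_DISP.PC_REAL_PPP.ALL.2015+2016", se_regions)

df <- get_dataset("REGION_ECONOM", query)

# Attach the region labels and keep only the columns of interest
inner_join(df, struc$REG_ID, by = c("REG_ID" = "id")) %>% 
  set_names(tolower) %>% 
  select(reg_id, label, time, obsvalue)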
# A tibble: 16 x 4
   reg_id label                time  obsvalue
   <chr>  <chr>                <chr>    <dbl>
 1 SE31   North Middle Sweden  2015     19768
 2 SE22   South Sweden         2015     20671
 3 SE23   West Sweden          2015     21164
 4 SE21   Småland with Islands 2015     20024
 5 SE32   Central Norrland     2015     19947
 6 SE11   Stockholm            2015     23947
 7 SE12   East Middle Sweden   2015     20230
 8 SE33   Upper Norrland       2015     20080
 9 SE32   Central Norrland     2016     20180
10 SE11   Stockholm            2016     24606
11 SE31   North Middle Sweden  2016     20043
12 SE33   Upper Norrland       2016     20367
13 SE12   East Middle Sweden   2016     20577
14 SE21   Småland with Islands 2016     20503
15 SE22   South Sweden         2016     21021
16 SE23   West Sweden          2016     21766

expersso, Oct 28 '19 13:10

Thank you very much. I was thinking about some link to standard SDMX codes, but I think your approach is a good hack. However,

all_regions <- struc$REG_ID %>% 
  pull(id) %>% 
  paste0(collapse = "+")

query <- sprintf("2.%s.SNA_2008.INCOME_DISP.PC_REAL_PPP.ALL.2015+2016", 
                 all_regions)

df <- get_dataset("REGION_ECONOM", query)

will lead to

Error in rsdmx::readSDMX(url) : HTTP request failed with status: 414 Request-URI Too Long

I was looking on sdmx.org for something like "ALL_IDs" or "ALL.REGIONS", or a similar shortcode.

Obviously, your approach is a great help; it just may take a lot of time to loop through all countries, or country pairs.

I wanted to download a panel of regional data, but the SDMX code exporter gives URIs that are even too long for RStudio to handle, not to mention the API.

antaldaniel, Oct 30 '19 11:10

You could also try leaving the dimension empty in the query, e.g.

df <- get_dataset("PRICES_CPI", "AUS.CPALTT01..M")

An empty position acts as a wildcard for that dimension, so this returns all measures.
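If the same trick carries over to REGION_ECONOM, leaving the region position empty (a key starting with "2..SNA_2008") might return every region in one request, though I have not tested that and the result may still exceed the API limit. Failing that, the 414 error could be avoided by sending the region IDs in batches. Below is a minimal sketch of that idea; `build_region_queries()` and the chunk size of 50 are made up for illustration and are not part of the OECD package.

```r
# Hypothetical helper (not part of the OECD package): split region IDs into
# batches and build one query string per batch, so each URL stays short.
build_region_queries <- function(ids, chunk_size = 50,
    template = "2.%s.SNA_2008.INCOME_DISP.PC_REAL_PPP.ALL.2015+2016") {
  stopifnot(chunk_size >= 1)
  # Assign each ID to a batch of at most `chunk_size` elements
  batch <- ceiling(seq_along(ids) / chunk_size)
  chunks <- split(ids, batch)
  # Join the IDs of each batch with "+" and insert them into the key
  vapply(chunks,
         function(x) sprintf(template, paste0(x, collapse = "+")),
         character(1), USE.NAMES = FALSE)
}

# Small offline example: four IDs in batches of three give two queries
example_queries <- build_region_queries(c("SE11", "SE12", "SE21", "SE22"),
                                        chunk_size = 3)

# Untested usage against the API (needs the OECD package and network access):
# struc   <- OECD::get_data_structure("REGION_ECONOM")
# queries <- build_region_queries(struc$REG_ID$id, chunk_size = 50)
# parts   <- lapply(queries, function(q) OECD::get_dataset("REGION_ECONOM", q))
# df      <- do.call(rbind, parts)
```

The chunk size is a guess; the largest value that stays under the server's URL length limit would have to be found by trial.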

expersso, Oct 30 '19 12:10