openair
Is it possible to get daily averages for each London borough as a whole, as opposed to each individual borough site?
Firstly, amazing package so thank you!
I'm trying to get familiar with openair and was wondering if the following is possible:
For my project, I want to get daily air pollution data for each London borough as a whole, ultimately creating one dataset that holds pollution measurements over a number of years for every London borough (e.g. Barking and Dagenham, Bromley, etc.).
However, I have noticed that there are several site locations that belong to each borough, e.g.:
BG1 | Barking and Dagenham - Rush Green | Suburban
BG2 | Barking and Dagenham - Scrattons Farm | Suburban
BG3 | Barking and Dagenham - North Street | Kerbside
I have tried the following to get an average for one borough, assuming site codes starting with 'BG' belong to Barking and Dagenham. This gives the results I wanted, but I will later need to add a new variable 'borough' and set all its entries to 'Barking and Dagenham'.
```r
library(openair)
BG <- importKCL(site = c("bg1", "bg2", "bg3"), year = 2015:2021)
BG <- timeAverage(BG, avg.time = "day")
BG
```
The issue is that there are 30+ London boroughs, and I don't think this is the most efficient way to do it for every single one, especially having to identify which sites fall within each borough and then combine them all into one dataset where each entry can be identified by its respective borough.
Does the openair package have a more efficient method that I may have overlooked? Is this something that could be implemented? Or is there a better way utilising R code?
Thanks in advance!
It depends on your data. Based on your example, I see that the borough is present in the site name as "borough - street name". You can create a new column in the data frame holding the borough by removing the "- street name" part:
BG$borough <- sub(" -.*", "", BG$xxx) # with xxx being the correct variable
With that column in your data frame you can use timeAverage() with type = "borough".
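As a small sketch of that idea on synthetic site names (the names below mirror the KCL "Borough - Street name" pattern from the question; `BG$site` in the comment is an assumption about the column name in the imported data):

```r
# Synthetic site names in the "Borough - Street name" format used by the KCL network
sites <- c("Barking and Dagenham - Rush Green",
           "Barking and Dagenham - Scrattons Farm",
           "Bromley - Harwood Avenue")

# Strip everything from " -" onwards, leaving just the borough name
boroughs <- sub(" -.*", "", sites)
boroughs
#> [1] "Barking and Dagenham" "Barking and Dagenham" "Bromley"

# On a real openair data frame the same idea would be (assuming a 'site' column):
# BG$borough <- sub(" -.*", "", BG$site)
# daily <- timeAverage(BG, avg.time = "day", type = "borough")
```

Note the leading space in the pattern `" -.*"`, which avoids leaving a trailing space on the borough name.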
Note that you can have a look at importMeta(), which includes the borough name and other information. You can link this meta data to the AQ data using a join (probably left_join() of the AQ data and meta data, with site and code as the join variables). Here's how to get the meta data (the 5th column, Authority, is the borough name):
```
info <- importMeta(source = "kcl", all = TRUE)
> info
# A tibble: 1,003 × 12
   code  site    Address    la_id Authority  site_type os_grid_x os_grid_y latitude longitude OpeningDate
   <chr> <chr>   <chr>      <int> <chr>      <chr>         <int>     <int>    <dbl>     <dbl> <dttm>
 1 EF1   Eastbo… Holly Pla…   206 Eastbourne Urban Ba…    560155    103154     50.8    0.272  2019-01-15 00:00:00
 2 NL1   Brent … Brent Cro…   571 Brent Cro… Industri…         0         0     49.8   -7.56   2019-04-25 01:00:00
 3 NL2   Brent … Brent Cro…   571 Brent Cro… Industri…         0         0     49.8   -7.56   2019-04-25 01:00:00
 4 NL3   Brent … Brent Cro…   571 Brent Cro… Industri…         0         0     49.8   -7.56   2019-04-25 01:00:00
 5 HP0   Honor … King's Co…   651 DEFRA Par… Urban Ba…    536473    174128     51.4   -0.0374 2018-11-27 00:00:00
 6 SKC   Southw… South Cir…    28 Southwark  Roadside     533698    173268     51.4   -0.0777 2021-07-22 01:00:00
 7 MS3   Maryle… Cabin opp…   650 DEFRA Bla… Kerbside     528125    182016     51.5   -0.155  1997-08-12 10:00:00
 8 SK8   Southw… Tower Bri…    28 Southwark  Roadside     533488    179804     51.5   -0.0782 2019-06-01 01:00:00
 9 ES3   Eastle… The Point…   180 Eastleigh  Roadside     445310    119148     51.0   -1.36   2019-01-01 00:00:00
10 OP1   Honor … King's Co…   651 DEFRA Par… Urban Ba…    536473    174128     51.4   -0.0374 2019-07-04 01:00:00
# … with 993 more rows, and 1 more variable: ClosingDate <dttm>
```
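As a sketch of that join on toy data (the column names follow the importMeta() output above, but the values, including the site code "BY7", are illustrative; base merge() with all.x = TRUE is used here as a stand-in for dplyr::left_join()):

```r
# Toy AQ data, shaped roughly like importKCL() output
aq <- data.frame(
  date = as.POSIXct(c("2021-01-01", "2021-01-01"), tz = "UTC"),
  site = c("Barking and Dagenham - Rush Green", "Bromley - Harwood Avenue"),
  code = c("BG1", "BY7"),
  nox  = c(40.1, 55.3)
)

# Toy meta data, shaped like importMeta() output (Authority holds the borough)
meta <- data.frame(
  site      = c("Barking and Dagenham - Rush Green", "Bromley - Harwood Avenue"),
  code      = c("BG1", "BY7"),
  Authority = c("Barking and Dagenham", "Bromley")
)

# Left join on site and code; dplyr::left_join(aq, meta, by = c("site", "code"))
# would give the same result
joined <- merge(aq, meta, by = c("site", "code"), all.x = TRUE)
joined$Authority
#> [1] "Barking and Dagenham" "Bromley"
```

Every AQ row now carries its borough, so timeAverage() can group on it directly.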
Thanks for this. When using df1 <- importKCL(site = ...), is there a way to select all sites instead of having to list each one?
All sites can be selected by using importMeta() to obtain all of the KCL meta data, and then passing its "code" column straight into importKCL():
# list all site codes
kcl_codes <- openair::importMeta("kcl")$code
# import data for every site
all_data <- importKCL(site = kcl_codes)
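Putting the pieces of this thread together, the whole workflow can be condensed into one sketch (not a definitive recipe: it assumes network access to the KCL servers, that the imported data has site and code columns matching the meta data, and that downloading every site for 2015-2021 will take a long time):

```r
library(openair)
library(dplyr)

# 1. All KCL meta data; the Authority column holds the borough name
meta <- importMeta(source = "kcl", all = TRUE)

# 2. Import AQ data for every site (slow: many sites and years)
aq <- importKCL(site = meta$code, year = 2015:2021)

# 3. Attach the borough name via a join on site and code
aq <- left_join(aq,
                select(meta, site, code, borough = Authority),
                by = c("site", "code"))

# 4. Daily averages per borough
daily <- timeAverage(aq, avg.time = "day", type = "borough")
```

If you only want London boroughs, meta could first be filtered (e.g. on Authority) before step 2, so only the relevant site codes are downloaded.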