Monthly products and/or ACORN-SAT for get_historical()?

Open jimjam-slam opened this issue 6 years ago • 11 comments

I'm writing an app at the moment that involves pulling a combination of:

  1. raw daily temp and precip station data (products IDCJAC00[10, 11]),
  2. raw daily precip data (IDCJAC0009)
  3. raw monthly temp data (IDCJAC000[2, 4, 5–8]) and
  4. ACORN-SAT temperature data

get_historical() covers cases (1) and (2) but not (3) or (4). My understanding is that implementing case (3) would be similar to (1) and (2) but would use a different ancillary file to determine the p_c parameter in the URL (not just substituting the NCC code for each product).

Case (4), on the other hand, would be fairly cruisy: the URL is just http://www.bom.gov.au/climate/change/hqsites/data/temp/[tmax/tmin].[station_id].daily.csv. But it might be more appropriate to offer it as an entirely separate function to get_historical().
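As a rough illustration of how little case (4) needs, the URL pattern above can be filled in with plain string formatting. This is only a sketch, written in Python rather than bomrang's R; the helper name `acorn_sat_url` is hypothetical, the station ID is borrowed from later in this thread purely as a placeholder, and nothing here checks that the endpoint still responds.

```python
ACORN_SAT_TEMPLATE = (
    "http://www.bom.gov.au/climate/change/hqsites/data/temp/"
    "{variable}.{station_id}.daily.csv"
)


def acorn_sat_url(station_id: str, variable: str) -> str:
    """Build an ACORN-SAT daily CSV URL (hypothetical helper, not part of bomrang)."""
    if variable not in ("tmax", "tmin"):
        raise ValueError("variable must be 'tmax' or 'tmin'")
    # Station IDs appear as six digits with leading zeros in this thread,
    # so they are kept as strings and padded rather than parsed as integers.
    return ACORN_SAT_TEMPLATE.format(
        variable=variable, station_id=station_id.zfill(6)
    )


print(acorn_sat_url("086338", "tmax"))
```

Keeping this apart from the p_c lookup used for the other products is one reason a separate function may be the cleaner fit.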

I'm not sure whether you're interested in implementing either of these cases in bomrang, but since I need to work on this anyway I thought I'd mention it! Happy to plug away at it myself and either do a PR or just document it here for future consideration 😄 If you can offer any insight into how the ancillary file helps fill in the additional p_c parameter, I'd also appreciate it a lot!

jimjam-slam avatar May 22 '19 04:05 jimjam-slam

Hey James,

Good to hear from you. Nice suggestions.

I can see case (3) being a nice addition to get_historical() say with an extra parameter period = c("daily", "monthly") or similar.

Agree case (4) sounds like its own thing.

I'm not 100% sure, but I think the determination for that URL parameter is covered in the source at: https://github.com/ropensci/bomrang/blob/5c32b89e9939a550c25aca4dda9b89a41e1c3f12/R/get_historical.R#L293

#' BOM data is available via URL endpoints but the arguments are not (well)
#' documented. This function first obtains an auxilliary data file for the given
#' station/measurement type which contains the remaining value p_c. It then
#' constructs the approriate resource URL.

deanmarchiori avatar May 22 '19 05:05 deanmarchiori

I'm with @deanmarchiori on both items. @jonocarroll, what say ye about the modification to get_historical()?

It all sounds like a nice addition. We're happy to have PRs here. That's basically how this package came into being.

adamhsparks avatar May 22 '19 05:05 adamhsparks

Sounds good to me. I don't quite follow why the monthly data needs to be scraped vs summarised from the daily data, but if there's a URL for it then I'm happy to have it processed consistently. Is the monthly data available for all the codes (rain, solar, etc...)?

The p_c sleuthing is undoing whatever intentional/accidental obfuscation the BOM hides their URLs behind, but it requires the secondary file.

jonocarroll avatar May 22 '19 06:05 jonocarroll

Thanks, everyone!

Yeah, I'm pretty comfortable manually aggregating daily data to monthly myself—in fact, where we use ACORN-SAT here we have to do that anyway—but since the products are available and our team prefers to use existing BOM products where possible, that's what I'm doing 😅

One obstacle could be that .get_ncc() currently retrieves the list of stations available for each known NCC code at: http://www.bom.gov.au/climate/data/lists_by_element/alphaAUS_[NCC].txt.

https://github.com/ropensci/bomrang/blob/5c32b89e9939a550c25aca4dda9b89a41e1c3f12/R/get_historical.R#L209

Those lists are accessible at http://www.bom.gov.au/climate/cdo/about/sitedata.shtml, but the monthly options don't appear to be there on that page or at the expected URLs.

Further, the ancillary file containing the value of p_c for the existing products (retrieved in .get_zip_url()) doesn't appear to work for the monthly ones. For example, for Olympic Park (086338), here's daily tmin (NCC: 123):

http://www.bom.gov.au/jsp/ncc/cdio/weatherData/av?p_stn_num=086338&p_display_type=availableYears&p_nccObsCode=123

086338||,2013:-1490879938,2014:-1490879938,2015:-1490879938,2016:-1490879938,2017:-1490879938,2018:-1490879938,2019:-1490879938

But here's highest monthly tmin (NCC 42):

http://www.bom.gov.au/jsp/ncc/cdio/weatherData/av?p_stn_num=086338&p_display_type=availableYears&p_nccObsCode=42

086338||
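To make the difference between the two responses concrete, here is a minimal sketch of parsing them. It is in Python rather than R, the function name `parse_available_years` is hypothetical, and the format (a station number, `||`, then comma-separated `year:p_c` pairs) is inferred only from the two examples above, so it may not hold for every product.

```python
def parse_available_years(text: str) -> dict:
    """Map year -> p_c from an availableYears response (format inferred, not documented)."""
    # Drop the "086338||" station prefix and keep whatever follows it.
    _, _, payload = text.strip().partition("||")
    years = {}
    for pair in payload.split(","):
        if not pair:
            continue  # skips the empty field after "||" and empty payloads
        year, _, p_c = pair.partition(":")
        years[int(year)] = int(p_c)
    return years


daily = (
    "086338||,2013:-1490879938,2014:-1490879938,2015:-1490879938,"
    "2016:-1490879938,2017:-1490879938,2018:-1490879938,2019:-1490879938"
)
monthly = "086338||"

print(parse_available_years(daily))    # one p_c entry per year, 2013 to 2019
print(parse_available_years(monthly))  # empty dict: no p_c to be had
```

The empty result for the monthly code is the crux of the problem: there is nothing in that response from which to read p_c.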

The URLs for the monthly data clearly operate using the same parameter, though, so it might just be a matter of figuring out how the ancillary file URLs need to differ:

http://www.bom.gov.au/jsp/ncc/cdio/weatherData/av?
  p_display_type=monthlyZippedDataFile&
  p_stn_num=086338&
  p_c=-1490870144&
  p_nccObsCode=40&
  p_startYear=

jimjam-slam avatar May 22 '19 06:05 jimjam-slam

I should add that I haven't tried to use monthly rainfall or solar, but it looks like the NCC codes are:

  • Monthly rainfall (probably total): 139
  • Monthly mean solar exposure: 203

(Can't find any mention of monthly extremes for these two, though.)

jimjam-slam avatar May 22 '19 07:05 jimjam-slam

~~Unfortunately, the JavaScript used on the monthly HTML pages also appears to scrape p_c from the HTML.~~ And it isn't the p_c for the monthly product, it's the p_c for the corresponding daily one.

The only thing I can think of from here is if av? can take a value for p_display_type other than availableYears that can provide p_c 😕

EDIT: polling a few stations (040764, 023090, 086338), it seems like different products have values of p_c that vary by a fixed, or nearly fixed, amount:

  • monthly highest tmax - monthly mean tmax = 243
  • daily tmax - monthly highest tmax = 10 628 or 10 627

This is probably a rabbit hole for me to go down, but it seems doubtful that p_c is a hash of some kind.

(Please let me know if my thinking out loud isn't welcome in this thread!)

jimjam-slam avatar May 24 '19 02:05 jimjam-slam

I'll also have a play if I get the chance. Could you please link a few daily and monthly data pages for me to test?

jonocarroll avatar May 24 '19 05:05 jonocarroll

Product here is not the same as product here. That has been my contribution to this so far :joy_cat: :croissant:

softloud avatar May 24 '19 05:05 softloud

Hi - sorry, I've been lurking on this for a while.

Last year I did some scraping on the ACORN-SAT data while I was learning Shiny and arguing with a climate change denier. I've got a notebook here for it 

http://rpubs.com/benmoretti/434904

and a Shiny dashboard for the aggregated data

https://benmoretti.shinyapps.io/ACORN_SAT_stations_data/

There might be some useful data parsing code in there 

Cheers

Ben

ghost avatar May 24 '19 05:05 ghost

Thanks @jonocarroll! Here're some pages:

| Station | Product | Product NCC code | Page link | Download link |
| --- | --- | --- | --- | --- |
| 040764 | Daily maximum temperature | 122 | http://www.bom.gov.au/jsp/ncc/cdio/weatherData/av?p_nccObsCode=122&p_display_type=dailyDataFile&p_startYear=&p_c=&p_stn_num=040764 | http://www.bom.gov.au/jsp/ncc/cdio/weatherData/av?p_display_type=dailyZippedDataFile&p_stn_num=040764&p_c=-332371462&p_nccObsCode=122&p_startYear=2019 |
| 040764 | Daily minimum temperature | 123 | http://www.bom.gov.au/jsp/ncc/cdio/weatherData/av?p_nccObsCode=123&p_display_type=dailyDataFile&p_startYear=&p_c=&p_stn_num=040764 | http://www.bom.gov.au/jsp/ncc/cdio/weatherData/av?p_display_type=dailyZippedDataFile&p_stn_num=040764&p_c=-332371658&p_nccObsCode=123&p_startYear=2019 |
| 040764 | Monthly mean maximum temperature | 36 | http://www.bom.gov.au/jsp/ncc/cdio/weatherData/av?p_nccObsCode=36&p_display_type=dataFile&p_startYear=&p_c=&p_stn_num=040764 | http://www.bom.gov.au/jsp/ncc/cdio/weatherData/av?p_display_type=monthlyZippedDataFile&p_stn_num=040764&p_c=-332360591&p_nccObsCode=36&p_startYear= |
| 040764 | Monthly lowest temperature (lowest tmin) | 43 | http://www.bom.gov.au/jsp/ncc/cdio/weatherData/av?p_nccObsCode=43&p_display_type=dataFile&p_startYear=&p_c=&p_stn_num=040764 | http://www.bom.gov.au/jsp/ncc/cdio/weatherData/av?p_display_type=monthlyZippedDataFile&p_stn_num=040764&p_c=-332361034&p_nccObsCode=43&p_startYear= |
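For anyone wanting to poke at the offsets, here is a small sketch that pulls p_c out of download links like those in the table above and compares products. It is Python rather than R, the helper name `p_c_from_url` is hypothetical, and the values are simply copied from the four 040764 links.

```python
from urllib.parse import parse_qs, urlparse


def p_c_from_url(url):
    """Pull the integer p_c value out of a download URL's query string."""
    query = parse_qs(urlparse(url).query)
    return int(query["p_c"][0])


example = (
    "http://www.bom.gov.au/jsp/ncc/cdio/weatherData/av?"
    "p_display_type=dailyZippedDataFile&p_stn_num=040764"
    "&p_c=-332371462&p_nccObsCode=122&p_startYear=2019"
)
print(p_c_from_url(example))

# p_c values for station 040764, keyed by NCC code, copied from the table.
p_c = {122: -332371462, 123: -332371658, 36: -332360591, 43: -332361034}

print(p_c[122] - p_c[123])  # daily tmax minus daily tmin
print(p_c[36] - p_c[43])    # monthly mean tmax minus monthly lowest tmin
print(p_c[122] - p_c[36])   # daily tmax minus monthly mean tmax
```

For this station the last difference comes out at -10871, whose magnitude equals 10 628 + 243, in line with the offsets noted earlier in the thread. Four values from a single station are far too few to treat that as a rule.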

jimjam-slam avatar May 24 '19 06:05 jimjam-slam

Thanks very much, @benmoretti! As a repeat ACORN-SAT user, I'm very grateful that those URLs are a lot less ambiguous 😁

jimjam-slam avatar May 24 '19 06:05 jimjam-slam

From the README

This package has been archived due to BOM's ongoing unwillingness to allow programmatic access to their data and actively blocking any attempts made using this package or other similar efforts.

maelle avatar Jan 10 '23 08:01 maelle