readabs

Closes #53

Open HughParsonage opened this issue 6 years ago • 8 comments

For some reason

tbl1 <- read_abs(cat_no = "6401.0", tables = 1)
tbl2 <- read_abs(cat_no = "6401.0", tables = 2)
identical(tbl1, tbl2)

returns TRUE regardless of whether I'm using the fst or not (the files downloaded seem to be the same?)

HughParsonage avatar Jan 16 '20 14:01 HughParsonage

Codecov Report

Merging #63 into master will decrease coverage by 4.84%. The diff coverage is 51.66%.

@@            Coverage Diff             @@
##           master      #63      +/-   ##
==========================================
- Coverage   89.37%   84.52%   -4.85%     
==========================================
  Files          16       16              
  Lines         461      504      +43     
==========================================
+ Hits          412      426      +14     
- Misses         49       78      +29
Impacted Files        Coverage Δ
R/read_abs_local.R    74.35% <100%> (ø) :arrow_up:
R/read_abs.R          63.3% <37.2%> (-13.84%) :arrow_down:
R/fst-utils.R         93.54% <87.5%> (-6.46%) :arrow_down:

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update de17abf...12b02b1.

codecov[bot] avatar Jan 16 '20 14:01 codecov[bot]

> For some reason
>
> tbl1 <- read_abs(cat_no = "6401.0", tables = 1)
> tbl2 <- read_abs(cat_no = "6401.0", tables = 2)
> identical(tbl1, tbl2)
>
> returns TRUE regardless of whether I'm using the fst or not (the files downloaded seem to be the same?)

For some weird reason, the first table in 6401.0 is called "Tables 1 and 2", so presumably tables = 1 and tables = 2 both match the same file. See: https://www.abs.gov.au/AUSSTATS/abs@.nsf/DetailsPage/6401.0Sep%202019?OpenDocument

MattCowgill avatar Jan 17 '20 03:01 MattCowgill

tables should be a string (I think my current documentation on this gets it wrong). Some ABS releases (e.g. 6345.0: https://www.abs.gov.au/AUSSTATS/abs@.nsf/DetailsPage/6345.0Sep%202019?OpenDocument) have table numbers like "2a", "2b" and so on.
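To make the point about string identifiers concrete, here is a minimal sketch of how a requested table could be matched against table titles. `match_table_titles()` and the example titles are hypothetical and written only for illustration; this is not how readabs does it internally.

```r
# Hypothetical helper, not part of readabs: return the titles that mention
# any of the requested table identifiers.
match_table_titles <- function(titles, tables) {
  # Compare as strings so identifiers like "2a" work alongside 1 or "1"
  tables <- as.character(tables)
  hits <- vapply(
    titles,
    function(title) {
      # Keep only the part before the first "." or ":",
      # e.g. "Tables 1 and 2. Example" -> "Tables 1 and 2"
      prefix <- sub("[.:].*$", "", title)
      # Extract every identifier in that prefix: "1", "2", "2a", "10", ...
      ids <- regmatches(prefix, gregexpr("[0-9]+[a-z]?", prefix))[[1]]
      any(ids %in% tables)
    },
    logical(1)
  )
  unname(titles[hits])
}

# Illustrative titles only (assumed, not scraped from the ABS site)
titles <- c(
  "Tables 1 and 2. Example series",
  "Table 2a. Example series",
  "Table 2b. Example series",
  "Table 10. Example series"
)

match_table_titles(titles, 1)     # the combined "Tables 1 and 2" title
match_table_titles(titles, 2)     # the same title again
match_table_titles(titles, "2a")  # only the "Table 2a" title
```

Under this matching rule, requests for table 1 and table 2 land on the same combined title, which is consistent with the identical results reported above.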

MattCowgill avatar Jan 17 '20 03:01 MattCowgill

I accidentally marked this as 'ready for review' just now, sorry

MattCowgill avatar Jan 17 '20 09:01 MattCowgill

I think it should work with those, but it may use the wrong logic. If there are failures, let me know.

HughParsonage avatar Jan 17 '20 10:01 HughParsonage

Hi @HughParsonage Sorry, somehow this PR escaped my attention for several months. I am contemplating whether check_local should be TRUE or FALSE by default. I am leaning towards FALSE so that by default, the function fetches the latest data.

MattCowgill avatar Apr 30 '20 03:04 MattCowgill

Ideally of course, it would be easy: check_local = NA with the logic that if the latest data is the same as the local copy, just use the local copy. But it seems that's basically impossible to do without downloading the data anyway?
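The three-state idea could be sketched roughly as follows. `resolve_data_source()` and its arguments are hypothetical names, and the meaning assigned to TRUE and FALSE is an assumption; the hard part, deciding `local_is_current` without downloading, is deliberately left as an input.

```r
# Hypothetical sketch of the check_local = TRUE / FALSE / NA logic.
# None of these names are readabs API.
resolve_data_source <- function(check_local, local_exists,
                                local_is_current = NA) {
  stopifnot(is.logical(check_local), length(check_local) == 1L)
  # With no local copy there is nothing to choose between
  if (!local_exists) return("download")
  # Assumed meaning: TRUE always prefers the local copy,
  # FALSE always fetches the latest data
  if (isTRUE(check_local)) return("local")
  if (isFALSE(check_local)) return("download")
  # check_local is NA: use the local copy only when it is known to be current
  if (isTRUE(local_is_current)) "local" else "download"
}

resolve_data_source(NA, local_exists = TRUE, local_is_current = TRUE)
resolve_data_source(NA, local_exists = TRUE, local_is_current = NA)
```

When currency cannot be established (`local_is_current = NA`), this sketch falls back to downloading, which matches the preference for fetching the latest data by default.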

HughParsonage avatar Apr 30 '20 04:04 HughParsonage

Yes -- I'm not sure what you had in mind for this step? Sorry, I was under the impression you had a plan for that...

The only two ways I can think of to verify if local data is up to date (or likely to be up to date) are:

  1. Infer likely latest release date from the local file. The local file tells us the periodicity of the data and the latest observation date.
  2. Look up the requested table in the ABS Time Series Database.

Option 1 is fast, but error-prone. Option 2 is slower (though faster than downloading the table(s)) and requires internet connectivity.
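As a rough illustration of option 1, the check could look something like this. `local_copy_probably_current()` and the `release_lag_days` default are assumptions made up for this sketch; the real lag between a reference period and its publication varies by release, which is exactly why this approach is error-prone.

```r
# Hypothetical sketch of option 1, not readabs code: guess whether a newer
# release is likely to exist, using only the dates in the local file.
local_copy_probably_current <- function(obs_dates, today = Sys.Date(),
                                        release_lag_days = 60) {
  obs_dates <- sort(unique(as.Date(obs_dates)))
  stopifnot(length(obs_dates) >= 2L)
  # Periodicity: median gap between consecutive observations, in days
  gap_days <- median(as.numeric(diff(obs_dates)))
  # The next observation is expected one gap after the latest one and is
  # ASSUMED to be published release_lag_days after that date
  next_release <- max(obs_dates) + round(gap_days) + release_lag_days
  as.Date(today) < next_release
}

# Quarterly data whose latest observation is September 2019
quarterly <- seq(as.Date("2019-03-01"), as.Date("2019-09-01"), by = "quarter")
local_copy_probably_current(quarterly, today = as.Date("2019-12-15"))
local_copy_probably_current(quarterly, today = as.Date("2020-03-01"))
```

With these inputs the first call treats the local copy as probably current and the second treats it as probably stale. No network access is needed, but a wrong lag guess gives a wrong answer.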

MattCowgill avatar May 13 '20 09:05 MattCowgill