
Unbundling aid update

Open Duncan-Knox opened this issue 7 years ago • 2 comments

Data series needed for unbundling aid update:

This issue tracks update progress on the tables that drive the Unbundling Aid section of the website.

List of Tables (included in last year's issue)

  • [ ] oda_new
    • [ ] Submitted for Review
  • [ ] oda_constant_new
    • [ ] Submitted for Review
  • [ ] oda_constant
    • [ ] Submitted for Review
  • [ ] oda
    • [ ] Submitted for Review

Dimension series needed for production of fact.oda_constant_new:

  • [ ] Donors: dimension.oecd_donor_to_di_id_map
  • [ ] Recipients: dimension.oecd_recipient_to_di_id_map
  • [ ] Sectors: dimension.oecd_crs_sector_code_3_digit_to_itep_sector_web_id_map
  • [ ] Channels: dimension.oecd_crs_channel_code_5_digit_to_itep_channel_web_id_map

Other

  • [x] Components of bundle: aid type, sector, finance type, TC marker, and keyword searches. Bundle coding is stored here: https://github.com/devinit/ddw-r-scripts/blob/master/fact.R. Duncan to check that this is all consistent with the latest OECD codelist.

Please edit issue if needed @Napho @akmiller01

Duncan-Knox, Dec 06 '18 16:12

Added to backlog for carry over into 2019.

Duncan-Knox, Dec 11 '18 11:12

Once null characters are removed from the raw CRS file, it can be read into R without error. The following replaces each null byte with a space, writes a cleaned copy, and reads that copy:

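library(data.table)  ## provides fread
## txt is the path to one raw CRS text file
r = readBin(txt, raw(), file.info(txt)$size)
r[r==as.raw(0)] = as.raw(0x20) ## replace null bytes with 0x20 = <space>
dir.create("crs_cleanup", showWarnings=FALSE) ## make sure the output folder exists
writeBin(r, paste0("crs_cleanup/",basename(txt)) )
tmp = fread(paste0("crs_cleanup/",basename(txt)),sep="|")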
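For anyone wanting to try the null-byte replacement without the full CRS download, here is a minimal self-contained sketch of the same technique. The helper name `strip_nulls`, the demo file, and its two columns are made up for illustration, and base R's `read.table` stands in for `fread` so that no extra package is needed.

```r
# Hypothetical helper (not part of ddw-r-scripts): copy `path` into `out_dir`
# with every null byte (0x00) replaced by a space (0x20), and return the
# path of the cleaned copy.
strip_nulls <- function(path, out_dir) {
  dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
  bytes <- readBin(path, what = raw(), n = file.info(path)$size)
  bytes[bytes == as.raw(0)] <- as.raw(0x20)
  out_path <- file.path(out_dir, basename(path))
  writeBin(bytes, out_path)
  out_path
}

# Build a tiny pipe-delimited demo file with one embedded null byte.
# The column names and values are invented, not real CRS fields.
demo_in <- file.path(tempdir(), "crs_demo.txt")
writeBin(
  c(charToRaw("donor|amount\nUK"), as.raw(0), charToRaw("|10\n")),
  demo_in
)

# Clean the file, then read the cleaned copy.
cleaned <- strip_nulls(demo_in, file.path(tempdir(), "crs_cleanup"))
demo <- read.table(cleaned, sep = "|", header = TRUE, stringsAsFactors = FALSE)
```

Because the null byte becomes a space rather than being dropped, affected fields may carry stray whitespace (here the donor value gains a trailing space), so a `trimws()` pass may be worth considering before loading.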

After that, LATIN1 encoding must be specified to import the data into SQL:

DELIMITER ',' ENCODING 'LATIN1' CSV HEADER;
COPY 3764422
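The comment does not say why LATIN1 is required. One plausible reason, offered here only as an assumption, is that the CRS text contains single-byte accented characters that are not valid UTF-8. A small sketch of how that could be checked in R, using an invented four-byte example rather than real CRS data:

```r
# Assumption for illustration: the data holds Latin-1 bytes such as 0xE9.
# These four bytes spell "café" in Latin-1; 0xE9 alone is not valid UTF-8.
latin1_bytes <- as.raw(c(0x63, 0x61, 0x66, 0xe9))
x <- rawToChar(latin1_bytes)

# validUTF8() is base R: it reports whether each string is well-formed UTF-8.
is_utf8_before <- validUTF8(x)

# Convert from Latin-1 to UTF-8; the é becomes the two bytes 0xC3 0xA9.
y <- iconv(x, from = "latin1", to = "UTF-8")
is_utf8_after <- validUTF8(y)
```

If `validUTF8()` returns `FALSE` for lines of the cleaned file, either declaring the encoding at import time (as in the comment above) or converting with `iconv()` beforehand should work.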

akmiller01, Jan 10 '19 13:01