datahub Unbundling aid update

Data series needed for unbundling aid update:

This issue tracks update progress on the tables that drive the Unbundling Aid section of the website.

List of Tables (included in last year's issue)

[ ] oda_new
- [ ] Submitted for Review
[ ] oda_constant_new
- [ ] Submitted for Review
[ ] oda_constant
- [ ] Submitted for Review
[ ] oda
- [ ] Submitted for Review

Dimension series needed for production of fact.oda_constant_new:

[ ] Donors: dimension.oecd_donor_to_di_id_map
[ ] Recipients: dimension.oecd_recipient_to_di_id_map
[ ] Sectors: dimension.oecd_crs_sector_code_3_digit_to_itep_sector_web_id_map
[ ] Channels: dimension.oecd_crs_channel_code_5_digit_to_itep_channel_web_id_map

Other

[x] Components of the bundle: aid type, sector, finance type, TC marker, and keyword searches. Bundle coding is stored here: https://github.com/devinit/ddw-r-scripts/blob/master/fact.R. Duncan to check that this is all consistent with the latest OECD codelist.

Please edit the issue if needed @Napho @akmiller01

Dec 06 '18 16:12 Duncan-Knox

Added to backlog for carry over into 2019.

Dec 11 '18 11:12 Duncan-Knox

Once null characters are removed from raw CRS, it can be read into R without error like so:

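library(data.table) # provides fread
# txt is the path to one raw CRS text file
r = readBin(txt, raw(), file.info(txt)$size) # read the whole file as raw bytes
r[r==as.raw(0)] = as.raw(0x20) ## replace with 0x20 = <space>
writeBin(r, paste0("crs_cleanup/",basename(txt)) ) # crs_cleanup/ must already exist
tmp = fread(paste0("crs_cleanup/",basename(txt)),sep="|")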
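
For anyone who wants to try the null-byte cleanup in isolation, here is a minimal sketch that wraps the same steps in a function and runs it on a tiny made-up file. The helper name `clean_nul_bytes`, the demo file, and the use of a temporary directory are assumptions for illustration, not part of the real pipeline. It reads the result with base R's `read.delim` rather than `fread` so it runs without extra packages.

```r
# Hypothetical helper (not from the original scripts): replace NUL bytes in a
# file with spaces and write the cleaned copy into out_dir.
clean_nul_bytes <- function(path, out_dir = "crs_cleanup") {
  # Create the output directory if it is missing
  dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
  # Read the whole file as raw bytes
  bytes <- readBin(path, what = raw(), n = file.info(path)$size)
  # Replace every NUL byte (0x00) with a space (0x20)
  bytes[bytes == as.raw(0)] <- as.raw(0x20)
  out_path <- file.path(out_dir, basename(path))
  writeBin(bytes, out_path)
  out_path
}

# Made-up pipe-delimited demo file containing a single NUL byte
demo_in <- file.path(tempdir(), "demo_crs.txt")
writeBin(c(charToRaw("a|b\n1|x"), as.raw(0), charToRaw("y\n")), demo_in)

demo_out <- clean_nul_bytes(demo_in, out_dir = file.path(tempdir(), "crs_cleanup"))

# Re-read the cleaned bytes to confirm no NUL bytes remain
cleaned <- readBin(demo_out, what = raw(), n = file.info(demo_out)$size)

# The cleaned file now parses as an ordinary pipe-delimited table
demo_table <- read.delim(demo_out, sep = "|")
```

Replacing the byte with a space instead of dropping it keeps the file length and field offsets unchanged, which matches the approach in the snippet above.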

After that, LATIN1 encoding must be specified to import it into SQL.

DELIMITER ',' ENCODING 'LATIN1' CSV HEADER;
COPY 3764422

Jan 10 '19 13:01 akmiller01