datahub
Unbundling aid update
Data series needed for unbundling aid update:
This issue tracks update progress for the tables that drive the Unbundling Aid section of the website.
List of Tables (included in last year's issue)
- [ ] oda_new
  - [ ] Submitted for Review
- [ ] oda_constant_new
  - [ ] Submitted for Review
- [ ] oda_constant
  - [ ] Submitted for Review
- [ ] oda
  - [ ] Submitted for Review
Dimension series needed for production of fact.oda_constant_new:
- [ ] Donors: dimension.oecd_donor_to_di_id_map
- [ ] Recipients: dimension.oecd_recipient_to_di_id_map
- [ ] Sectors: dimension.oecd_crs_sector_code_3_digit_to_itep_sector_web_id_map
- [ ] Channels: dimension.oecd_crs_channel_code_5_digit_to_itep_channel_web_id_map
Other
- [x] Components of bundle: aid type, sector, finance type, TC marker, and keyword searches. Bundle coding is stored here: https://github.com/devinit/ddw-r-scripts/blob/master/fact.R. Duncan to check that this is all consistent with the latest OECD codelist.
Please edit issue if needed @Napho @akmiller01
Added to backlog for carry over into 2019.
Once the null characters are removed from the raw CRS file, it can be read into R without error, like so (`txt` is the path to one raw CRS file):
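library(data.table)  ## provides fread
r = readBin(txt, raw(), file.info(txt)$size)  ## read the whole file as raw bytes
r[r == as.raw(0)] = as.raw(0x20)  ## replace null bytes with 0x20 = <space>
dir.create("crs_cleanup", showWarnings = FALSE)  ## make sure the output folder exists
writeBin(r, paste0("crs_cleanup/", basename(txt)))  ## write the cleaned copy
tmp = fread(paste0("crs_cleanup/", basename(txt)), sep = "|")  ## read the pipe-delimited file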
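To sanity-check the null-byte replacement before running it on the full CRS files, a minimal sketch like the following may help. The helper name `replace_nulls` and the tiny pipe-delimited sample are made up for illustration, and base R's `read.table` stands in for `fread` so that the sketch runs without extra packages.

```r
# Hypothetical helper (not from the ddw-r-scripts repo): copy a file,
# replacing every null byte (0x00) with a space (0x20).
replace_nulls <- function(in_path, out_path) {
  bytes <- readBin(in_path, what = raw(), n = file.info(in_path)$size)
  bytes[bytes == as.raw(0)] <- as.raw(0x20)
  writeBin(bytes, out_path)
  invisible(out_path)
}

# Made-up sample: a two-line pipe-delimited file with one embedded null byte.
src <- tempfile(fileext = ".txt")
dst <- tempfile(fileext = ".txt")
writeBin(c(charToRaw("donor|amount\nUK"), as.raw(0), charToRaw("|10\n")), src)

replace_nulls(src, dst)

# Confirm that no null bytes remain and that the file now parses.
cleaned <- readBin(dst, what = raw(), n = file.info(dst)$size)
has_nulls <- any(cleaned == as.raw(0))
demo <- read.table(dst, sep = "|", header = TRUE, strip.white = TRUE)
```

Because each null byte becomes a space, the file size is unchanged and some fields may pick up stray whitespace, which is why the sketch reads with `strip.white = TRUE`.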
The cleaned file then needs LATIN1 encoding to be imported into SQL:
DELIMITER ',' ENCODING 'LATIN1' CSV HEADER;
COPY 3764422
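As a rough illustration of why the LATIN1 setting matters, the sketch below builds a short made-up string from Latin-1 bytes, shows that it is not valid UTF-8, and converts it with `iconv`. It assumes the CRS text contains accented characters stored as single Latin-1 bytes; the sample value is hypothetical and not taken from the CRS data.

```r
# Made-up sample: "Côte" encoded as Latin-1, where "ô" is the single byte 0xF4.
latin1_text <- rawToChar(as.raw(c(0x43, 0xf4, 0x74, 0x65)))

# The same bytes are not valid UTF-8, so a UTF-8 import would reject them.
is_utf8_before <- validUTF8(latin1_text)

# Converting from Latin-1 yields a valid UTF-8 string ("ô" becomes 0xC3 0xB4).
utf8_text <- iconv(latin1_text, from = "latin1", to = "UTF-8")
is_utf8_after <- validUTF8(utf8_text)
```

Running `validUTF8` over a sample of rows is one way to confirm whether a file needs a non-UTF-8 encoding setting before attempting the import.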