africa_poverty data preprocessing

data preprocessing

Open iokanyalcin opened this issue 3 years ago • 1 comments

Hello, I am trying to run the project, but I have encountered several issues, especially in the preprocessing part.

After finishing all the download steps and exporting the Earth Engine data to Google Cloud Storage, I moved on to the process_tfrecords notebook. The main issue is that my exported Earth Engine files are named like this: {country_name}{year_range}.tfrecord.gz

But the process_tfrecords_dhs.ipynb notebook expects names of this form: /lx_median{year_range}_{country}_dhslocs_ee_export.tfrecord.gz

I changed the name format and moved on, but in the last part (Process TFRecords) none of the processing functions work. I am getting errors like:

list index out of range
There is no such file:

For instance, Angola's data is stored as angola2011_xx.tfrecord.gz through angola2015_xx.tfrecord.gz in Cloud Storage, but the notebook tries to find angola2009-11.tfrecord.gz.
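A quick way to see the mismatch described above is to list which years were actually exported per country and compare them with the name the notebook asks for. The following is only a minimal sketch: the regular expression and the helper name `years_by_country` are hypothetical, and the pattern assumes export names shaped like `angola2011_xx.tfrecord.gz` as quoted in this post, not anything confirmed by the repo.

```python
import re
from collections import defaultdict

# Assumed (hypothetical) shape of exported names: <country><4-digit year>_<part>.tfrecord.gz
EXPORT_RE = re.compile(
    r"^(?P<country>[a-z_]+?)(?P<year>\d{4})_(?P<part>[^.]+)\.tfrecord\.gz$"
)


def years_by_country(filenames):
    """Group exported file names by country and collect the years found.

    Names that do not match the assumed pattern are ignored.
    """
    found = defaultdict(set)
    for name in filenames:
        match = EXPORT_RE.match(name)
        if match:
            found[match.group("country")].add(int(match.group("year")))
    return {country: sorted(years) for country, years in found.items()}


if __name__ == "__main__":
    # Example names only; replace with the real listing of the storage bucket.
    exported = ["angola2011_00.tfrecord.gz", "angola2015_01.tfrecord.gz"]
    print(years_by_country(exported))
```

If the year range the notebook requests (here 2009-11) does not overlap the years reported for that country, the "no such file" error is expected, and the export step rather than the notebook is the place to look.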

Cluster index not found in tfds file: the REQUIRED_KEYS list contains "cluster index", but some of my tfds files do not include it.

I couldn't figure out where the mistake is, or whether I missed a step that creates lx_median_{year_range}_{country}_dhslocs_ee_export.tfrecord.gz. Can you please explain and help with this issue? Thanks.

Edit: I am inspecting the code. Most probably the issues are caused by missing cluster indexes in the tfrecord files. Maybe I should also concatenate the tfrecord files.
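To confirm whether a given file really lacks the cluster index, the feature keys of its first record can be compared against the required list. This is a hedged sketch, not the notebook's own code: `missing_keys` and `first_record_keys` are hypothetical helpers, the `REQUIRED_KEYS` list here contains only the one key named in this post, and the files are assumed to be GZIP-compressed TFRecords of `tf.train.Example` protos.

```python
# Assumption: only the key mentioned in this thread; the notebook's real list may be longer.
REQUIRED_KEYS = ["cluster index"]


def missing_keys(feature_keys, required=REQUIRED_KEYS):
    """Return the required keys that are absent from feature_keys."""
    present = set(feature_keys)
    return [key for key in required if key not in present]


def first_record_keys(path):
    """Read the feature keys of the first record in a GZIP TFRecord file."""
    import tensorflow as tf  # imported lazily so missing_keys works without TensorFlow

    dataset = tf.data.TFRecordDataset(path, compression_type="GZIP")
    for raw in dataset.take(1):
        example = tf.train.Example.FromString(raw.numpy())
        return sorted(example.features.feature.keys())
    return []  # empty file


if __name__ == "__main__":
    # Example with hand-written keys; with a real file, pass first_record_keys(path) instead.
    print(missing_keys(["lat", "lon", "year"]))
```

Running `missing_keys(first_record_keys(path))` over every exported file would show which files, if any, were exported without the key.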

May 07 '21 15:05 iokanyalcin

Hi, repo author here. I apologize for these data preprocessing issues, which are known. I am working on creating an updated data preprocessing pipeline. See the chrisyeh96/africa_poverty_clean repo for the latest preprocessing pipeline, which should resolve your issue.

Once chrisyeh96/africa_poverty_clean is fully ready, I will merge these two repos. Hopefully I will have time to do this over the next couple of months.

May 09 '21 08:05 chrisyeh96
