
Factor out code in IPython notebooks in separate files

Open hargup opened this issue 10 years ago • 7 comments

hargup avatar May 05 '15 12:05 hargup

@notconfusing can you brief me on how you generated the snapshot_data? I should be able to write a script to generate it at regular intervals.

hargup avatar May 08 '15 11:05 hargup

@notconfusing some of the files used in the notebooks, like helpers/world_cultures_shortcut.json and helpers/wiki_code_map.json, are not present in this repository. Can you please add them?

hargup avatar May 08 '15 11:05 hargup

I'm creating a basic Python package for WIGI at https://github.com/notconfusing/WIGI/tree/hargup/refactoring. My current approach is to move recurring pieces of code into the package, and then use that code from the package to reproduce the notebooks. I would like to completely decouple data retrieval, data processing and data presentation.
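
A minimal sketch of what that decoupling could look like. The function names and the two-column CSV layout are hypothetical, chosen only for illustration; they are not taken from the WIGI repository or from the actual snapshot_data format.

```python
import csv
import io
from collections import Counter

# All names below are hypothetical and only illustrate the three-stage split.


def load_records(csv_text):
    """Data retrieval: parse raw CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def count_by_gender(records):
    """Data processing: count rows per value of the 'gender' column."""
    return Counter(row["gender"] for row in records)


def format_table(counts):
    """Data presentation: render counts as aligned text lines, sorted by key."""
    return "\n".join(f"{key:<10}{counts[key]:>5}" for key in sorted(counts))


if __name__ == "__main__":
    # Made-up sample input, not real Wikidata output.
    sample = "qid,gender\nQ1,female\nQ2,male\nQ3,female\n"
    print(format_table(count_by_gender(load_records(sample))))
```

With this shape, a notebook would only call the three functions in order, and each stage could be replaced or tested on its own.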

hargup avatar May 10 '15 18:05 hargup

@hargup fantastic plan on decoupling all the separate stages.

\me inhales deeply. OK, snapshot_data comes from this Java program: https://github.com/notconfusing/WIGI/blob/master/GenderIndexProcessor.java It's the thing we will have to run every week. In order to run it you need Wikidata Toolkit (WDTK). I want to get this happening on Wikimedia Labs because the ~2GB Wikidata dump that it needs would be available over the local network rather than as a big download. However, if it helps, you can just run WDTK locally for now.

BTW, when you say "package", do you mean making a "pip" package?

notconfusing avatar May 11 '15 23:05 notconfusing

Yes, when I say package I mean standalone software which can be installed using pip or other package managers.

hargup avatar May 12 '15 06:05 hargup

As per #8, first focus on Gender by Culture, Gender by Country (World Map), Gender by Date of Birth, and Wikipedia Language by Gender.

notconfusing avatar Jun 22 '15 23:06 notconfusing

I've created one big Python script, gender-index-processing-standalone.py, that makes the graphable CSVs. So I'm not sure how this affects making a pip package, or refactoring. We don't really need the ipynb's except for demonstration purposes, so I'm going to move this to phase D.

notconfusing avatar Aug 04 '15 18:08 notconfusing