Dataset Cleanup Organization
The following datasets need to be cleaned up:
Major Changes
- [ ] State Demographics: #99
- [ ] Elections: #90
- [ ] Education: #89
- [ ] Cancer: #87
- [x] Graduates: #81
- [ ] Health: #79
- [x] Drugs: #75
- [x] Labor: #63
- [x] Energy: #96
Minor Changes
- [ ] Weather: Seriously, there's a year in one of the precipitation values.
- [x] Food: #88
- [x] School Scores: #85
- [x] Global Development: #84
- [x] Publishers: #83
- [x] Hydropower: #76
- [x] Real Estate: #78
- [x] Real Estate: #77
- [x] Construction Permits: #74
- [ ] Cars: #73
- [x] Cars: #72
- [ ] Art Institute Metadata: #71
- [x] Airlines: #70
- [ ] State Crime: #98
New Datasets
- [ ] Education Finance: #82
- [ ] Fashion: #92
- [ ] Medical Bribes: #101
For the weather dataset, would you like me to write a cleaning script on the data we currently have, or just clean the next chunk of weather reports?
Dr. Kafura weighed in on this, and I just realized you weren't CCed:
"I don't see this as a 'big' issue but it does seem worthwhile to keep it up to date if the effort is reasonable. Alternatively, we could keep a bigger window - add the last three months to what we now have and have a six-month view, eventually reaching a point where we have a year's worth of data and then keep that up to date on a rolling basis. Might be interesting to see weather across the year."
I agree with him - if we can ever get a year's worth of data, that would be an ideal situation. In the meantime, if it's not too much effort, update!
https://www.ncdc.noaa.gov/ulcd/ULCD
https://www.ncdc.noaa.gov/qclcd/QCLCD
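For the rolling-window idea above, here is a rough sketch of how each new chunk of reports could be folded into what we already have. It is only an illustration under assumptions: the record layout (dicts with `station` and `date` keys) and the function names `months_back` and `merge_rolling` are hypothetical, not anything from the existing weather scripts.

```python
from datetime import date


def months_back(d, months):
    """Return the first day of the month that is `months` months before d's month."""
    total = d.year * 12 + (d.month - 1) - months
    return date(total // 12, total % 12 + 1, 1)


def merge_rolling(existing, new_chunk, window_months=12):
    """Merge new reports into the existing ones and keep a rolling window.

    Records are assumed (hypothetically) to be dicts with a "station" string
    and a "date" (datetime.date). A report in the new chunk replaces an
    existing report for the same station and date.
    """
    merged = {(r["station"], r["date"]): r for r in existing}
    merged.update({(r["station"], r["date"]): r for r in new_chunk})
    if not merged:
        return []

    # The window is the latest month on record plus the months before it.
    latest = max(r["date"] for r in merged.values())
    cutoff = months_back(latest, window_months - 1)

    kept = [r for r in merged.values() if r["date"] >= cutoff]
    return sorted(kept, key=lambda r: (r["date"], r["station"]))
```

While we have less than a year of data, nothing falls outside the window, so this simply accumulates chunks; once we pass twelve months, the oldest reports start dropping off on each update.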