Potentially broken approach to reading CSV files
The following pattern is used often to read csv files:
with open(filename, 'rU') as csvfile:
    reader = unicodecsv.DictReader(csvfile)
I think this worked in Python 2 because str and bytes were the same type there. In Python 3 it breaks: unicodecsv expects the file to be opened in binary mode, but this pattern opens it in text mode.
For example, the following fails in Python 3 with the error AttributeError: 'str' object has no attribute 'decode':
import unicodecsv
filename = 'openelex/us/md/mappings/md.csv'
with open(filename, "r") as data:
    reader = unicodecsv.DictReader(data)
    for row in reader:
        print(row)
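As far as I can tell, the error comes down to what each file mode hands back in Python 3. Here is a minimal sketch using only the standard library; the temporary file and its contents are made up for illustration and are not the real md.csv:

```python
import os
import tempfile

# Hypothetical sample data written to a temporary file (a stand-in for md.csv).
with tempfile.NamedTemporaryFile("wb", suffix=".csv", delete=False) as tmp:
    tmp.write(b"name,state\nAllegany,MD\n")
    path = tmp.name

try:
    # Text mode: Python 3 decodes for us and returns str.
    with open(path, "r") as text_file:
        text_line = text_file.readline()
    # Binary mode: we get raw bytes, which is what a decoding reader needs.
    with open(path, "rb") as binary_file:
        binary_line = binary_file.readline()
finally:
    os.remove(path)

# str has no .decode() in Python 3; bytes does.
print(type(text_line).__name__, hasattr(text_line, "decode"))
print(type(binary_line).__name__, hasattr(binary_line, "decode"))
```

So any reader that calls .decode() on each line will fail when it is given a file opened in text mode.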
Using the standard library's csv module instead of unicodecsv fixes the issue:
import csv
filename = 'openelex/us/md/mappings/md.csv'
with open(filename, "r") as data:
    reader = csv.DictReader(data)
    for row in reader:
        print(row)
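A slightly more defensive variant of the csv version might look like the sketch below. It passes newline="" (which the csv module documentation recommends for file objects) and an explicit encoding. The helper name read_rows, the UTF-8 encoding, and the sample data are my assumptions, not something taken from the repo:

```python
import csv
import os
import tempfile


def read_rows(path, encoding="utf-8"):
    """Return every row of a CSV file as a dict (hypothetical helper)."""
    # newline="" lets the csv module handle line endings inside quoted fields.
    # The encoding is an assumption; the real files may use a different one.
    with open(path, "r", encoding=encoding, newline="") as data:
        return [dict(row) for row in csv.DictReader(data)]


# Made-up sample file with a non-ASCII value, standing in for a mapping file.
with tempfile.NamedTemporaryFile(
    "w", suffix=".csv", delete=False, encoding="utf-8", newline=""
) as tmp:
    tmp.write("county,candidate\nAllegany,José Pérez\n")
    sample_path = tmp.name

try:
    rows = read_rows(sample_path)
finally:
    os.remove(sample_path)

for row in rows:
    print(row)
```

Spelling out the encoding avoids depending on the platform default, which is the part unicodecsv used to handle in Python 2.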
Is there something wrong with my setup, or is this broken for other people as well?
FYI, by using csv instead of unicodecsv together with one other small fix, I can get most of the failing tests in test_md_datasource.py to pass. However, I'm not sure if anything else breaks as a result. But given my understanding of how unicodecsv works with python2 vs. python3, it's a bit unclear to me how things are currently working.
This seems related to https://github.com/jdunck/python-unicodecsv/issues/65.
@warwickmm yeah, this is an artifact of using Python 2. We should be using Python 3, so we can remove unicodecsv and replace it with the csv module.
Ok. Do you mind my asking how any of this is working currently? It would seem to me that none of the csv files can be read properly as-is.
@warwickmm it's a fair question, and the basic answer is that we've mostly not used the core repo in recent times, instead prioritizing the data conversion work that results in the openelections-data-{state} repos. But we do use it for some of the states and use Python 3 for that.
Thanks. If the core repo isn't used very much anymore, is there a different repo that I can look at for possible ways to contribute? Or, is the core repo still deserving of attention?
@warwickmm most of our work now is done in various state-specific repos, where we put converted precinct results. For example, we're working on converting official precinct results for Texas here.
Thank you. I'll take a look at the state-specific repos.