
Save a partition as an assignment csv (& load)

Open maxhully opened this issue 5 years ago • 3 comments

An assignment CSV has a column of node indices and a column with the corresponding district assignments. This CSV format seems to be common in the redistricting world. JSON probably isn't the right tool for the job since it only supports string keys.
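To illustrate the string-key limitation, here is a minimal sketch using a made-up three-node assignment (the data is hypothetical, not taken from GerryChain):

```python
import json

# Hypothetical assignment: node index -> district number.
assignment = {0: 1, 1: 1, 2: 2}

# JSON object keys must be strings, so the integer node indices
# come back as strings after a round trip.
round_tripped = json.loads(json.dumps(assignment))

print(round_tripped)                # {'0': 1, '1': 1, '2': 2}
print(round_tripped == assignment)  # False
```

A CSV has the same everything-is-text problem on read, but it keeps the node/district pairing as plain columns, which matches the format already used elsewhere in redistricting.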

maxhully, Mar 09 '19 12:03

to_csv() is pretty straightforward. Something like this should be mostly what we want. It can be put anywhere, but the reference to self.parts would need to change.

In assignment.py:

...
import csv
...

class Assignment:
...

    def to_csv(self, filename=None):
        if filename is None:
            print("Please give a filename")
            return

        # newline='' lets the csv module control line endings
        with open(filename, 'w', newline='') as _outfile:
            _writer = csv.writer(_outfile)
            _writer.writerow(["unit_idx", "dist_no"])
            # one (unit, district) row per unit, sorted by unit
            for row in sorted((unit, dist) for dist in self.parts for unit in self.parts[dist]):
                _writer.writerow(row)
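
If the writer lives outside the class, one possible shape is a standalone function that takes the district-to-units mapping directly. This is only a sketch: `write_assignment_csv` is a hypothetical name, and `parts` is assumed to map each district to an iterable of its units, mirroring how `self.parts` is used above.

```python
import csv
import io


def write_assignment_csv(parts, outfile):
    """Write one (unit, district) row per unit to an open text file.

    `parts` is assumed to map each district to an iterable of its units.
    """
    writer = csv.writer(outfile, lineterminator="\n")
    writer.writerow(["unit_idx", "dist_no"])
    # Flatten district -> units into (unit, district) rows, sorted by unit.
    rows = sorted((unit, dist) for dist, units in parts.items() for unit in units)
    writer.writerows(rows)


# Toy data (hypothetical): two districts over four units.
buffer = io.StringIO()
write_assignment_csv({1: [0, 2], 2: [1, 3]}, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

Taking an open file object rather than a filename makes the function easy to exercise with `io.StringIO`, and leaves opening the file to the caller.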


from_csv() could look something like this. The following will take a csv and turn it into a dictionary of the format that Assignment.from_dict() wants: each key is a geographical unit and its value is the district it's assigned to. This can be a class method of Assignment or be put somewhere else, like in a utilities.py kind of file.

...

import csv
import collections

...

# checks if a string is really just an integer wearing a fun hat
def tryint(string):
    if not isinstance(string, str):
        return string
    return int(string) if string.isdigit() else string

def from_csv(filename, header=True, sep=','):
    try:
        with open(filename, 'r', newline='') as _infile:
            _reader = csv.reader(_infile, delimiter=sep)
            # throw out the header (if the file has one)
            if header:
                next(_reader, None)

            # dictionary comprehension over the rows: unit -> district
            new_assn = {tryint(row[0]): tryint(row[1]) for row in _reader}

        return new_assn

    # unreadable file, malformed CSV, or a row with fewer than two columns
    except (OSError, csv.Error, IndexError):
        print("Could not read {}".format(filename))
        return

This will return an {int: int} dictionary if possible, otherwise an {int: string}, {string: int}, or {string: string} dictionary, to be passed to Assignment.from_dict(). This lets people rename their districts and graph nodes to GEOIDs or something human-readable if they want. The tryint() step is needed because CSV values are always read in as strings, so we need some check to see whether the names can be cast to integers.
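
As a quick illustration of how the tryint() check behaves, here is a small sketch that repeats the helper and runs it on a few made-up values (the sample strings are hypothetical):

```python
def tryint(string):
    # Non-strings pass through untouched.
    if not isinstance(string, str):
        return string
    # Only strings made up entirely of digits are cast.
    return int(string) if string.isdigit() else string


samples = ["12", "District A", "-3", "01001"]
converted = [tryint(s) for s in samples]
print(converted)  # [12, 'District A', '-3', 1001]
```

Two things stand out: "-3" stays a string because isdigit() rejects the minus sign, and "01001" loses its leading zero when cast. The second case could matter for GEOID-style identifiers, so a caller might prefer to keep such columns as strings.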

There is currently no check in Assignment.from_dict() that the graph's node names match the dictionary's keys, or that each unit is assigned only once, which may cause safety issues if we let users bring in assignments from sources other than the .shp or .json containing the graph.
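
A check along these lines could compare the units in an incoming assignment against the graph's nodes and flag units that are listed more than once. The sketch below is an assumption about what such a check might look like, not existing GerryChain code: `check_assignment_rows` is a hypothetical helper, and the graph is represented only by a plain list of node names.

```python
from collections import Counter


def check_assignment_rows(rows, graph_nodes):
    """Return (duplicates, missing, extra) for a list of (unit, district) rows.

    duplicates: units that appear in more than one row
    missing:    graph nodes with no row
    extra:      units in the rows that are not graph nodes
    """
    counts = Counter(unit for unit, _ in rows)
    duplicates = sorted(unit for unit, n in counts.items() if n > 1)
    units = set(counts)
    nodes = set(graph_nodes)
    missing = sorted(nodes - units)
    extra = sorted(units - nodes)
    return duplicates, missing, extra


# Toy data (hypothetical): unit 1 is listed twice, node 2 has no row,
# and unit 5 is not in the graph.
rows = [(0, 1), (1, 1), (1, 2), (5, 2)]
problems = check_assignment_rows(rows, graph_nodes=[0, 1, 2])
print(problems)  # ([1], [2], [5])
```

The check runs on the raw rows rather than on the finished dictionary because a dictionary silently keeps only the last value for a repeated key, which would hide duplicates.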

zschutzman, Mar 12 '19 16:03

You're right about the lack of a nodes check --- I ran into that issue a couple days ago. I'll make an issue.

maxhully, Mar 12 '19 17:03

👍 for this -- ward/block assignment CSV is the only valid format for submitting proposals to the Wisconsin state legislature. Format details are here in the section "TECHNICAL SPECIFICATIONS (IF USING AN ALTERNATE TECHNOLOGY)".

I'm convening a redistricting workshop Sept 20th. The WI state legislature accepts maps up until Oct 15th.

Is there any chance the CSV export feature could be added before either of those times?

carlschroedl, Sep 13 '21 01:09