django-cities PostalCode city FK

Hi,

Are you sure you FK is enough? For example in Germany, cities with same postal code, can exist. I think 2m2 will be better fit. Thanks.

https://github.com/coderholic/django-cities/blob/master/cities/models.py#L181

Oct 27 '16 09:10 caullla

Can you give me an example of two cities with the same postal code?

And assuming that's true, then yes, a M2M relationship would be a better fit. That's not difficult to code, but that would be...involved...to transition all of our users to. Is that something you could look into creating a PR for?

Oct 27 '16 12:10 blag

I made inspection, and found cities which are present as separate instances in City model and have same postal_code.

[<City: Westerland>, <City: Sylt-Ost>] # postal_code = 25980
[<City: Klausdorf>, <City: Kiel>] # postal_code = 24147

But now I'm not sure about m2m. Yea, postal_code the same, but those PostalCode records isn't equal for 100%. And until PostalCode.code is not unique in database schema, FK could be enough. So we can have duplicated code with different location or district_name.

What do you think about?

Oct 27 '16 13:10 caullla

I would rather have M2M models between the appropriate models than duplicate data. How far up the hierarchy do we need to switch to M2M (or, to ask it a different way: are those two cities in different Subregions or Regions?

If we really wanted to go wild with it, we would basically model the entire hierarchy as a graph instead of a tree, and then model the entire history of all places (eg: cities in the Ottoman Empire or Austro-Hungary, Berlin belonging to both East and West Germany).

All of that is possible, the real issue is the cities import command. That just gets more convoluted the more data model changes we make. 😖

Oct 28 '16 00:10 blag

m2m is not a silver bullet. I would say, best fit looks like this

City <- PostalCode -> Code

class City(models.Model):
     postal_codes = models.ManyToMany(Code, through=PostalCode)

One more question/option. Do we have a polygons for region, subregion, city. We can use geometry instead of graph. Make sense?

Oct 28 '16 07:10 caullla

What data goes in the PostalCode model and what data goes in the Code model?

Right now we import points for cities, districts, and postal codes.

Polygons exist. They are distributed in per-country zip files from Geonames, and we do nothing with them because they would presumably take forever to import and few users have requested them. There was an issue (#125) where somebody was going to add importing polygons.

Oct 28 '16 08:10 blag

Code will have only code(currently in PostalCode.code). That's why I don't see reason to add m2m. Right now PostalCode can have duplicated value in code field, but in same time values of other fields isn't equal.

If we find a way to import polygons, we can use them to fill up any relations. Using polygons we can query all PostalCode in Region or all PostalCode in SubRegion. We also can use geometry on import, I don't know how you build relations right now, but with geometry you don't need to check anything, except Point in Polygon. You can build, City -> Region, City-> Country, PostalCode-> City, etc.. And import order doesn't matter.

ps. Assuming that Point is in the middle of Polygon(Point if far enough of Polygon border), it should works even if geometry have not 100% accuracy.

Oct 28 '16 08:10 caullla

Using polygons we can query all PostalCode in Region or all PostalCode in SubRegion

I suspect that would be incredibly slow but I'm willing to try it. Could you code up a PR that implements that?

Oct 28 '16 09:10 blag

I'll start with polygon import.

Oct 28 '16 09:10 caullla

django-cities django-cities copied to clipboard

PostalCode city FK

django-cities
django-cities copied to clipboard