
[Bug]: Counties should not be returned separately with cities

Open NDCallahan opened this issue 5 months ago • 3 comments

Issue Type

Other

Location (if applicable)

City

What's wrong?

When retrieving city information for a state, the data also includes every county those cities are in. Instead of a clean list of cities, the results mix the counties in with the cities.

A county is to a state what a state is to the nation: a subdivision of the state with its own level of government that creates legislation and whatnot for a smaller area. Counties were created so that all state land is in a county.

What should it be?

If I am pulling cities for a state, I should not also be getting counties. In my honest opinion, that should be a separate endpoint. The current behavior makes the data messy when you only want the cities in a state and do not care which county each city sits in.

The hierarchy is Country --> State --> County --> City (at least for the United States of America).

Source (optional)

https://learningenglish.voanews.com/a/city-district-county-town-what-are-the-differences-/5893801.html

NDCallahan avatar Nov 25 '25 16:11 NDCallahan

Hi @NDCallahan! I'm Dosu and I’m helping the countries-states-cities-database team.

For the United States, the data model links cities directly to states and does not explicitly include counties in city lists—cities are associated only with states in the JSON files and documentation. There is no field for counties in the city data, so you should only get cities when querying by state, not counties mixed in.

Mixing of administrative levels (like counties and cities) has been a known issue in other countries, and the repository has addressed this by introducing hierarchical fixes and using fields like type and parent_id to separate them. If you see counties appearing with cities in your results, it may be due to legacy data or incomplete separation, which has been recognized as a broader problem in the past and addressed for other regions (see issue #1019, Bangladesh fix, Taiwan fix, Estonia fix).

To ensure you only get cities, filter your results by state_id/state_code and, if present, ignore any entries with a type of "county" or similar. If you continue to see counties mixed in, it may be worth reviewing the data export or API response for unexpected types or legacy records. The repository has a track record of improving hierarchical accuracy, so future updates may further clarify these boundaries.
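The filtering suggested above could look roughly like the following on the client side. This is only a sketch: the field names `state_code` and `type` are taken from the discussion, but whether `type` is actually populated on US city records is an assumption, and `cities_only` is a hypothetical helper, not part of the project.

```python
from typing import Any, Dict, Iterable, List

# Assumed set of type values to exclude; adjust to whatever the data really uses.
EXCLUDED_TYPES = {"county"}


def cities_only(records: Iterable[Dict[str, Any]], state_code: str) -> List[Dict[str, Any]]:
    """Keep records for one state, dropping any whose `type` marks them as a county.

    Records without a `type` field are kept, because the field may be absent.
    """
    result = []
    for record in records:
        if record.get("state_code") != state_code:
            continue
        record_type = record.get("type")
        if isinstance(record_type, str) and record_type.strip().lower() in EXCLUDED_TYPES:
            continue
        result.append(record)
    return result
```

Records with no `type` pass through unchanged, so this only helps where the field is present in the export or API response.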

To reply, just mention @dosu.



dosubot[bot] avatar Nov 25 '25 16:11 dosubot[bot]

@dosu You can see in this single API call that the returned data is including both the city and the county.

Image

NDCallahan avatar Nov 25 '25 16:11 NDCallahan

You're right—counties are being returned alongside cities in the API response, which is a known data quality issue in this project and has affected other countries as well. The repository has addressed similar problems elsewhere by introducing hierarchical fields like type and parent_id to separate administrative levels, but for the US, the City model doesn't have a type field, so it's hard to filter out counties from cities directly in the response. The State model does have a type field, which can sometimes help with filtering if you cross-reference city state_id with state type, but this isn't always straightforward for API consumers [schema reference].

For now, you may need to filter out entries with known county names or use the state type field if available. The invalid longitude value you spotted (e.g., 191.30396000) is also a data issue—recent validation checks were supposed to ensure all coordinates are within valid ranges, so this may be a legacy or missed record [coordinate validation PR].

Recent pull requests have focused on fixing state and city naming/classification, which may help, but a full hierarchical fix for the US (like those done for Taiwan and Bangladesh) would be the best long-term solution [hierarchical fix example]. If you need clean city data now, client-side filtering and coordinate validation are your best options.
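As a stopgap, the client-side filtering and coordinate validation mentioned above might be sketched like this. The field names `name`, `latitude`, and `longitude` are assumptions about the response shape, the "ends with County" rule is a rough heuristic that will miss counties named differently, and both helpers are hypothetical.

```python
from typing import Any, Dict, FrozenSet, Iterable, List


def has_valid_coordinates(record: Dict[str, Any]) -> bool:
    """True if latitude is within [-90, 90] and longitude within [-180, 180]."""
    try:
        lat = float(record["latitude"])
        lon = float(record["longitude"])
    except (KeyError, TypeError, ValueError):
        return False
    return -90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0


def looks_like_county(record: Dict[str, Any], known_counties: FrozenSet[str] = frozenset()) -> bool:
    """Heuristic: the name is in a caller-supplied county list or ends with ' County'."""
    name = str(record.get("name", "")).strip()
    return name in known_counties or name.lower().endswith(" county")


def clean_cities(records: Iterable[Dict[str, Any]],
                 known_counties: FrozenSet[str] = frozenset()) -> List[Dict[str, Any]]:
    """Drop likely counties and records whose coordinates are out of range."""
    return [
        r for r in records
        if not looks_like_county(r, known_counties) and has_valid_coordinates(r)
    ]
```

A longitude such as 191.30396000 fails the range check, so that record would be dropped rather than silently passed along.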


dosubot[bot] avatar Nov 25 '25 16:11 dosubot[bot]