Region not returned in results for Algeria (DZ) if using multiple layers
I am doing a Pelias build for Algeria. In Algeria, the same name is used for a region and for the main city of that region. For example, the region Jijel contains a city also named Jijel (the biggest city in the region). With this build, searching for a region name returns only the identically named city (locality), and the region is absent from the results. If I restrict the query to the region layer only, the result is correct and the region is returned. As soon as other layers are specified, the region disappears from the results again. I also reproduced the same behavior in the Mapzen online demo, so I don't think it has to do with my installation.
Steps to Reproduce
- Use the autocomplete or search endpoint with Jijel as the text
- Specify the layers as region and observe the region Jijel appearing
- Now specify the layers as region,locality or simply remove the layers query param
- The locality Jijel will appear but the region Jijel will not
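The steps above can be sketched as a small script that builds the three request URLs. This is only an illustration: the base URL assumes a local Pelias API on port 4000 (as in the URLs later in this report), and `build_url` is a hypothetical helper name, not part of Pelias.

```python
from urllib.parse import urlencode

# Assumption: a local Pelias API listening on port 4000, as in this report.
BASE_URL = "http://localhost:4000/v1/autocomplete"


def build_url(text, layers=None, size=20):
    """Build an autocomplete URL; leave `layers` empty to query all layers."""
    params = {"text": text}
    if layers:
        # Pelias takes the layers as one comma-separated value.
        params["layers"] = ",".join(layers)
    params["size"] = size
    # safe="," keeps the comma readable instead of encoding it as %2C.
    return f"{BASE_URL}?{urlencode(params, safe=',')}"


region_only = build_url("Jijel", layers=["region"])
region_and_locality = build_url("Jijel", layers=["region", "locality"])
all_layers = build_url("Jijel")

for url in (region_only, region_and_locality, all_layers):
    print(url)
```

Fetching each URL and comparing the returned features should show the region only for the first request.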
Expected behavior
- Have Pelias return the region and the locality if they are named the same
Environment (please complete the following information):
- OS: WSL2 (Ubuntu) on Windows 11
- Using docker desktop
- docker version 20.10.16, build aa7e414
- docker-compose version 1.29.2, build 5becea4c
Pastebin/Screenshots
- using all layers with the url:
http://localhost:4000/v1/autocomplete?text=Jijel&size=20
{
	"geocoding": {
		"version": "0.2",
		"attribution": "http://localhost:4000/attribution",
		"query": {
			"text": "Jijel",
			"parser": "pelias",
			"parsed_text": {
				"subject": "Jijel",
				"locality": "Jijel"
			},
			"size": 20,
			"layers": [
				"venue",
				"street",
				"locality",
				"neighbourhood",
				"county",
				"region",
				"localadmin",
				"country"
			],
			"private": false,
			"lang": {
				"name": "English",
				"iso6391": "en",
				"iso6393": "eng",
				"via": "default",
				"defaulted": true
			},
			"querySize": 40
		},
		"warnings": [
			"performance optimization: excluding 'address' layer"
		],
		"engine": {
			"name": "Pelias",
			"author": "Mapzen",
			"version": "1.0"
		},
		"timestamp": 1658781626094
	},
	"type": "FeatureCollection",
	"features": [
		{
			"type": "Feature",
			"geometry": {
				"type": "Point",
				"coordinates": [
					5.766004,
					36.821997
				]
			},
			"properties": {
				"id": "1141906181",
				"gid": "whosonfirst:locality:1141906181",
				"layer": "locality",
				"source": "whosonfirst",
				"source_id": "1141906181",
				"country_code": "DZ",
				"name": "Jijel",
				"accuracy": "centroid",
				"country": "Algeria",
				"country_gid": "whosonfirst:country:85632451",
				"country_a": "DZA",
				"region": "Jijel",
				"region_gid": "whosonfirst:region:85670723",
				"region_a": "JJ",
				"locality": "Jijel",
				"locality_gid": "whosonfirst:locality:1141906181",
				"label": "Jijel, Algeria",
				"addendum": {
					"concordances": {
						"wd:id": "Q402726"
					}
				}
			},
			"bbox": [
				5.74600356014,
				36.8019970335,
				5.78600356014,
				36.8419970335
			]
		},
		... bunch of osm venues but no region...
	],
	"bbox": [
		4.614564,
		36.542653,
		5.895862,
		36.8419970335
	]
}
- if we specify the layer region with the url:
http://localhost:4000/v1/autocomplete?text=Jijel&layers=region&size=20
{
	"geocoding": {
		"version": "0.2",
		"attribution": "http://localhost:4000/attribution",
		"query": {
			"text": "Jijel",
			"parser": "pelias",
			"parsed_text": {
				"subject": "Jijel",
				"locality": "Jijel"
			},
			"size": 20,
			"layers": [
				"region"
			],
			"private": false,
			"lang": {
				"name": "English",
				"iso6391": "en",
				"iso6393": "eng",
				"via": "default",
				"defaulted": true
			},
			"querySize": 40
		},
		"engine": {
			"name": "Pelias",
			"author": "Mapzen",
			"version": "1.0"
		},
		"timestamp": 1658781918933
	},
	"type": "FeatureCollection",
	"features": [
		{
			"type": "Feature",
			"geometry": {
				"type": "Point",
				"coordinates": [
					5.961268,
					36.713719
				]
			},
			"properties": {
				"id": "85670723",
				"gid": "whosonfirst:region:85670723",
				"layer": "region",
				"source": "whosonfirst",
				"source_id": "85670723",
				"country_code": "DZ",
				"name": "Jijel",
				"accuracy": "centroid",
				"country": "Algeria",
				"country_gid": "whosonfirst:country:85632451",
				"country_a": "DZA",
				"region": "Jijel",
				"region_gid": "whosonfirst:region:85670723",
				"region_a": "JJ",
				"label": "Jijel, Algeria",
				"addendum": {
					"concordances": {
						"fips:code": "AG24",
						"gn:id": 2492910,
						"gp:id": 2344596,
						"hasc:id": "DZ.JJ",
						"iso:id": "DZ-18",
						"qs_pg:id": 1134892,
						"unlc:id": "DZ-18",
						"wd:id": "Q235718"
					}
				}
			},
			"bbox": [
				5.427349,
				36.523972,
				6.478499,
				36.935315
			]
		}
	],
	"bbox": [
		5.427349,
		36.523972,
		6.478499,
		36.935315
	]
}
- I also tried restricting the sources query param to wof only, and still got the same results
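To compare the two responses quickly, something like the following sketch can list which layers come back. The helper names `summarize` and `has_layer` are made up for illustration, and the sample dictionaries are trimmed copies of the responses above (the elided OSM venues are left out).

```python
def summarize(response):
    """Return (layer, name, gid) for each feature in a GeoJSON response."""
    return [
        (f["properties"]["layer"], f["properties"]["name"], f["properties"]["gid"])
        for f in response.get("features", [])
    ]


def has_layer(response, layer):
    """True if at least one returned feature belongs to `layer`."""
    return any(found == layer for found, _, _ in summarize(response))


# Trimmed copies of the two responses in this report (only the fields used here).
all_layers_response = {
    "features": [
        {"properties": {"layer": "locality", "name": "Jijel",
                        "gid": "whosonfirst:locality:1141906181"}},
    ]
}
region_response = {
    "features": [
        {"properties": {"layer": "region", "name": "Jijel",
                        "gid": "whosonfirst:region:85670723"}},
    ]
}

print(has_layer(all_layers_response, "region"))
print(has_layer(region_response, "region"))
```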
Hi @YanDjin, this behavior is intentional. Identically named places on different layers are fairly common around the world (e.g. New York county, city, and state; Luxembourg; Singapore).
We've had numerous issues filed over the years about insufficient deduplication in these cases. Most end users expect an experience similar to that of the Fortune 100 mapping providers, which also 'squash' similarly named entries like we do.
There is a ruleset in the code which defines which layer gets priority. It's usually the locality, as that's usually what the user is searching for.
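As a rough illustration of that kind of priority-based deduplication (this is not the actual Pelias code), the sketch below keeps one feature per label and prefers the layer listed first. Keying on `label` and the two-entry priority list are assumptions made for the example; the real ruleset may compare more fields and cover more layers.

```python
# Hypothetical priority order for illustration; earlier entries win.
LAYER_PRIORITY = ["locality", "region"]


def dedupe(features, priority=LAYER_PRIORITY):
    """Keep one feature per label, preferring layers earlier in `priority`."""
    def rank(feature):
        layer = feature["properties"]["layer"]
        # Layers missing from the list rank last.
        return priority.index(layer) if layer in priority else len(priority)

    best = {}
    for feature in features:
        label = feature["properties"]["label"]
        if label not in best or rank(feature) < rank(best[label]):
            best[label] = feature
    return list(best.values())


# Both Jijel records from the report share the label "Jijel, Algeria".
features = [
    {"properties": {"layer": "region", "name": "Jijel", "label": "Jijel, Algeria"}},
    {"properties": {"layer": "locality", "name": "Jijel", "label": "Jijel, Algeria"}},
]

kept = dedupe(features)
kept_region_first = dedupe(features, priority=["region", "locality"])
print(kept[0]["properties"]["layer"])
print(kept_region_first[0]["properties"]["layer"])
```

Swapping the order of the priority list is the kind of change that would surface the region instead of the locality.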
Hi @missinglink, thanks for the reply and for your insight, and sorry for my late response. I modified the ruleset for my use case. I will close this issue.